Hive Engines Compared: Tez vs. MR

2019-02-17 19:37

A comparison of Hive on Tez and Hive on MR (MapReduce), timing the same query under each engine.

Hive is run from ==> /usr/local/service/hive/bin

[hadoop@10 ~]$ cd /usr/local/service/hive/bin

[hadoop@10 bin]$ hive --hiveconf hive.execution.engine=tez   # use the Tez execution engine
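
The engine can also be chosen per invocation, which makes it easy to run the same query under both engines from a script. The following is a minimal sketch, assuming the hive CLI is on PATH. The function name run_with_engine and the HIVE_BIN override are hypothetical names introduced here for illustration; they are not part of Hive.

```shell
# Run one query under a chosen execution engine ("tez" or "mr").
# HIVE_BIN is a hypothetical override for the hive executable; it defaults to "hive".
run_with_engine() {
  engine="$1"   # execution engine name, e.g. tez or mr
  query="$2"    # HiveQL statement to execute
  "${HIVE_BIN:-hive}" --hiveconf hive.execution.engine="$engine" -e "$query"
}

# Usage (commented out so that sourcing this file does not start a job):
# run_with_engine tez 'select count(product_key) from product_info;'
# run_with_engine mr  'select count(product_key) from product_info;'
```

Inside an open session, `set hive.execution.engine=mr;` switches the engine for the queries that follow.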

The Hive table 'product_info' has already been mapped to the HBase table gizwits_product. For the mapping method, see "product-info.sql" and "product-info-exec.sh" in 'emr-hive'.

Using count(*) reports a row count of 0:

hive> select count(*) from product_info;
OK
0
Time taken: 2.687 seconds, Fetched: 1 row(s)
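
No job is launched for this query and it returns in under 3 seconds, which is consistent with Hive answering count(*) from table statistics instead of scanning HBase. That explanation is an assumption; the session above does not confirm it. One way to test the guess is to rerun the query with hive.compute.query.using.stats disabled. The sketch below assumes the hive CLI is on PATH; count_by_scan and HIVE_BIN are hypothetical names used only for illustration.

```shell
# Ask Hive to compute count(*) by running a job instead of reading metastore statistics.
# Assumption: the 0 shown above comes from statistics; this only helps test that guess.
# HIVE_BIN is a hypothetical override for the hive executable; it defaults to "hive".
count_by_scan() {
  table="$1"   # table name, e.g. product_info
  "${HIVE_BIN:-hive}" \
    --hiveconf hive.compute.query.using.stats=false \
    -e "select count(*) from ${table};"
}

# Usage (commented out so that sourcing this file does not start a job):
# count_by_scan product_info
```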

First run

hive> select count(product_key) from product_info;
Query ID = hadoop_20190110144242_585f9dc0-8635-46e7-8739-55f6cdceff46
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1546417190707_0003)

----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED  
----------------------------------------------------------------------------------------------
Map 1 .......... container     SUCCEEDED      1          1        0        0       0       0  
Reducer 2 ...... container     SUCCEEDED      1          1        0        0       0       0  
----------------------------------------------------------------------------------------------
VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 7.51 s     
----------------------------------------------------------------------------------------------
OK
95243
Time taken: 13.497 seconds, Fetched: 1 row(s)

Second run (the App id shows that the same YARN application, application_1546417190707_0003, is reused)

hive> select count(product_key) from product_info;
Query ID = hadoop_20190110144357_7e5ba026-6006-409d-b829-fa1df7ca2106
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1546417190707_0003)

----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED  
----------------------------------------------------------------------------------------------
Map 1 .......... container     SUCCEEDED      1          1        0        0       0       0  
Reducer 2 ...... container     SUCCEEDED      1          1        0        0       0       0  
----------------------------------------------------------------------------------------------
VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 5.73 s     
----------------------------------------------------------------------------------------------
OK
95243
Time taken: 8.109 seconds, Fetched: 1 row(s)

With Hive on MR

The results of the two runs are as follows:

# First run
hive> select count(product_key) from product_info;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = hadoop_20190110144002_26033397-acb6-4745-a896-bd5d56a574a2
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=
In order to set a constant number of reducers:
  set mapreduce.job.reduces=
Starting Job = job_1546417190707_0001, Tracking URL = http://10.8.1.14:5004/proxy/application_1546417190707_0001/
Kill Command = /usr/local/service/hadoop/bin/hadoop job  -kill job_1546417190707_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2019-01-10 14:40:12,599 Stage-1 map = 0%,  reduce = 0%
2019-01-10 14:40:21,018 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 4.45 sec
2019-01-10 14:40:27,325 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 6.28 sec
MapReduce Total cumulative CPU time: 6 seconds 280 msec
Ended Job = job_1546417190707_0001
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 6.28 sec   HDFS Read: 11819 HDFS Write: 105 SUCCESS
Total MapReduce CPU Time Spent: 6 seconds 280 msec
OK
95243
Time taken: 26.359 seconds, Fetched: 1 row(s)
# Second run
hive> select count(product_key) from product_info;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = hadoop_20190110144033_88b03a83-68d3-4e26-b675-972b0ff540ea
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=
In order to set a constant number of reducers:
  set mapreduce.job.reduces=
Starting Job = job_1546417190707_0002, Tracking URL = http://10.8.1.14:5004/proxy/application_1546417190707_0002/
Kill Command = /usr/local/service/hadoop/bin/hadoop job  -kill job_1546417190707_0002
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2019-01-10 14:40:41,076 Stage-1 map = 0%,  reduce = 0%
2019-01-10 14:40:48,460 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.12 sec
2019-01-10 14:40:53,669 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 6.79 sec
MapReduce Total cumulative CPU time: 6 seconds 790 msec
Ended Job = job_1546417190707_0002
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 6.79 sec   HDFS Read: 11819 HDFS Write: 105 SUCCESS
Total MapReduce CPU Time Spent: 6 seconds 790 msec
OK
95243
Time taken: 20.815 seconds, Fetched: 1 row(s)
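
When repeating such runs, the elapsed time can be pulled out of the CLI output instead of being copied by hand. The following is a minimal sketch; extract_time_taken is a hypothetical helper name, and it assumes the "Time taken: N seconds" line format shown in the logs above.

```shell
# Print the number of seconds from every "Time taken: N seconds" line read on stdin.
# Lines that do not match the pattern are ignored.
extract_time_taken() {
  sed -n 's/^Time taken: \([0-9.]*\) seconds.*/\1/p'
}

# Usage (commented out so that sourcing this file does not start a job).
# The Hive CLI usually writes its status lines to stderr, hence the 2>&1:
# hive -e 'select count(product_key) from product_info;' 2>&1 | extract_time_taken
```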

Test results:

	Hive on Tez	Hive on MR
First run	13.497 seconds	26.359 seconds
Second run	8.109 seconds	20.815 seconds

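Conclusion: for this query, Hive on Tez is faster than Hive on MR, by roughly 2x on the first run (13.497 s vs. 26.359 s) and roughly 2.6x on the second run (8.109 s vs. 20.815 s).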
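
The ratio between the two engines can be computed directly from the timings in the table. The following is a small sketch using awk; the function name speedup is hypothetical.

```shell
# Print mr_seconds / tez_seconds rounded to two decimal places.
speedup() {
  awk -v mr="$1" -v tez="$2" 'BEGIN { printf "%.2f\n", mr / tez }'
}

speedup 26.359 13.497   # first run, prints 1.95
speedup 20.815 8.109    # second run, prints 2.57
```

These ratios come from two runs of a single small query, so they indicate the direction of the difference more reliably than its exact size.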
