https://issues.apache.org/jira/browse/HIVE-2340
The query
select userid, count(*) from u_data group by userid order by userid;
produces an MRR plan (one map stage followed by two reduce stages).
When the result of userid, count(*) is small enough for a single reducer to process, could this query plan be optimized to MR?
To prevent bad reducer merging, the reducer merging only kicks in when the
optimizer thinks it gets a perf boost.
MR -> MRR is not a big win when it comes to Tez, due to container reuse -
going wide on the large cardinality in case of missing map-side
aggregation will be safer.
If hive.map.aggr=true and the userid set fits within memory, then smushing
the reducers would be nicer.
To reset the wide-narrow checks, do
set hive.optimize.reducededuplication.min.reducer=1;
But be aware that it will fail (I've seen full disks) as you scale upwards
to the 10+ TB cases.
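
As a rough sketch of how the advice above could be tried out, the session below lowers the threshold and asks Hive for the plan of the query from the question. It assumes the u_data table from the question exists. Whether the two reduce stages actually collapse depends on the Hive version, the execution engine, and the optimizer's own estimates, so compare the EXPLAIN output before and after the change rather than taking the merge for granted.

```sql
-- Sketch only: assumes the u_data table from the question is available.
-- Print the current threshold (the documented default is 4).
set hive.optimize.reducededuplication.min.reducer;

-- Keep map-side aggregation on so the grouped result reaching the reducers stays small.
set hive.map.aggr=true;

-- Allow reducer merging even when the child reduce sink uses a single reducer.
set hive.optimize.reducededuplication.min.reducer=1;

-- Inspect the plan; count the reduce stages to see whether they were merged.
EXPLAIN
SELECT userid, count(*)
FROM u_data
GROUP BY userid
ORDER BY userid;

-- Restore the documented default before running large queries in the same session.
set hive.optimize.reducededuplication.min.reducer=4;
```

Resetting the value at the end keeps the setting scoped to the experiment, since the warning about large inputs applies to anything run later in the session.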
Cheers,
Gopal
hive.optimize.reducededuplication.min.reducer
Default Value: 4
Added In: Hive 0.11.0 with HIVE-2340
Reduce deduplication merges two RSs (reduce sink operators) by moving the key/parts/reducer-num of the child RS to the parent RS. This means that if the reducer-num of the child RS is fixed (order by or forced bucketing) and small, the merge can produce a very slow, single MR job. The optimization is disabled if the number of reducers is less than the specified value.