五、产生连续数值
输出结果如下所示:
1 2 3 4 5 ... 100
参考答案:
不借助其他任何外表,实现产生连续数值
此处给出两种解法,其一:
select id_start+pos as id from( select 1 as id_start, 1000000 as id_end ) m lateral view posexplode(split(space(id_end-id_start), '')) t as pos, val
其二:
select row_number() over() as id from (select split(space(99), ' ') as x) t lateral view explode(x) ex;
那如何产生1至1000000连续数值?
参考答案:
select row_number() over() as id from (select split(space(999999), ' ') as x) t lateral view explode(x) ex;
六、数据扩充与收缩
表名:t6
表字段及内容:
a 3 2 4
问题一:数据扩充
输出结果如下所示:
a b 3 3、2、1 2 2、1 4 4、3、2、1
参考答案:
select t.a, concat_ws('、',collect_set(cast(t.rn as string))) as b from ( select t6.a, b.rn from t6 left join ( select row_number() over() as rn from (select split(space(5), ' ') as x) t -- space(5)可根据t6表的最大值灵活调整 lateral view explode(x) pe ) b on 1 = 1 where t6.a >= b.rn order by t6.a, b.rn desc ) t group by t.a;
问题二:数据扩充,排除偶数
输出结果如下所示:
a b 3 3、1 2 1 4 3、1
参考答案:
select t.a, concat_ws('、',collect_set(cast(t.rn as string))) as b from ( select t6.a, b.rn from t6 left join ( select row_number() over() as rn from (select split(space(5), ' ') as x) t lateral view explode(x) pe ) b on 1 = 1 where t6.a >= b.rn and b.rn % 2 = 1 order by t6.a, b.rn desc ) t group by t.a;
问题三:如何处理字符串累计拼接
问题描述:将小于等于a字段的值聚合拼接起来
输出结果如下所示:
a b 3 2、3 2 2 4 2、3、4
参考答案:
select t.a, concat_ws('、',collect_set(cast(t.a1 as string))) as b from ( select t6.a, b.a1 from t6 left join ( select a as a1 from t6 ) b on 1 = 1 where t6.a >= b.a1 order by t6.a, b.a1 ) t group by t.a;
问题四:如果a字段有重复,如何实现字符串累计拼接
输出结果如下所示:
a b 2 2 3 2、3 3 2、3、3 4 2、3、3、4
参考答案:
select a, b from ( select t.a, t.rn, concat_ws('、',collect_list(cast(t.a1 as string))) as b from ( select a.a, a.rn, b.a1 from ( select a, row_number() over(order by a ) as rn from t6 ) a left join ( select a as a1, row_number() over(order by a ) as rn from t6 ) b on 1 = 1 where a.a >= b.a1 and a.rn >= b.rn order by a.a, b.a1 ) t group by t.a,t.rn order by t.a,t.rn ) tt;
问题五:数据展开
问题描述:如何将字符串"1-5,16,11-13,9"扩展成"1,2,3,4,5,16,11,12,13,9"?注意顺序不变。
参考答案:
select concat_ws(',',collect_list(cast(rn as string))) from ( select a.rn, b.num, b.pos from ( select row_number() over() as rn from (select split(space(20), ' ') as x) t -- space(20)可灵活调整 lateral view explode(x) pe ) a lateral view outer posexplode(split('1-5,16,11-13,9', ',')) b as pos, num where a.rn between cast(split(num, '-')[0] as int) and cast(split(num, '-')[1] as int) or a.rn = num order by pos, rn ) t;
七、合并与拆分
表名:t7
表字段及内容:
a b 2014 A 2014 B 2015 B 2015 D
问题一:合并
输出结果如下所示:
2014 A、B 2015 B、D
参考答案:
select a, concat_ws('、', collect_set(t.b)) b from t7 group by a;
问题二:拆分
问题描述:将分组合并的结果拆分出来
参考答案:
select t.a, d from ( select a, concat_ws('、', collect_set(t7.b)) b from t7 group by a )t lateral view explode(split(t.b, '、')) table_tmp as d;
八、模拟循环操作
表名:t8
表字段及内容:
a 1011 0101
问题一:如何将字符'1'的位置提取出来
输出结果如下所示:
1,3,4 2,4
参考答案:
select a, concat_ws(",",collect_list(cast(index as string))) as res from ( select a, index+1 as index, chr from ( select a, concat_ws(",",substr(a,1,1),substr(a,2,1),substr(a,3,1),substr(a,-1)) str from t8 ) tmp1 lateral view posexplode(split(str,",")) t as index,chr where chr = "1" ) tmp2 group by a;
九、不使用distinct或group by去重
表名:t9
表字段及内容:
a b c d 2014 2016 2014 A 2014 2015 2015 B
问题一:不使用distinct或group by去重
输出结果如下所示:
2014 A 2016 A 2014 B 2015 B
参考答案:
select t2.year ,t2.num from ( select * ,row_number() over (partition by t1.year,t1.num) as rank_1 from ( select a as year, d as num from t9 union all select b as year, d as num from t9 union all select c as year, d as num from t9 )t1 )t2 where rank_1=1 order by num;