1:postgresql炸裂详解
炸裂函数regexp_split_to_table select regexp_split_to_table('飞机,火车,地铁,汽车', ',' ) 以逗号切分,转为数据集 select regexp_split_to_array('飞机,火车,地铁,汽车', ',' ) 转为数组 select (regexp_split_to_array('飞机,火车,地铁,汽车', ',' ))[1] 取数组的第二个元素 select regexp_split_to_table('F:\QH本部文件\一套表部署相关\test.sh','\\') 正则匹配 array_agg(expression) 把表达式变成一个数组 一般配合 array_to_string() 函数使用 select nameid, array_agg(traffic ) from dbscheme.test0001 group by nameid order by nameid ; 变为数组 string_agg(expression, delimiter) 直接把一个表达式变成字符串 select nameid, string_agg(traffic,',') , update_time from dbscheme.test0001 group by nameid,update_time order by nameid,update_time; //相同id 的连接到一起,逗号分隔 select nameid, array_to_string(array_agg(traffic),',') from dbscheme.test0001 group by nameid order by nameid ; .数组转字符串
2:hive炸裂详解
+----------+----------------------+--+ | a.id | a.tim | +----------+----------------------+--+ | a,b,c,d | 2:00,3:00,4:00,5:00 | | f,b,c,d | 1:10,2:20,3:30,4:40 | +----------+----------------------+--+
explode 炸裂函数,一列变多行。
select id,tim,single_tim from atlasdemo.a lateral view explode(split(tim,',')) t as single_tim +----------+----------------------+-------------+--+ | id | tim | single_tim | +----------+----------------------+-------------+--+ | a,b,c,d | 2:00,3:00,4:00,5:00 | 2:00 | | a,b,c,d | 2:00,3:00,4:00,5:00 | 3:00 | | a,b,c,d | 2:00,3:00,4:00,5:00 | 4:00 | | a,b,c,d | 2:00,3:00,4:00,5:00 | 5:00 | | f,b,c,d | 1:10,2:20,3:30,4:40 | 1:10 | | f,b,c,d | 1:10,2:20,3:30,4:40 | 2:20 | | f,b,c,d | 1:10,2:20,3:30,4:40 | 3:30 | | f,b,c,d | 1:10,2:20,3:30,4:40 | 4:40 | +----------+----------------------+-------------+--+
posexplode炸裂除了会炸开数组/map,还会对应生成索引下标。
select id,tim,single_id_index,single_id from atlasdemo.a lateral view posexplode(split(id,',')) t as single_id_index, single_id; +----------+----------------------+------------------+------------+--+ | id | tim | single_id_index | single_id | +----------+----------------------+------------------+------------+--+ | a,b,c,d | 2:00,3:00,4:00,5:00 | 0 | a | | a,b,c,d | 2:00,3:00,4:00,5:00 | 1 | b | | a,b,c,d | 2:00,3:00,4:00,5:00 | 2 | c | | a,b,c,d | 2:00,3:00,4:00,5:00 | 3 | d | | f,b,c,d | 1:10,2:20,3:30,4:40 | 0 | f | | f,b,c,d | 1:10,2:20,3:30,4:40 | 1 | b | | f,b,c,d | 1:10,2:20,3:30,4:40 | 2 | c | | f,b,c,d | 1:10,2:20,3:30,4:40 | 3 | d | +----------+----------------------+------------------+------------+--+
如果想实现对两列听同事进行多行转换,那么用explode()函数就不能实现了,
但可以用posexplode()函数,因为该函数可以将index和数据都取出来,
使用两次posexplode并令两次取到的index相等就行了。
select id,tim,single_id,single_tim from atlasdemo.a lateral view posexplode(split(id,',')) t as single_id_index, single_id lateral view posexplode(split(tim,',')) t as single_yim_index, single_tim where single_id_index = single_yim_index; +----------+----------------------+------------+-------------+--+ | id | tim | single_id | single_tim | +----------+----------------------+------------+-------------+--+ | a,b,c,d | 2:00,3:00,4:00,5:00 | a | 2:00 | | a,b,c,d | 2:00,3:00,4:00,5:00 | b | 3:00 | | a,b,c,d | 2:00,3:00,4:00,5:00 | c | 4:00 | | a,b,c,d | 2:00,3:00,4:00,5:00 | d | 5:00 | | f,b,c,d | 1:10,2:20,3:30,4:40 | f | 1:10 | | f,b,c,d | 1:10,2:20,3:30,4:40 | b | 2:20 | | f,b,c,d | 1:10,2:20,3:30,4:40 | c | 3:30 | | f,b,c,d | 1:10,2:20,3:30,4:40 | d | 4:40 | +----------+----------------------+------------+-------------+--+