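How can the result shown in the figure below be produced with Hive SQL? Put another way: how can Hive SQL concatenate the values of a single column after grouping?
Concatenating several string columns is not hard: the concat function handles it. concat(col1, col2, col3) joins the strings of the first, second, and third columns. SQL code: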
select concat('a','_','b','_','c') as cct;
SQL execution result: a_b_c
Note that concat returns NULL as soon as any one of its arguments is NULL. SQL code:
select concat('a','_','b','_',null) cct;
SQL execution result: NULL
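When a NULL argument should not wipe out the whole result, two common workarounds are to replace NULL with an empty string before calling concat, or to use concat_ws, which in Hive skips NULL arguments. A minimal sketch; the inline subquery and its column names c1, c2, c3 are made up for illustration:

```sql
-- Hypothetical one-row input in which c3 is NULL
select
    concat(c1, '_', c2, '_', coalesce(c3, '')) as cct_coalesce,  -- NULL replaced by ''
    concat_ws('_', c1, c2, c3)                 as cct_ws         -- NULL argument skipped
from (
    select 'a' as c1, 'b' as c2, cast(null as string) as c3
) t;
```

With this input the first column should come out as a_b_ (note the trailing separator) and the second as a_b, so concat_ws is usually the tidier choice when a separator is involved.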
But what if the values to concatenate all sit in a single column?
Step 1:
Build a test table. SQL code:
-- 1. Create the test table
create table tmp.tmp1 (
    id   int    comment 'ID',
    name string comment 'person name'
);

-- 2. Insert test data
insert into tmp.tmp1
select 1 as id, "韩立" as name
union all
select 1 as id, "厉飞雨" as name
union all
select 1 as id, "南宫婉" as name
union all
select 1 as id, "紫灵仙子" as name
union all
select 2 as id, "银月" as name
union all
select 2 as id, "元瑶" as name
union all
select 2 as id, "董萱儿" as name
union all
select 3 as id, "墨彩环" as name
union all
select 3 as id, "冰凤" as name;

-- 3. View the test table data
select * from tmp.tmp1;
SQL execution result:
Step 2:
Approach 1:
-- collect_set returns only the distinct values
select id, concat_ws('/', collect_set(name)) as name from tmp.tmp1 group by id;
-- collect_list keeps duplicates
select id, concat_ws('/', collect_list(name)) as name from tmp.tmp1 group by id;
SQL execution result:
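Neither collect_set nor collect_list guarantees the order of the collected elements, so the concatenated string can differ between runs. If a stable order matters, one option is to sort the array with sort_array (natural ordering of the element type) before joining it. A sketch against the same tmp.tmp1 table; the second query also shows the cast that is typically needed for non-string columns, since concat_ws accepts only strings or arrays of strings:

```sql
-- Sort the collected names so the output order is deterministic
select id,
       concat_ws('/', sort_array(collect_set(name))) as name
from tmp.tmp1
group by id;

-- Concatenating a non-string column: cast each value to string first
select name,
       concat_ws(',', collect_list(cast(id as string))) as ids
from tmp.tmp1
group by name;
```

If the required order depends on another column rather than on the values themselves, sort_array is not enough and the rows would have to be ordered by some other means before collecting.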
Approach 2:
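-- This SQL still needs to be verified by readers: the tool I use does not support group_concat and keeps raising an error
-- (group_concat is not a Hive built-in function, which would explain the error)
select id, group_concat(distinct(name), '/') as name from tmp.tmp1 group by id;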
SQL execution result:
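For comparison, group_concat is available in MySQL, where the separator is given with the SEPARATOR keyword rather than as a second argument. A sketch of what the equivalent query would look like there, assuming a table with the same id and name columns exists in a MySQL database named tmp; this is not Hive syntax:

```sql
-- MySQL, not Hive: distinct names per id, sorted, joined with '/'
select id,
       group_concat(distinct name order by name separator '/') as name
from tmp.tmp1
group by id;
```

On Hive itself, the concat_ws plus collect_set combination from Approach 1 remains the way to get the same grouping behavior.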