Mysql 行列变换《你想要的都有》

本文涉及的产品
RDS MySQL Serverless 基础系列,0.5-2RCU 50GB
云数据库 RDS MySQL,高可用系列 2核4GB
云数据库 RDS PostgreSQL,高可用系列 2核4GB
简介: Mysql 行列变换《你想要的都有》

介绍

在面试和实际问题中,常常会遇到各种行列转换,例如数据根据分隔符一行变多行、数据一列变多列和多列变一列,下面来给大家介绍在Mysql中如何处理这些场景。

1.准备测试数据

create table dwd.fact_suject_data
(
    student_id    int          null comment '编号',
    subject_level varchar(255) null comment '科目等级',
    subject_level_json varchar(255) null comment '科目等级json数据'
);
insert into dwd.fact_suject_data(student_id, subject_level,subject_level_json) values (12,'China E,English D,Math E','{"China": "E","English": "D","Math": "E"}');
insert into dwd.fact_suject_data(student_id, subject_level,subject_level_json) values (2,'China B,English B','{"China": "B","English": "B"}');
insert into dwd.fact_suject_data(student_id, subject_level,subject_level_json) values (3,'English A,Math C','{"English": "A","Math": "C"}');
insert into dwd.fact_suject_data(student_id, subject_level,subject_level_json) values (4,'China C,Math A','{"China": "C","Math": "A"}');
insert into dwd.fact_suject_data(student_id, subject_level,subject_level_json) values (5,'China D,English A,Math C','{"China": "D","English": "A","Math": "C"}');
insert into dwd.fact_suject_data(student_id, subject_level,subject_level_json) values (6,'China C,English A,Math D','{"China": "C","English": "A","Math": "D"}');
insert into dwd.fact_suject_data(student_id, subject_level,subject_level_json) values (7,'China A,English E,Math B','{"China": "A","English": "E","Math": "B"}');
insert into dwd.fact_suject_data(student_id, subject_level,subject_level_json) values (8,'China D,English E,Math E','{"China": "D","English": "E","Math": "E"}');
insert into dwd.fact_suject_data(student_id, subject_level,subject_level_json) values (9,'China C,English E,Math C','{"China": "C","English": "E","Math": "C"}');


2.根据分隔符一行变多行

需求:我想要学生编号和科目等级两列,科目等级只展示单个科目等级。

方法一:使用mysql自带的表,用于切分字符串,注意help_topic_id是从0开始的

select t1.student_id,
       -- 根据 help_topic_id 对 subject_level 进行切分字符串
       substring_index(substring_index(t1.subject_level, ',', t2.help_topic_id + 1), ',', - 1) subject_level
from dwd.fact_suject_data t1
         left join mysql.help_topic t2
                   on t2.help_topic_id <= length(t1.subject_level) - length(replace(t1.subject_level, ',', ''))
;

方法二:使用递归自己构造出自增id列(从1开始也可从0开始),用于切分字符串

with recursive cte(n) as
                   (select 1
                    union all
                    select n + 1
                    from cte
                    where n < 100) -- 递归出需要的自增id表
select t1.student_id,
       -- 根据 n 对 subject_level 进行切分字符串
       substring_index(substring_index(t1.subject_level, ',', t2.n), ',', -1) subject_level
from dwd.fact_suject_data t1
         left join cte t2
                   on t2.n <= (length(t1.subject_level) - length(replace(t1.subject_level, ',', '')) + 1)
;

两个方法的脚本,都是根据分隔符的总数量,依次构造出递增到总数量的数字,用于做substring_index()参数进行切割,除了以上两种方法,如果有其他表有这样递增且唯一的id,也可以进行借用。

3.一列变多列,简单分列

需求:我想要学生编号、科目和等级三列;

根据步骤2的结果,简单使用substring_index()切分就可以了;

select t1.student_id,
       -- 根据 help_topic_id 对 subject_level 进行切分字符串
       substring_index(substring_index(t1.subject_level, ',', t2.help_topic_id + 1), ',', - 1) subject_level,
       substring_index(substring_index(substring_index(t1.subject_level, ',', t2.help_topic_id + 1), ',', - 1),' ',1) subject,
       substring_index(substring_index(substring_index(t1.subject_level, ',', t2.help_topic_id + 1), ',', - 1),' ',-1) level
from dwd.fact_suject_data t1
         left join mysql.help_topic t2
                   on t2.help_topic_id <= length(t1.subject_level) - length(replace(t1.subject_level, ',', ''))
;

需求:我想要学生编号和每科对应的数据,需要使用subject_level_json字段;


subject_level_json是常见的json数据类型,那么我们可以使用有关json的函数,下面介绍常用的几个,如果想要了解更多可点击Mysql官网查看和扩展;

select json_object('key1', 1, 'key2','abc') json_data,
       json_extract('{"key1": 1, "key2": "abc"}','$.key1') key1,
       json_extract('{"key1": 1, "key2": "abc"}','$.key2') key2;

接下来直接引用,遇到数据类型不对的可以cast()转换,值多双引号的可replace()替换;

select student_id,
       subject_level_json,
       replace(cast(json_extract(cast(subject_level_json as json),'$.China') as char),'"','') china,
       replace(cast(json_extract(cast(subject_level_json as json),'$.English') as char),'"','') english,
       replace(cast(json_extract(cast(subject_level_json as json),'$.Math') as char),'"','') math
from fact_suject_dat;

4.一列变多列,数据透视

需求:我想要学生编号和每科对应的数据,不能使用使用subject_level_json字段;

根据步骤2分列后,进一步使用case when 就可直接转换了

select student_id,
       max(case when subject = 'China' then level else null end) china,
       max(case when subject = 'English' then level else null end) english,
       max(case when subject = 'Math' then level else null end) math
from (select t1.student_id,
       -- 根据 help_topic_id 对 subject_level 进行切分字符串
       substring_index(substring_index(t1.subject_level, ',', t2.help_topic_id + 1), ',', - 1) subject_level,
       substring_index(substring_index(substring_index(t1.subject_level, ',', t2.help_topic_id + 1), ',', - 1),' ',1) subject,
       substring_index(substring_index(substring_index(t1.subject_level, ',', t2.help_topic_id + 1), ',', - 1),' ',-1) level
from dwd.fact_suject_data t1
         left join mysql.help_topic t2
                   on t2.help_topic_id <= length(t1.subject_level) - length(replace(t1.subject_level, ',', ''))) t
group by student_id
;

也可以subject_level字段,结合mid()或者substring()和case when进一步转换,如果对字符串函数不了解,点这里

select student_id,
       max(case when subject_level like '%China%'
           then mid(subject_level ,(instr(subject_level,'China')+length('China')+1), 1) else null end)  china,
       max(case when subject_level like '%English%'
            then mid(subject_level,(instr(subject_level,'English')+length('English')+1), 1) else null end) english,
       max(case when subject_level like '%Math%'
            then mid(subject_level ,(instr(subject_level,'Math')+length('Math')+1), 1) else null end) math
from fact_suject_data
group by student_i;

5.多列变一列

需求:我想要根据步骤4的结果,生成学生编号、科目和等级三列数据;

使用union all进行拼接就可以,如果是oracle的话可以用PIOVT功能。

select student_id,
       concat('China ',china) subject_level
from
(select student_id,
       max(case when subject_level like '%China%'
           then mid(subject_level ,(instr(subject_level,'China')+length('China')+1), 1) else null end)  china,
       max(case when subject_level like '%English%'
            then mid(subject_level,(instr(subject_level,'English')+length('English')+1), 1) else null end) english,
       max(case when subject_level like '%Math%'
            then mid(subject_level ,(instr(subject_level,'Math')+length('Math')+1), 1) else null end) math
from fact_suject_data
group by student_id) t
where china is not null
union all
select student_id,
       concat('English ',english) subject_level
from
(select student_id,
       max(case when subject_level like '%China%'
           then mid(subject_level ,(instr(subject_level,'China')+length('China')+1), 1) else null end)  china,
       max(case when subject_level like '%English%'
            then mid(subject_level,(instr(subject_level,'English')+length('English')+1), 1) else null end) english,
       max(case when subject_level like '%Math%'
            then mid(subject_level ,(instr(subject_level,'Math')+length('Math')+1), 1) else null end) math
from fact_suject_data
group by student_id) t
where english is not null
union all
select student_id,
       concat('Math ',math) subject_level
from
(select student_id,
       max(case when subject_level like '%China%'
           then mid(subject_level ,(instr(subject_level,'China')+length('China')+1), 1) else null end)  china,
       max(case when subject_level like '%English%'
            then mid(subject_level,(instr(subject_level,'English')+length('English')+1), 1) else null end) english,
       max(case when subject_level like '%Math%'
            then mid(subject_level ,(instr(subject_level,'Math')+length('Math')+1), 1) else null end) math
from fact_suject_data
group by student_id) t
where math is not null;

需要:我想要……


相关实践学习
每个IT人都想学的“Web应用上云经典架构”实战
本实验从Web应用上云这个最基本的、最普遍的需求出发,帮助IT从业者们通过“阿里云Web应用上云解决方案”,了解一个企业级Web应用上云的常见架构,了解如何构建一个高可用、可扩展的企业级应用架构。
MySQL数据库入门学习
本课程通过最流行的开源数据库MySQL带你了解数据库的世界。 &nbsp; 相关的阿里云产品:云数据库RDS MySQL 版 阿里云关系型数据库RDS(Relational Database Service)是一种稳定可靠、可弹性伸缩的在线数据库服务,提供容灾、备份、恢复、迁移等方面的全套解决方案,彻底解决数据库运维的烦恼。 了解产品详情:&nbsp;https://www.aliyun.com/product/rds/mysql&nbsp;
相关文章
|
存储 关系型数据库 MySQL
|
存储 SQL Oracle
【SQL应知应会】行列转换(二)• MySQL版
【SQL应知应会】行列转换(二)• MySQL版
405 0
|
存储 Oracle 关系型数据库
行列转换(一)• MySQL版
行列转换(一)• MySQL版
179 0
|
SQL 关系型数据库 MySQL
mysql 行列互转
即将原本同一列下多行的不同内容作为多个字段,输出对应内容。
|
关系型数据库 MySQL Java
mysql字段的细节(查询自定义的字段[意义:行列转置];UNION ALL;case-when)
mysql字段的细节(查询自定义的字段[意义:行列转置];UNION ALL;case-when)
256 0
mysql字段的细节(查询自定义的字段[意义:行列转置];UNION ALL;case-when)
|
SQL 关系型数据库 MySQL
|
关系型数据库 MySQL 数据库
|
关系型数据库 MySQL
关于MYSQL的行列转换
同事今天中午去参加一个面试,有一道面试题比较基础 把 a        b        c 1        1        1.1 1        2        1.2 1        3        1.3 2        1        2.1 2        2        2.2 2        3        2.3 变成
1065 0
|
SQL MySQL 关系型数据库
MySQL中行列转换的SQL技巧
行列转换常见场景 由于很多业务表因为历史原因或者性能原因,都使用了违反第一范式的设计模式。
1779 0
|
19天前
|
缓存 关系型数据库 BI
使用MYSQL Report分析数据库性能(下)
使用MYSQL Report分析数据库性能
56 3

推荐镜像

更多