MaxCompute用户指南：SQL：INSERT操作_问答-阿里云开发者社区

更新表中的数据（INSERT OVERWRITE/INTO）

命令格式如下：

INSERT OVERWRITE|INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)] [col1,col2 ...]
select_statement
FROM from_statement;

注意：

MaxCompute 的 Insert 语法与通常使用的 MySQL 或 Oracle 的 Insert 语法有差别，在 insert overwrite|into 后需要加入 table 关键字，不是直接使用 tablename。

当 Insert 的目标表是分区表时，指定分区值 [PARTITION (partcol1=val1, partcol2=val2 …)] 语法中不允许使用函数等表达式。

在 MaxCompute SQL 处理数据的过程中，Insert overwrite/into 用于将计算的结果保存目标表中。
Insert into 与 Insert overwrite 的区别是：Insert into 会向表或表的分区中追加数据，而 Insert overwrite 则会在向表或分区中插入数据前清空表中的原有数据。
在使用 MaxCompute 处理数据的过程中，Insert overwrite/into 是最常用到的语句，它们会将计算的结果保存到一个表中，以供下一步计算使用。比如计算 sale_detail 表中不同地区的销售额，操作如下：

create table sale_detail_insert like sale_detail;
alter table sale_detail_insert add partition(sale_date='2013', region='china');
insert overwrite table sale_detail_insert partition (sale_date='2013', region='china')
select shop_name, customer_id, total_price from sale_detail;

注意：
在进行 Insert 更新数据操作时，源表与目标表的对应关系依赖于在 select 子句中列的顺序，而不是表与表之间列名的对应关系，下面的 SQL 语句仍然是合法的：

insert overwrite table sale_detail_insert partition (sale_date='2013', region='china')
select customer_id, shop_name, total_price from sale_detail;
-- 在创建sale_detail_insert表时，列的顺序为：
-- shop_name string, customer_id string, total_price bigint
-- 而从sale_detail向sale_detail_insert插入数据是，sale_detail的插入顺序为：
-- customer_id, shop_name, total_price
-- 此时，会将sale_detail.customer_id的数据插入sale_detail_insert.shop_name
-- 将sale_detail.shop_name的数据插入sale_detail_insert.customer_id

向某个分区插入数据时，分区列不允许出现在 select 列表中：

insert overwrite table sale_detail_insert partition (sale_date='2013', region='china')
select shop_name, customer_id, total_price, sale_date, region from sale_detail;
-- 报错返回，sale_date，region 为分区列，不允许出现在静态分区的 insert 语句中。

同时，partition 的值只能是常量，不可以出现表达式。以下用法是非法的：

insert overwrite table sale_detail_insert partition (sale_date=datepart('2016-09-18 01:10:00', 'yyyy') , region='china')
select shop_name, customer_id, total_price from sale_detail;

多路输出（MULTI INSERT）

MaxCompute SQL 支持在一个语句中插入不同的结果表或者分区。
命令格式如下：

FROM from_statement
INSERT OVERWRITE | INTO TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)]
select_statement1 [FROM from_statement]
[INSERT OVERWRITE | INTO TABLE tablename2 [PARTITION (partcol1=val3, partcol2=val4 ...)]
select_statement2 [FROM from_statement]]

注意：

一般情况下，单个 SQL 里最多可以写 256 路输出，超过 256 路，则报语法错误。

在一个 multi insert 中：
对于分区表，同一个目标分区不允许出现多次。

对于未分区表，该表不能出现多次。

对于同一张分区表的不同分区，不能同时有 Insert overwrite 和 Insert into 操作，否则报错返回。

示例如下：

create table sale_detail_multi like sale_detail;
from sale_detail
insert overwrite table sale_detail_multi partition (sale_date='2010', region='china' )
select shop_name, customer_id, total_price where .....
insert overwrite table sale_detail_multi partition (sale_date='2011', region='china' )
select shop_name, customer_id, total_price where .....;
-- 成功返回，将 sale_detail 的数据插入到 sales 里的 2010 年及 2011 年中国大区的销售记录中。
from sale_detail
insert overwrite table sale_detail_multi partition (sale_date='2010', region='china' )
select shop_name, customer_id, total_price
insert overwrite table sale_detail_multi partition (sale_date='2010', region='china' )
select shop_name, customer_id, total_price;
-- 出错返回，同一分区出现多次。
from sale_detail
insert overwrite table sale_detail_multi partition (sale_date='2010', region='china' )
select shop_name, customer_id, total_price
insert into table sale_detail_multi partition (sale_date='2011', region='china' )
select shop_name, customer_id, total_price;
-- 出错返回，同一张表的不同分区，不能同时有 insert overwrite 和 insert into 操作。

输出到动态分区（DYNAMIC PARTITION）

在 Insert overwrite 到一张分区表时，可以在语句中指定分区的值。也可以用另外一种更加灵活的方式，在分区中指定一个分区列名，但不给出值。相应地，在 select 子句中的对应列来提供分区的值。
命令格式如下：

insert overwrite table tablename partition (partcol1, partcol2 ...) select_statement from from_statement;

注意：

select_statement 字段中，后面的字段将提供目标表动态分区值。如目标表就一级动态分区，则 select_statement 最后一个字段值即为目标表的动态分区值。

目前，在使用动态分区功能的 SQL 中，在分布式环境下，单个进程最多只能输出 512 个动态分区，否则引发运行时异常。

在现阶段，任意动态分区 SQL 不允许生成超过 2000 个动态分区，否则引发运行时异常。

动态生成的分区值不允许为 NULL，也不支持含有特殊字符和中文，否则会引发异常,如：FAILED: ODPS-0123031:Partition exception - invalid dynamic partition value: province=xxx。

如果目标表有多级分区，在运行 Insert 语句时允许指定部分分区为静态，但是静态分区必须是高级分区。

动态分区的示例如下：

create table total_revenues (revenue bigint) partitioned by (region string);
insert overwrite table total_revenues partition(region)
select total_price as revenue, region
from sale_detail;

按照上述写法，在 SQL 运行之前，是不知道会产生哪些分区的，只有在 select 运行结束后，才能由 region 字段产生的值确定会产生哪些分区，这也是为什么叫做 动态分区 的原因。
其他示例如下：

create table sale_detail_dypart like sale_detail;--创建示例目标表

--示例一：
insert overwrite table sale_detail_dypart partition (sale_date, region)
select shop_name,customer_id,total_price,sale_date,region from sale_detail;
-- 成功返回;

此时 sale_detail 表中，sale_date 的值决定目标表的 sale_date 分区值，region 的值决定目标表的 region 分区值。
动态分区中，select_statement 字段和目标表动态分区的对应是按字段顺序决定的。如该示例中，select 语句若写成select shop_name,customer_id,total_price,region,sale_date from sale_detail;则 sale_detail 表中，region 值决定决定目标表的 sale_date 分区值，sale_date 的值决定目标表的 region 分区值。

--示例二：
insert overwrite table sale_detail_dypart partition (sale_date='2013', region)
select shop_name,customer_id,total_price,region from sale_detail;
-- 成功返回，多级分区，指定一级分区

--示例三：
insert overwrite table sale_detail_dypart partition (sale_date='2013', region)
select shop_name,customer_id,total_price from sale_detail;
-- 失败返回，动态分区插入时，动态分区列必须在select列表中

--示例四：
insert overwrite table sales partition (region='china', sale_date)
select shop_name,customer_id,total_price,region from sale_detail;
-- 失败返回，不能仅指定低级子分区，而动态插入高级分区

另外，旧版 MaxCompute 在进行动态分区时，如果分区列的类型与对应 select 列表中列的类型不严格一致，会报错。MaxCompute2.0 则支持隐式类型转换，示例如下：

create table parttable(a int, b double) partitioned by (p string);
insert into parttable partition(p) (p, a) select key, value, current_timestmap() from src;
select * from parttable;

执行上述语句后返回结果如下：

a	b	c
0	NULL	2017-01-23 22:30:47.130406621
0	NULL	2017-01-23 22:30:47.130406621

VALUES

通常在业务测试阶段，需要给一个小数据表准备些基本数据，您可以通过 INSERT … VALUES 的方法快速对测试表写入一些测试数据。
命令格式如下：

INSERT OVERWRITE|INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)][co1name1,colname2...] VALUES (col1_value,col2_value,...)[,(col1_value,col2_value,...),...]

示例一：

drop table if exists srcp;
create table if not exists srcp (key string ,value bigint) partitioned by (p string);
insert into table srcp partition (p='abc') values ('a',1),('b',2),('c',3);

Insert…values 语句执行成功后，查询表 srcp 分区 p=’abc’，结果如下：

+-----+------------+---+
| key | value | p |
+-----+------------+---+
| a | 1 | abc |
| b | 2 | abc |
| c | 3 | abc |
+-----+------------+---+

当表有很多列，而准备数据的时候希望只插入部分列的数据，此时可以用插入列表功能如下。
示例二：

drop table if exists srcp;
create table if not exists srcp (key string ,value bigint) partitioned by (p string);
insert into table srcp partition (p)(key,p) values ('d','20170101'),('e','20170101'),('f','20170101');

Insert…values 语句执行成功后，查询表 srcp 分区 p=’20170101’，结果如下：

+-----+------------+---+
| key | value | p |
+-----+------------+---+
| d | NULL | 20170101 |
| e | NULL | 20170101 |
| f | NULL | 20170101 |
+-----+------------+---+

对于在 values 中没有制定的列，可以看到取缺省值为 NULL。插入列表功能不一定和 values 一起用，对于 Insert into…select…，同样可以使用。
Insert…values 有一个限制：values 必须是常量，但是有时候希望在插入的数据中进行一些简单的运算，此时可以使用 MaxCompute 的 values table 功能，详情见示例三。
示例三：

drop table if exists srcp;
create table if not exists srcp (key string ,value bigint) partitioned by (p string);
insert into table srcp partition (p) select concat(a,b), length(a)+length(b),'20170102' from values ('d',4),('e',5),('f',6) t(a,b);

其中的 values (…), (…) t (a, b)，相当于定义了一个名为 t，列为 a，b的表，类型为（a string，bbigint），其中的类型从 values 列表中推导。这样在不准备任何物理表的时候，可以模拟一个有任意数据的，多行的表，并进行任意运算。
Insert…values 语句执行成功后，查询表 srcp 分区 p=’20170102’，结果如下：

+-----+------------+---+
| key | value | p |
+-----+------------+---+
| d4 | 2 | 20170102 |
| e5 | 2 | 20170102 |
| f6 | 2 | 20170102 |
+-----+------------+---+

实际上，values 表并不限于在 Insert 语句中使用，任何 DML 语句都可以使用。
还有一种 values 表的特殊形式，如下所示：

select abs(-1), length('abc'), getdate();

如上述语句所示，可以不写 from 语句，直接执行 select，只要 select的表达式列表不用任何上游表数据就可以。其底层实现为从一个 1 行，0 列的匿名 values 表选取。这样，在希望测试一些函数，比如自己的UDF 等时，便不用再手工创建 DUAL 表。

探索云世界

热门

云计算

大数据

云原生

人工智能

数据库

开发与运维

活动广场

任务中心

训练营

直播

乘风者计划

下载

镜像站

技术资料

MaxCompute用户指南：SQL：INSERT操作

相关文章