1. Data format (HDFS data): fields delimited by \t
id+"\t"+name
hdfs://ns1/user/traffic_etl/dt=2019-10-15/1.lzo
hdfs://ns1/user/traffic_etl/dt=2019-10-16/2.lzo
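As a minimal sketch of the layout above, the following Java helper joins field values with a tab and builds a file path under a dt=YYYY-MM-DD partition directory. The class name TsvRowSketch, its method names, and the sample values are hypothetical and exist only for illustration.

```java
import java.time.LocalDate;

// Hypothetical helper, not part of the original notes.
class TsvRowSketch {
    static final String FIELD_SPLIT = "\t";

    // Joins field values into one line: id + "\t" + name + ...
    static String formatRow(String... fields) {
        return String.join(FIELD_SPLIT, fields);
    }

    // Builds <baseDir>/dt=<yyyy-MM-dd>/<fileName>.
    // LocalDate.toString() already yields the ISO yyyy-MM-dd form.
    static String partitionPath(String baseDir, LocalDate day, String fileName) {
        return baseDir + "/dt=" + day + "/" + fileName;
    }

    public static void main(String[] args) {
        System.out.println(formatRow("1001", "alice"));
        System.out.println(partitionPath("hdfs://ns1/user/traffic_etl",
                LocalDate.of(2019, 10, 15), "1.lzo"));
    }
}
```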
2. Create the table
CREATE EXTERNAL TABLE `app.app_ord_etl_bak`(
  `id` string,
  `orderid` string,
  `skuid` string,
  `parentid` string,
  `orderstatus` string,
  `ordertype` string,
  `yn` string,
  `createdat` string,
  `createattime` string,
  `updatedat` string,
  `cancelat` string,
  `chuguanat` string,
  `payedat` string,
  `writeat` string,
  `splitat` string,
  `chuguan` string,
  `payed` string,
  `split` string,
  `paymode` string,
  `idpaymenttype` string,
  `cont` string)
PARTITIONED BY (dt string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION 'hdfs://ns1/user/jrdw/rbdm.db/order-table-bak';
3. Load data
alter table fdm_traffic_etl add partition (dt='2019-10-15');
Data is picked up automatically after that.
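When several days need to be registered at once, the ALTER TABLE statements can be generated rather than typed by hand. The sketch below is one possible way to do that in Java; the class and method names are hypothetical, and it only builds the statement text (running it against Hive is left out). It adds IF NOT EXISTS so that re-running the statements does not fail on partitions that are already registered.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original notes.
class AddPartitionSketch {
    // Returns one ALTER TABLE statement per day in [start, end], inclusive.
    static List<String> addPartitionStatements(String table, LocalDate start, LocalDate end) {
        List<String> statements = new ArrayList<>();
        for (LocalDate day = start; !day.isAfter(end); day = day.plusDays(1)) {
            statements.add("ALTER TABLE " + table
                    + " ADD IF NOT EXISTS PARTITION (dt='" + day + "');");
        }
        return statements;
    }

    public static void main(String[] args) {
        for (String sql : addPartitionStatements("fdm_traffic_etl",
                LocalDate.of(2019, 10, 15), LocalDate.of(2019, 10, 16))) {
            System.out.println(sql);
        }
    }
}
```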
Always use \t as the delimiter; anything else is very risky.
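These notes do not spell out the risk. One related problem with any delimited text file is a field value that itself contains the delimiter or a line break, which shifts columns or splits the row. A minimal sketch of guarding against that, assuming it is acceptable to replace such characters with a space (the class and method names are hypothetical):

```java
// Hypothetical helper, not part of the original notes.
class FieldSanitizerSketch {
    // Replaces tab, carriage return and newline inside a value with a space.
    // A null value becomes an empty field (an illustrative choice).
    static String sanitize(String value) {
        if (value == null) {
            return "";
        }
        return value.replaceAll("[\\t\\r\\n]", " ");
    }

    // Sanitizes every field, then joins them with a tab.
    static String formatRow(String... fields) {
        StringBuilder row = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                row.append('\t');
            }
            row.append(sanitize(fields[i]));
        }
        return row.toString();
    }

    public static void main(String[] args) {
        // The embedded tab and newline are replaced, so the row keeps two columns.
        System.out.println(formatRow("a\tb", "c\nd"));
    }
}
```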
Other delimiters
private static final char DORIS_FIELD_SPLIT = (char)
id+DORIS_FIELD_SPLIT+name
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001'
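The Java constant above is truncated in these notes. Assuming it is meant to be the control character U+0001, which is what '\001' (octal) denotes in the Hive clause, a row could be built as in the sketch below. The class name and the FIELD_SPLIT constant are hypothetical stand-ins, not the original code.

```java
// Hypothetical sketch, not the original code.
class CtrlADelimiterSketch {
    // Assumption: the delimiter is (char) 1, matching '\001' in the table definition.
    private static final char FIELD_SPLIT = (char) 1;

    // String + char + String is plain string concatenation in Java.
    static String formatRow(String id, String name) {
        return id + FIELD_SPLIT + name;
    }

    public static void main(String[] args) {
        String row = formatRow("1001", "alice");
        // The delimiter is not printable, so show its numeric code instead.
        System.out.println((int) row.charAt(4));
    }
}
```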
Repair partitions
MSCK REPAIR TABLE table_name;
Creating tables, and the difference between internal (managed) and external tables:
https://www.cnblogs.com/EnzoDin/p/6951181.html
Hadoop commands
https://www.cnblogs.com/ldsweely/p/9428459.html