简单建表操作:
建库
======
1. 创建建库test
create table test;
2. 没有test库,就创建库
create table if not exists test;
设置显示
========
1. 设置显示当前库名称
set hive.cli.print.current.db=true;
2. 设置查询时显示表头(select查询时可以显示列名称)
set hive.cli.print.header=true;
表操作
建表
======
1. 建内表t1
create table t1(a int, b string);
2. 没有表t1时才创建t
create table if not exists t1(a int, b string);
3. 创建表,指定字段分隔符为','
create table if not exists t1(a int, b string) row format delimited fields terminated by ',';
4. 创建外表
create external table if not exists t1(a int, b string) row format delimited fields terminated by ',';
5. 创建外表,并指定数据在HDFS上的位置
create external table if not exists t1(a int, b string) row format delimited fields terminated by ',' location '/t1_data';
6. 复制表结构
create table if not exists t2 like t1;
7. 视图
create view v_t1 as select id,name,age from t1;
=====
删除
====
1. 删除视图
1. drop view v_t1; 2. 3. drop view if exists v_t1;
2. 删除表
1. drop table t2; 2. 3. drop table if exists t2;
3. 删除库
drop database if exists test; 这个只能删除为空的库
4. 删除库及库中的表及视图
drop database if exists test cascade;
修改
=====
1. 修改表名称
alter table t1 rename to t2;
2. 改列名,将name改为username,并放在age列后面
alter table t1 change name username string after age;
3. 增加列
alter table t1 add columns(sex char(1));
4. 重新定义列
alter table t1 replace columns(id int, name string, sex char(1),age int);
查看
=====
1. 查看库
1. show databases; 2. 3. show databases like 'de.*';
2. 查看表和视图
1. show tables; 2. 3. show tables like 't.*'; 4. 5. show tables in db like 't.*';
看描述信息
==========
1. 查看库描述信息
desc database db;
2. 查看表描述信息
1. desc t1; 2. 3. desc db.t1;
3. 查看视图描述信息
看详细描述
==========
1. 查看表详细描述
1. desc extended t1; 2. 3. desc extended db.t1;
2. 查看视图详细描述
加载数据
===========
1. HDFS加载
将数据文件复制到表存储的目录下
2. 本地加载
load data local inpath '/opt/person.txt' overwrite into table t1;
3. 从HDFS上加载(数据文件位于HDFS上)
load data inpath '/data/person.txt' overwrite into table t1;
4. 外部表,建表时指定HDFS上数据的目录
查询并导入数据
==============
1. 插入到表中,覆盖原有数据
insert overwrite table t2 select * from t1 where age > 20;
2. 追加到表中
insert into table t2 select * from t1 where age > 20;
查询并导出数据
==============
1.查询结果导出到本地(覆盖)
1. insert overwrite local directory '/opt/person.csv' 2. 3. row format delimited fields terminated by ':' 4. 5. select * from from t1 where age > 20;
2.查询结果导出到本地(追加)
1. insert local directory '/opt/person.csv' 2. 3. row format delimited fields terminated by ':' 4. 5. select * from from t1 where age > 20;
3.查询结果导出到HDFS(覆盖)
1. insert overwrite directory '/data/person.csv' 2. 3. row format delimited fields terminated by ':' 4. 5. select * from from t1 where age > 20;
表连接
=======
1. 内连接
1. select a.*,b.* from a join b on a.id=b.id 2. 3. select a.*,b.* from a inner join b on a.id=b.id
2. 左连接
select a.*,b.* from a left outer join b on a.id=b.id
3. 右连接
select a.*,b.* from a right outer join b on a.id=b.id
4. 外连接
select a.*,b.* from a full outer join b on a.id=b.id
查询条件
========
where
排序
=====
oder by
asc 缺省
desc
函数
======
distinct
max
min
avg
count
sum
limit输出限制
=============
分组查询
========
1. group by
2. 分组函数
3. having 过滤条件