前言
Hive 不存储数据,是表到HDFS文件的映射关系。在HQL开发中,我们主要关注语法,今天就带着小伙伴们来了解一下每个 DDL 语句的语义。
1. 数据库
1.1 查询所有数据库
SHOW DATABASES;
1.2 创建库
CREATE [REMOTE] (DATABASE|SCHEMA) [IF NOT EXISTS] database_name [COMMENT database_comment] [LOCATION hdfs_path] [MANAGEDLOCATION hdfs_path] [WITH DBPROPERTIES (property_name=property_value, ...)];
- REMOTE 也是在Hive 4.0.0 版本中添加的,主要是为了与其他系统建立数据连接使用的。
- DATABASE 和 SCHEMA可以互换,语义一样。
- IF NOT EXISTS:表示如果库名为database_name的数据库不存在的情况下创建;如果存在,就不再创建。
- MANAGEDLOCATION是在 Hive 4.0.0 版本中添加的。LOCATION现在指的是外部表的默认目录,MANAGEDLOCATION指的是内部表的默认路径。建议MANAGEDLOCATION位于
metastore.warehouse.dir
中,这样所有被管理的表在同一个根目录下,便于使用统一管理策略。另外,还可以与metastore.warehouse.tenant.colocation
一起使用,使其指向仓库根目录之外的目录,以具有基于租户的公共根目录,在该目录中可以设置配额和其他策略。 - WITH DBPROPERTIES:用来添加配置项
注意:[] 包裹的语句是可选项,()包裹的语句是必选项!
1.3 修改库
- 更新配置项:
ALTER (DATABASE|SCHEMA) database_name SET DBPROPERTIES (property_name=property_value, ...);
- 更新 Owner:
ALTER (DATABASE|SCHEMA) database_name SET OWNER [USER|ROLE] user_or_role;
- 更新 Location
ALTER (DATABASE|SCHEMA) database_name SET LOCATION fs_path;
1.4 删除库
DROP (DATABASE|SCHEMA) [IF EXISTS] database_name [RESTRICT|CASCADE];
- RESTRICT是默认值,如果数据库不为null,会删除失败。
1.5 use库
进入database_name数据库:
USE database_name;
2. 表
2.1 查看所有表
SHOW TABLES;
2.2 创建表
CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name -- (Note: TEMPORARY available in Hive 0.14.0 and later) [(col_name data_type [column_constraint_specification] [COMMENT col_comment], ... [constraint_specification])] [COMMENT table_comment] [PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)] [CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS] [SKEWED BY (col_name, col_name, ...) -- (Note: Available in Hive 0.10.0 and later)] ON ((col_value, col_value, ...), (col_value, col_value, ...), ...) [STORED AS DIRECTORIES] [ [ROW FORMAT row_format] [STORED AS file_format] | STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)] -- (Note: Available in Hive 0.6.0 and later) ] [LOCATION hdfs_path] [TBLPROPERTIES (property_name=property_value, ...)] -- (Note: Available in Hive 0.6.0 and later) [AS select_statement]; -- (Note: Available in Hive 0.5.0 and later; not supported for external tables) CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name LIKE existing_table_or_view_name [LOCATION hdfs_path]; data_type : primitive_type | array_type | map_type | struct_type | union_type -- (Note: Available in Hive 0.7.0 and later) primitive_type : TINYINT | SMALLINT | INT | BIGINT | BOOLEAN | FLOAT | DOUBLE | DOUBLE PRECISION -- (Note: Available in Hive 2.2.0 and later) | STRING | BINARY -- (Note: Available in Hive 0.8.0 and later) | TIMESTAMP -- (Note: Available in Hive 0.8.0 and later) | DECIMAL -- (Note: Available in Hive 0.11.0 and later) | DECIMAL(precision, scale) -- (Note: Available in Hive 0.13.0 and later) | DATE -- (Note: Available in Hive 0.12.0 and later) | VARCHAR -- (Note: Available in Hive 0.12.0 and later) | CHAR -- (Note: Available in Hive 0.13.0 and later) array_type : ARRAY < data_type > map_type : MAP < primitive_type, data_type > struct_type : STRUCT < col_name : data_type [COMMENT col_comment], ...> union_type : UNIONTYPE < data_type, data_type, ... > -- (Note: Available in Hive 0.7.0 and later) row_format : DELIMITED [FIELDS TERMINATED BY char [ESCAPED BY char]] [COLLECTION ITEMS TERMINATED BY char] [MAP KEYS TERMINATED BY char] [LINES TERMINATED BY char] [NULL DEFINED AS char] -- (Note: Available in Hive 0.13 and later) | SERDE serde_name [WITH SERDEPROPERTIES (property_name=property_value, property_name=property_value, ...)] file_format: : SEQUENCEFILE | TEXTFILE -- (Default, depending on hive.default.fileformat configuration) | RCFILE -- (Note: Available in Hive 0.6.0 and later) | ORC -- (Note: Available in Hive 0.11.0 and later) | PARQUET -- (Note: Available in Hive 0.13.0 and later) | AVRO -- (Note: Available in Hive 0.14.0 and later) | JSONFILE -- (Note: Available in Hive 4.0.0 and later) | INPUTFORMAT input_format_classname OUTPUTFORMAT output_format_classname column_constraint_specification: : [ PRIMARY KEY|UNIQUE|NOT NULL|DEFAULT [default_value]|CHECK [check_expression] ENABLE|DISABLE NOVALIDATE RELY/NORELY ] default_value: : [ LITERAL|CURRENT_USER()|CURRENT_DATE()|CURRENT_TIMESTAMP()|NULL ] constraint_specification: : [, PRIMARY KEY (col_name, ...) DISABLE NOVALIDATE RELY/NORELY ] [, PRIMARY KEY (col_name, ...) DISABLE NOVALIDATE RELY/NORELY ] [, CONSTRAINT constraint_name FOREIGN KEY (col_name, ...) REFERENCES table_name(col_name, ...) DISABLE NOVALIDATE [, CONSTRAINT constraint_name UNIQUE (col_name, ...) DISABLE NOVALIDATE RELY/NORELY ] [, CONSTRAINT constraint_name CHECK [check_expression] ENABLE|DISABLE NOVALIDATE RELY/NORELY ]
注意:由于创建表涉及的内容比较多,会单独拿出来另写一篇文章详细介绍的!
2.3 修改表
- 重命名:
ALTER TABLE table_name RENAME TO new_table_name;
- 更新配置项:
ALTER TABLE table_name SET TBLPROPERTIES (property_name = property_value, property_name = property_value, ... );
- 更新 Location:
ALTER TABLE table_name [PARTITION partition_spec] SET LOCATION fs_path;
注意:当添加PARTITION partition_spec时,该操作将应用于相应的分区,而不是表(下同)。
- 更新文件格式:
ALTER TABLE table_name [PARTITION partition_spec] SET FILEFORMAT file_format;
- 更新 SerDe 配置项:
ALTER TABLE table_name [PARTITION partition_spec] SET SERDE serde_class_name [WITH SERDEPROPERTIES serde_properties]; ALTER TABLE table_name [PARTITION partition_spec] SET SERDEPROPERTIES serde_properties; serde_properties: : (property_name = property_value, property_name = property_value, ... )
- 添加分区:
ALTER TABLE table_name ADD [IF NOT EXISTS] (PARTITION partition_spec [LOCATION fs_path])+;
- 删除分区:
ALTER TABLE table_name DROP [IF EXISTS] PARTITION partition_spec[, PARTITION partition_spec, ...];
- 添加/替换列:
ALTER TABLE table_name ADD|REPLACE COLUMNS (col_name data_type [COMMENT col_comment], ...) [CASCADE|RESTRICT]
- 修改列:
ALTER TABLE table_name CHANGE [COLUMN] col_old_name col_new_name column_type [COMMENT col_comment] [FIRST|AFTER column_name] [CASCADE|RESTRICT];
2.4 删除表
DROP TABLE [IF EXISTS] table_name;
3. 视图
3.1 创建视图
CREATE VIEW [IF NOT EXISTS] view_name [(column_name, ...) ] [COMMENT view_comment] [TBLPROPERTIES (property_name = property_value, ...)] AS SELECT ...;
3.2 修改视图
- 重命名:
ALTER VIEW view_name RENAME TO new_view_name;
- 更新配置项:
ALTER VIEW view_name SET TBLPROPERTIES (property_name = property_value, ... );
- 更新查询语句:
ALTER VIEW view_name AS select_statement;
3.3 删除视图
DROP VIEW [IF EXISTS] view_name;
4. 函数
4.1 查看所有函数
SHOW FUNCTIONS;
4.2 创建函数
CREATE FUNCTION function_name AS class_name;
4.3 删除函数
DROP FUNCTION [IF EXISTS] function_name;