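Installing Kudu
Kudu is already bundled with the CDP Runtime, so installation is straightforward: assign the master and tablet server roles to hosts, then configure the data directories.
After installation, Kudu has to be enabled explicitly in Impala.
To avoid adding the kudu.master_addresses property to TBLPROPERTIES every time a table is created, also set the Kudu master address in Impala's advanced configuration: --kudu_master_hosts=192.168.0.207:7051
Creating a Kudu table in impala-shell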
[root@cdh2 ~]# impala-shell
Starting Impala Shell without Kerberos authentication
Opened TCP connection to cdh2.macro.com:21000
Connected to cdh2.macro.com:21000
Server version: impalad version 3.4.0-SNAPSHOT RELEASE (build 25402784335c39cc24076d71dab7a3ccbd562094)
***********************************************************************************
Welcome to the Impala shell.
(Impala Shell v3.4.0-SNAPSHOT (2540278) built on Wed Aug 5 11:07:32 UTC 2020)

You can change the Impala daemon that you're connected to by using the CONNECT command.
To see how Impala will plan to run your query without actually executing it, use the EXPLAIN command. You can change the level of detail in the EXPLAIN output by setting the EXPLAIN_LEVEL query option.
***********************************************************************************
[cdh2.macro.com:21000] default> CREATE TABLE my_first_table
                              > (
                              >   id BIGINT,
                              >   name STRING,
                              >   PRIMARY KEY(id)
                              > )
                              > PARTITION BY HASH PARTITIONS 16
                              > STORED AS KUDU
                              > TBLPROPERTIES (
                              >   'kudu.master_addresses' = 'cdh2.macro.com:7051'
                              > );
Query: CREATE TABLE my_first_table
(
  id BIGINT,
  name STRING,
  PRIMARY KEY(id)
)
PARTITION BY HASH PARTITIONS 16
STORED AS KUDU
TBLPROPERTIES (
  'kudu.master_addresses' = 'cdh2.macro.com:7051'
)
+-------------------------+
| summary                 |
+-------------------------+
| Table has been created. |
+-------------------------+
Fetched 1 row(s) in 2.35s
[cdh2.macro.com:21000] default> desc formatted my_first_table;
Query: describe formatted my_first_table
+------------------------------+------------------------------------------------------------------------------+------------------------------------------------+
| name                         | type                                                                         | comment                                        |
+------------------------------+------------------------------------------------------------------------------+------------------------------------------------+
| # col_name                   | data_type                                                                    | comment                                        |
|                              | NULL                                                                         | NULL                                           |
| id                           | bigint                                                                       | NULL                                           |
| name                         | string                                                                       | NULL                                           |
|                              | NULL                                                                         | NULL                                           |
| # Detailed Table Information | NULL                                                                         | NULL                                           |
| Database:                    | default                                                                      | NULL                                           |
| OwnerType:                   | USER                                                                         | NULL                                           |
| Owner:                       | root                                                                         | NULL                                           |
| CreateTime:                  | Sat Sep 12 16:50:11 CST 2020                                                 | NULL                                           |
| LastAccessTime:              | UNKNOWN                                                                      | NULL                                           |
| Retention:                   | 0                                                                            | NULL                                           |
| Location:                    | hdfs://cdh2.macro.com:8020/warehouse/tablespace/external/hive/my_first_table | NULL                                           |
| Table Type:                  | EXTERNAL_TABLE                                                               | NULL                                           |
| Table Parameters:            | NULL                                                                         | NULL                                           |
|                              | EXTERNAL                                                                     | TRUE                                           |
|                              | TRANSLATED_TO_EXTERNAL                                                       | TRUE                                           |
|                              | external.table.purge                                                         | TRUE                                           |
|                              | kudu.master_addresses                                                        | cdh2.macro.com:7051                            |
|                              | kudu.table_name                                                              | impala::default.my_first_table                 |
|                              | storage_handler                                                              | org.apache.hadoop.hive.kudu.KuduStorageHandler |
|                              | transient_lastDdlTime                                                        | 1599900611                                     |
|                              | NULL                                                                         | NULL                                           |
| # Storage Information        | NULL                                                                         | NULL                                           |
| SerDe Library:               | org.apache.hadoop.hive.kudu.KuduSerDe                                        | NULL                                           |
| InputFormat:                 | org.apache.hadoop.hive.kudu.KuduInputFormat                                  | NULL                                           |
| OutputFormat:                | org.apache.hadoop.hive.kudu.KuduOutputFormat                                 | NULL                                           |
| Compressed:                  | No                                                                           | NULL                                           |
| Num Buckets:                 | 0                                                                            | NULL                                           |
| Bucket Columns:              | []                                                                           | NULL                                           |
| Sort Columns:                | []                                                                           | NULL                                           |
+------------------------------+------------------------------------------------------------------------------+------------------------------------------------+
Fetched 31 row(s) in 0.83s
The output shows that the Kudu table was created successfully.
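The point of setting --kudu_master_hosts in Impala is that the TBLPROPERTIES clause becomes optional. The following is a minimal sketch, not part of the original setup, of a helper that renders the same CREATE TABLE statement with or without that clause. The function name build_kudu_ddl and its parameters are hypothetical and exist only for illustration; the table, columns, and master address are the ones used in this walkthrough.

```python
def build_kudu_ddl(table, columns, primary_key, hash_partitions,
                   master_addresses=None):
    """Render an Impala CREATE TABLE statement for a Kudu table.

    columns is a list of (name, type) pairs. master_addresses is only
    needed when Impala was NOT started with --kudu_master_hosts.
    (Hypothetical helper for illustration, not an Impala API.)
    """
    lines = [f"CREATE TABLE {table}", "("]
    lines += [f"  {name} {ctype}," for name, ctype in columns]
    lines.append(f"  PRIMARY KEY({', '.join(primary_key)})")
    lines.append(")")
    lines.append(f"PARTITION BY HASH PARTITIONS {hash_partitions}")
    lines.append("STORED AS KUDU")
    if master_addresses:
        # Per-table override of the Kudu master address.
        lines.append("TBLPROPERTIES (")
        lines.append(f"  'kudu.master_addresses' = '{master_addresses}'")
        lines.append(")")
    return "\n".join(lines) + ";"


if __name__ == "__main__":
    cols = [("id", "BIGINT"), ("name", "STRING")]
    # With --kudu_master_hosts configured, no TBLPROPERTIES is emitted.
    print(build_kudu_ddl("my_first_table", cols, ["id"], 16))
```

Passing master_addresses="cdh2.macro.com:7051" reproduces the statement used in the session above; leaving it out yields the shorter form that relies on the Impala-wide setting.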
Problems encountered
1. Error during startup
Check failed: _s.ok() Bad status: Invalid argument: Unable to initialize catalog manager: Failed to initialize sys tables async: on-disk master list
Solution
Stop the master and tserver, then delete the data directories left over from the previous installation:
/kudu_master/fswal_dir
/kudu_master/fsdata_dir
/kudu_tablet/fswal_dir
/kudu_tablet/fsdata_dir
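The cleanup can be scripted. The sketch below is one possible way to do it, assuming the four directory paths from this walkthrough (adjust them to your own configuration). The helper name clean_residual_dirs is hypothetical. As a deliberate variation on deleting the directories outright, it empties each directory and keeps the directory itself, so ownership and permissions are preserved; it also defaults to a dry run.

```python
import shutil
from pathlib import Path

# Directory paths as used in this walkthrough; these are assumptions
# about your layout and must match your own Kudu configuration.
RESIDUAL_DIRS = [
    "/kudu_master/fswal_dir",
    "/kudu_master/fsdata_dir",
    "/kudu_tablet/fswal_dir",
    "/kudu_tablet/fsdata_dir",
]


def clean_residual_dirs(dirs, dry_run=True):
    """Empty each existing directory in dirs; return the paths handled.

    With dry_run=True nothing is deleted, the function only reports
    which directories would be emptied. Missing paths are skipped.
    """
    handled = []
    for d in dirs:
        path = Path(d)
        if not path.is_dir():
            continue
        handled.append(str(path))
        if dry_run:
            continue
        for child in path.iterdir():
            if child.is_dir() and not child.is_symlink():
                shutil.rmtree(child)   # remove a whole subtree
            else:
                child.unlink()         # remove a file or symlink
    return handled


if __name__ == "__main__":
    # Dry run: prints the directories that exist and would be emptied.
    print(clean_residual_dirs(RESIDUAL_DIRS))
```

Call clean_residual_dirs(RESIDUAL_DIRS, dry_run=False) only after both the master and tserver roles are stopped.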
2. Table creation hangs for a long time and then fails with:
CreateTablet RPC failed for tablet :not authorized: client connection negotiation failed: client connection to 192.168.0.207:7050: FATAL_UNAUTHORIZED: not authorized: unencrypted connections from publicly routable IPs are prohibited.
Add the following to the global gflagfile:
--rpc_encryption=disabled
--rpc_authentication=disabled
--trusted_subnets=0.0.0.0/0
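To see why the subnet setting makes the error disappear: Kudu accepts unencrypted or unauthenticated connections only from addresses inside its trusted subnets, and 0.0.0.0/0 matches every IPv4 address. The sketch below illustrates that membership test with Python's standard ipaddress module. It is not Kudu's implementation; the helper name is_trusted is hypothetical, and the default subnet list is my recollection of Kudu's documented default for the trusted_subnets flag, so verify it against your version.

```python
import ipaddress

# Assumed default value of Kudu's trusted_subnets flag (loopback,
# private, and link-local ranges). Check the docs for your version.
ASSUMED_DEFAULT_TRUSTED = (
    "127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,169.254.0.0/16"
)


def is_trusted(client_ip, trusted_subnets):
    """Return True if client_ip falls inside any comma-separated subnet."""
    addr = ipaddress.ip_address(client_ip)
    nets = [
        ipaddress.ip_network(s.strip())
        for s in trusted_subnets.split(",")
        if s.strip()
    ]
    return any(addr in net for net in nets)


if __name__ == "__main__":
    # A publicly routable client is rejected under the assumed default...
    print(is_trusted("8.8.8.8", ASSUMED_DEFAULT_TRUSTED))
    # ...but accepted once every IPv4 address is trusted.
    print(is_trusted("8.8.8.8", "0.0.0.0/0"))
```

Because 0.0.0.0/0 trusts every client, a narrower list covering only the cluster's own networks achieves the same fix with less exposure.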
3. impala-shell reports the error directly
FATAL_UNAUTHORIZED: not authorized: unencrypted connections from publicly routable IPs are prohibited
Solution:
Apply the same settings in Impala:
--rpc_encryption=disabled
--rpc_authentication=disabled
--trusted_subnets=0.0.0.0/0
4. Error when creating a table
ERROR: ImpalaRuntimeException: Error creating Kudu table 'impala::default.my_first_table' CAUSED BY: NonRecoverableException: not enough live tablet servers to create a table with the requested replication factor 3; 2 tablet servers are alive
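This happens because Kudu's default replication factor is 3, whereas my single-node setup can hold only 1 replica. (HDFS differs here: it does not enforce this, and a single replica works.) I therefore changed the replica setting in CM, as shown in the figure below: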