hive metastore配置kerberos认证

简介: hive从3.0.0开始提供hive metastore单独服务作为像presto、flink、spark等组件的元数据中心。但是默认情况下hive metastore在启动之后是不需要进行认证就可以访问的。所以本文基于大数据组件中流行的kerberos认证方式,对hive metastore进行认证配置。

如果您还不了解如何单独启用hive metastore服务,那么您可以参考下述文章。

Presto使用Docker独立运行Hive Standalone Metastore管理MinIO(S3)

kdc安装

已知安装kdc的主机的hostname为:hadoop

yum install -y krb5-server krb5-libs krb5-auth-dialog krb5-workstation

修改配置文件

修改/var/kerberos/krb5kdc/kdc.conf,默认内容为

[kdcdefaults]
 kdc_ports = 88
 kdc_tcp_ports = 88

[realms]
 EXAMPLE.COM = {
  #master_key_type = aes256-cts
  acl_file = /var/kerberos/krb5kdc/kadm5.acl
  dict_file = /usr/share/dict/words
  admin_keytab = /var/kerberos/krb5kdc/kadm5.keytab
  supported_enctypes = aes256-cts:normal aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal camellia256-cts:normal camellia128-cts:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal
 }

可修改EXAMPLE.COM为您自己设定的域,例如本文将此设置为BIGDATATOAI.COM

[kdcdefaults]
 kdc_ports = 88
 kdc_tcp_ports = 88

[realms]
 BIGDATATOAI.COM = {
  #master_key_type = aes256-cts
  acl_file = /var/kerberos/krb5kdc/kadm5.acl
  dict_file = /usr/share/dict/words
  admin_keytab = /var/kerberos/krb5kdc/kadm5.keytab
  supported_enctypes = aes256-cts:normal aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal camellia256-cts:normal camellia128-cts:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal
 }

修改/etc/krb5.conf,默认文件为

# Configuration snippets may be placed in this directory as well
includedir /etc/krb5.conf.d/

[logging]
 default = FILE:/var/log/krb5libs.log
 kdc = FILE:/var/log/krb5kdc.log
 admin_server = FILE:/var/log/kadmind.log

[libdefaults]
 dns_lookup_realm = false
 ticket_lifetime = 24h
 renew_lifetime = 7d
 forwardable = true
 rdns = false
 pkinit_anchors = FILE:/etc/pki/tls/certs/ca-bundle.crt
# default_realm = EXAMPLE.COM
 default_ccache_name = KEYRING:persistent:%{uid}

[realms]
# EXAMPLE.COM = {
#  kdc = kerberos.example.com
#  admin_server = kerberos.example.com
# }

[domain_realm]
# .example.com = EXAMPLE.COM
# example.com = EXAMPLE.COM

修改为如下所示,其中,将域设置为BIGDATATOAI.COM,kdc和admin_server设置为hadoop

# Configuration snippets may be placed in this directory as well
includedir /etc/krb5.conf.d/

[logging]
 default = FILE:/var/log/krb5libs.log
 kdc = FILE:/var/log/krb5kdc.log
 admin_server = FILE:/var/log/kadmind.log

[libdefaults]
 dns_lookup_realm = false
 ticket_lifetime = 24h
 renew_lifetime = 7d
 forwardable = true
 rdns = false
 pkinit_anchors = FILE:/etc/pki/tls/certs/ca-bundle.crt
 default_realm = BIGDATATOAI.COM
 default_ccache_name = KEYRING:persistent:%{uid}

[realms]
BIGDATATOAI.COM = {
  kdc = hadoop
  admin_server = hadoop
}

[domain_realm]

初始化kerberos数据库

kdb5_util create -s -r BIGDATATOAI.COM

初始化过程中会要求重复输入kdc数据库的master key,请输入该master key。

[root@hadoop data]# kdb5_util create -s -r BIGDATATOAI.COM
Loading random data
Initializing database '/var/kerberos/krb5kdc/principal' for realm 'BIGDATATOAI.COM',
master key name 'K/M@BIGDATATOAI.COM'
You will be prompted for the database Master Password.
It is important that you NOT FORGET this password.
Enter KDC database master key: 
Re-enter KDC database master key to verify: 

添加管理员用户

kadmin.local

在添加过程中会要求重复输入用户的密码,请输入该密码两次即可。

[root@hadoop data]# kadmin.local 
Authenticating as principal root/admin@BIGDATATOAI.COM with password.
kadmin.local:  addprinc admin/admin@BIGDATATOAI.COM
WARNING: no policy specified for admin/admin@BIGDATATOAI.COM; defaulting to no policy
Enter password for principal "admin/admin@BIGDATATOAI.COM": 
Re-enter password for principal "admin/admin@BIGDATATOAI.COM": 
Principal "admin/admin@BIGDATATOAI.COM" created.

修改/var/kerberos/krb5kdc/kadm5.acl,设置为

*/admin@BIGDATATOAI.COM *

启动相关服务

systemctl start krb5kdc
systemctl start kadmin

使用管理员用户添加principal

kadmin -p admin/admin

进入kadmin客户端之后,添加hive-metastore/hadoop@BIGDATATOAI.COM这个principal。在添加过程中会要求重复输入用户的密码,请输入该密码两次即可。

[root@hadoop data]# kadmin -p admin/admin
Authenticating as principal admin/admin with password.
Password for admin/admin@BIGDATATOAI.COM: 
kadmin:  add_principal hive-metastore/hadoop
WARNING: no policy specified for hive-metastore/hadoop@BIGDATATOAI.COM; defaulting to no policy
Enter password for principal "hive-metastore/hadoop@BIGDATATOAI.COM": 
Re-enter password for principal "hive-metastore/hadoop@BIGDATATOAI.COM": 
Principal "hive-metastore/hadoop@BIGDATATOAI.COM" created.

导出principal

kadmin:  xst -t /root/hive-metastore.keytab -norandkey hive-metastore/hadoop
kadmin: Principal -t does not exist.
kadmin: Principal /root/hive-metastore.keytab does not exist.
kadmin: Principal -norandkey does not exist.
Entry for principal hive-metastore/hadoop with kvno 2, encryption type aes256-cts-hmac-sha1-96 added to keytab FILE:/etc/krb5.keytab.
Entry for principal hive-metastore/hadoop with kvno 2, encryption type aes128-cts-hmac-sha1-96 added to keytab FILE:/etc/krb5.keytab.
Entry for principal hive-metastore/hadoop with kvno 2, encryption type des3-cbc-sha1 added to keytab FILE:/etc/krb5.keytab.
Entry for principal hive-metastore/hadoop with kvno 2, encryption type arcfour-hmac added to keytab FILE:/etc/krb5.keytab.
Entry for principal hive-metastore/hadoop with kvno 2, encryption type camellia256-cts-cmac added to keytab FILE:/etc/krb5.keytab.
Entry for principal hive-metastore/hadoop with kvno 2, encryption type camellia128-cts-cmac added to keytab FILE:/etc/krb5.keytab.
Entry for principal hive-metastore/hadoop with kvno 2, encryption type des-hmac-sha1 added to keytab FILE:/etc/krb5.keytab.
Entry for principal hive-metastore/hadoop with kvno 2, encryption type des-cbc-md5 added to keytab FILE:/etc/krb5.keytab.

hive metastore配置kerberos认证

修改metastore-site.xml

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://192.168.1.3:3306/metastore_2?useSSL=false&serverTimezone=UTC</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>password</value>
    </property>
    <property>
        <name>hive.metastore.event.db.notification.api.auth</name>
        <value>false</value>
    </property>
    <property>
        <name>metastore.thrift.uris</name>
        <value>thrift://localhost:9083</value>
        <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
    </property>
    <property>
        <name>metastore.task.threads.always</name>
        <value>org.apache.hadoop.hive.metastore.events.EventCleanerTask</value>
    </property>
    <property>
        <name>metastore.expression.proxy</name>
        <value>org.apache.hadoop.hive.metastore.DefaultPartitionExpressionProxy</value>
    </property>
    <property>
        <name>metastore.warehouse.dir</name>
        <value>files:///user/hive/warehouse</value>
    </property>
    <property>
        <name>hive.metastore.authentication.type</name>
        <value>kerberos</value>
    </property>
    <property>
        <name>hive.metastore.thrift.impersonation.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>hive.metastore.kerberos.principal</name>
        <value>hive-metastore/hadoop@BIGDATATOAI.COM</value>
    </property>
    <property>
        <name>hive.metastore.sasl.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>hive.metastore.kerberos.keytab.file</name>
        <value>/etc/hive/conf/hive-metastore.keytab</value>
    </property>
</configuration>

由于hive-metastore的kerberos服务依赖于hdfs组件,所以还需要在core-site.xml中新增如下配置:

  <property>
    <name>hadoop.proxyuser.hive-metastore.groups</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hive-metastore.hosts</name>
    <value>*</value>
  </property>

<property>
  <name>hadoop.security.authorization</name>
  <value>true</value>
</property>

<property>
  <name>hadoop.security.auth_to_local</name>
  <value>
  RULE:[2:$1@$0](hive-metastore/.*@.*BIGDATATOAI.COM)s/.*/hive-metastore/
  DEFAULT
  </value>
</property>

<property>
  <name>hadoop.security.authentication</name>
  <value>kerberos</value>
</property>

接下来便可以启动hive metastore

bin/start-metastore

此时直接通过Java API对该HIve Metastore进行访问,如何通过Java API对HIve Metastore进行访问可参考:通过Java API获取Hive Metastore中的元数据信息

package com.zh.ch.bigdata.hms;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hive.metastore.IMetaStoreClient;
import org.apache.hadoop.hive.metastore.RetryingMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.MetaException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;


public class HMSClient {

    public static final Logger LOGGER = LoggerFactory.getLogger(HMSClient.class);

    /**
     * 初始化HMS连接
     * @param conf org.apache.hadoop.conf.Configuration
     * @return IMetaStoreClient
     * @throws MetaException 异常
     */
    public static IMetaStoreClient init(Configuration conf) throws MetaException {
        try {
            return RetryingMetaStoreClient.getProxy(conf, false);
        } catch (MetaException e) {
            LOGGER.error("hms连接失败", e);
            throw e;
        }
    }

    public static void main(String[] args) throws Exception {

        Configuration conf = new Configuration();
        conf.set("hive.metastore.uris", "thrift://192.168.241.134:9083");

        // conf.addResource("hive-site.xml");
        IMetaStoreClient client = HMSClient.init(conf);

        System.out.println("----------------------------获取所有catalogs-------------------------------------");
        client.getCatalogs().forEach(System.out::println);

        System.out.println("------------------------获取catalog为hive的描述信息--------------------------------");
        System.out.println(client.getCatalog("hive").toString());

        System.out.println("--------------------获取catalog为hive的所有database-------------------------------");
        client.getAllDatabases("hive").forEach(System.out::println);
    }
}

得到结果

可见如果不通过kerberos认证的话,是无法访问hive metastore的。

相关文章
|
6月前
|
SQL 数据库 HIVE
记录hive数据库远程访问配置问题
记录hive数据库远程访问配置问题
154 0
|
1月前
|
SQL 存储 分布式计算
Hadoop-16-Hive HiveServer2 HS2 允许客户端远程执行HiveHQL HCatalog 集群规划 实机配置运行
Hadoop-16-Hive HiveServer2 HS2 允许客户端远程执行HiveHQL HCatalog 集群规划 实机配置运行
41 3
|
1月前
|
SQL 存储 数据管理
Hadoop-15-Hive 元数据管理与存储 Metadata 内嵌模式 本地模式 远程模式 集群规划配置 启动服务 3节点云服务器实测
Hadoop-15-Hive 元数据管理与存储 Metadata 内嵌模式 本地模式 远程模式 集群规划配置 启动服务 3节点云服务器实测
57 2
|
3月前
|
SQL 存储 关系型数据库
|
4月前
|
SQL 关系型数据库 MySQL
实时计算 Flink版产品使用问题之如何使用Flink SQL连接带有Kerberos认证的Hive
实时计算Flink版作为一种强大的流处理和批处理统一的计算框架,广泛应用于各种需要实时数据处理和分析的场景。实时计算Flink版通常结合SQL接口、DataStream API、以及与上下游数据源和存储系统的丰富连接器,提供了一套全面的解决方案,以应对各种实时计算需求。其低延迟、高吞吐、容错性强的特点,使其成为众多企业和组织实时数据处理首选的技术平台。以下是实时计算Flink版的一些典型使用合集。
|
6月前
|
SQL 存储 分布式计算
Hive详解、配置、数据结构、Hive CLI
Hive详解、配置、数据结构、Hive CLI
119 0
Hive详解、配置、数据结构、Hive CLI
|
6月前
|
SQL 分布式计算 资源调度
一文看懂 Hive 优化大全(参数配置、语法优化)
以下是对提供的内容的摘要,总长度为240个字符: 在Hadoop集群中,服务器环境包括3台机器,分别运行不同的服务,如NodeManager、DataNode、NameNode等。集群组件版本包括jdk 1.8、mysql 5.7、hadoop 3.1.3和hive 3.1.2。文章讨论了YARN的配置优化,如`yarn.nodemanager.resource.memory-mb`、`yarn.nodemanager.vmem-check-enabled`和`hive.map.aggr`等参数,以及Map-Side聚合优化、Map Join和Bucket Map Join。
334 0
|
6月前
|
SQL HIVE
Hive【基础知识 04】【Hive 属性配置的三种方式及配置的优先级说明】
【4月更文挑战第7天】Hive【基础知识 04】【Hive 属性配置的三种方式及配置的优先级说明】
111 0
|
6月前
|
SQL Java Shell
Hive【非交互式使用、三种参数配置方式】
Hive【非交互式使用、三种参数配置方式】
|
SQL 分布式计算 Hadoop
配置Hive使用Spark执行引擎
在Hive中,可以通过配置来指定使用不同的执行引擎。Hive执行引擎包括:默认MR、tez、spark。
272 0