Hive的安装模式有三种:
- 嵌入模式
- 本地模式
- 远程模式
(1) 嵌入模式
基于系统本身的数据库derby数据库进行存储元数据,该模式是默认安装方式,配置简单
缺点: 一次只能连接一个客户端,仅适合在测试环境内使用
(2)本地模式
采用外部数据库进行存储元数据,该模式下不需要单独开启MetaStore服务
(3)远程模式
采用外部数据库进行存储元数据,该模式下需要单独开启MetaStore服务
嵌入模式
- 创建存储hive的磁盘目录
- 上传hive的二进制版的安装包
- 解压缩
- 进入hive\bin目录内,执行hive
- 测试
- 查看所有数据库
# 创建数据库 show databases; # 创建数据库 create database book_hive ;
- 测试元数据存储是否同步
[root@hadoop1 ~]# cd /usr/local/hive/ [root@hadoop1 hive]# ll 总用量 88732 drwxr-xr-x. 8 root root 159 4月 23 15:14 apache-hive-1.2.2-bin -rw-r--r--. 1 root root 90859180 6月 6 2019 apache-hive-1.2.2-bin.tar.gz [root@hadoop1 hive]# cd apache-hive-1.2.2-bin/ [root@hadoop1 apache-hive-1.2.2-bin]# ll 总用量 60 drwxr-xr-x. 4 root root 156 4月 23 15:15 bin drwxr-xr-x. 2 root root 212 4月 23 15:14 conf drwxr-xr-x. 4 root root 34 4月 23 15:14 examples drwxr-xr-x. 7 root root 68 4月 23 15:14 hcatalog drwxr-xr-x. 4 root root 8192 4月 23 15:14 lib -rw-r--r--. 1 root root 24754 3月 31 2017 LICENSE -rw-r--r--. 1 root root 397 3月 31 2017 NOTICE -rw-r--r--. 1 root root 4374 4月 1 2017 README.txt -rw-r--r--. 1 root root 4255 4月 1 2017 RELEASE_NOTES.txt drwxr-xr-x. 3 root root 23 4月 23 15:14 scripts # 重点在第20行 运行的命令 [root@hadoop1 apache-hive-1.2.2-bin]# bin/hive Logging initialized using configuration in jar:file:/usr/local/hive/apache-hive-1.2.2-bin/lib/hive-common-1.2.2.jar!/hive-log4j.properties hive> show databases; OK default Time taken: 0.972 seconds, Fetched: 1 row(s) hive>
[root@hadoop1 bin]# ./hive Logging initialized using configuration in jar:file:/usr/local/hive/apache-hive-1.2.2-bin/lib/hive-common-1.2.2.jar!/hive-log4j.properties hive> hive> show databases; OK book_hive default Time taken: 0.617 seconds, Fetched: 2 row(s) hive>
本地模式
- 首先需要安装MySQL数据库
- 离线安装
- 基于rpm可执行文件进行安装
- 基于tar.gz解压缩文件进行安装
- 在线安装
- 基于yum实现在线安装更新
- 基于wget获取MySQL的Repo
采用yum实现在线安装更新
# 1. 获取下载mysql5.7的repo [root@hadoop1 local]# wget http://repo.mysql.com/mysql57-community-release-el7-10.noarch.rpm [root@hadoop1 mysql]# ll 总用量 28 -rw-r--r--. 1 root root 25548 4月 7 2017 mysql57-community-release-el7-10.noarch.rpm # 2. 安装mysql5.7的可执行文件 [root@hadoop1 local]# rpm -ivh mysql57-community-release-el7-10.noarch.rpm [root@hadoop1 local]# rm /etc/pki/rpm-gpg/RPM-GPG-KEY-mysql [root@hadoop1 local]# rpm --import https://repo.mysql.com/RPM-GPG-KEY-mysql-2022 [root@hadoop1 local]# yum install mysql-server
[root@hadoop1 mysql]# mysql ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' # 原因是没有输入用户名和密码 关键也不知道初始密码是什么? #解决方案: #方式一: 查询初始化密码 #方式二: 跳过权限认证 # 采用第二个方案 [root@hadoop1 mysql]# vi /etc/my.cnf # 在【mysqld】的空白处,添加如下三个单词:skip-grant-tables # 示例: [mysqld] # # Remove leading # and set to the amount of RAM for the most important data # cache in MySQL. Start at 70% of total RAM for dedicated server, else 10%. # innodb_buffer_pool_size = 128M # # Remove leading # to turn on a very important data integrity option: logging # changes to the binary log between backups. # log_bin # # Remove leading # to set options mainly useful for reporting servers. # The server defaults are faster for transactions and fast SELECTs. # Adjust sizes as needed, experiment to find the optimal values. # join_buffer_size = 128M # sort_buffer_size = 2M # read_rnd_buffer_size = 2M datadir=/var/lib/mysql socket=/var/lib/mysql/mysql.sock ######## 跳过MySQL的权限机制 就不需要输入密码 进去 更改密码 再将本命令删除 或 注释 skip-grant-tables #保存退出! #重启MySQL服务 [root@hadoop1 mysql]# systemctl restart mysqld #进入MySQL软件 [root@hadoop1 etc]# mysql Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 2 Server version: 5.7.37 MySQL Community Server (GPL) Copyright (c) 2000, 2022, Oracle and/or its affiliates. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. # 1. 查询所有数据库 mysql> show databases; +--------------------+ | Database | +--------------------+ | information_schema | | mysql | | performance_schema | | sys | +--------------------+ 4 rows in set (0.01 sec) # 2.切入到MySQL数据库内: mysql> use mysql Reading table information for completion of table and column names You can turn off this feature to get a quicker startup with -A Database changed # 3.查询所有数据表 mysql> show tables; +---------------------------+ | Tables_in_mysql | +---------------------------+ | columns_priv | | db | | engine_cost | | event | | func | | general_log | | gtid_executed | | help_category | | help_keyword | | help_relation | | help_topic | | innodb_index_stats | | innodb_table_stats | | ndb_binlog_index | | plugin | | proc | | procs_priv | | proxies_priv | | server_cost | | servers | | slave_master_info | | slave_relay_log_info | | slave_worker_info | | slow_log | | tables_priv | | time_zone | | time_zone_leap_second | | time_zone_name | | time_zone_transition | | time_zone_transition_type | | user | +---------------------------+ 31 rows in set (0.00 sec) # 4. 查询user表的表结构 mysql> desc user; +------------------------+-----------------------------------+------+-----+-----------------------+-------+ | Field | Type | Null | Key | Default | Extra | +------------------------+-----------------------------------+------+-----+-----------------------+-------+ | Host | char(60) | NO | PRI | | | | User | char(32) | NO | PRI | | | | Select_priv | enum('N','Y') | NO | | N | | | Insert_priv | enum('N','Y') | NO | | N | | | Update_priv | enum('N','Y') | NO | | N | | | Delete_priv | enum('N','Y') | NO | | N | | | Create_priv | enum('N','Y') | NO | | N | | | Drop_priv | enum('N','Y') | NO | | N | | | Reload_priv | enum('N','Y') | NO | | N | | | Shutdown_priv | enum('N','Y') | NO | | N | | | Process_priv | enum('N','Y') | NO | | N | | | File_priv | enum('N','Y') | NO | | N | | | Grant_priv | enum('N','Y') | NO | | N | | | References_priv | enum('N','Y') | NO | | N | | | Index_priv | enum('N','Y') | NO | | N | | | Alter_priv | enum('N','Y') | NO | | N | | | Show_db_priv | enum('N','Y') | NO | | N | | | Super_priv | enum('N','Y') | NO | | N | | | Create_tmp_table_priv | enum('N','Y') | NO | | N | | | Lock_tables_priv | enum('N','Y') | NO | | N | | | Execute_priv | enum('N','Y') | NO | | N | | | Repl_slave_priv | enum('N','Y') | NO | | N | | | Repl_client_priv | enum('N','Y') | NO | | N | | | Create_view_priv | enum('N','Y') | NO | | N | | | Show_view_priv | enum('N','Y') | NO | | N | | | Create_routine_priv | enum('N','Y') | NO | | N | | | Alter_routine_priv | enum('N','Y') | NO | | N | | | Create_user_priv | enum('N','Y') | NO | | N | | | Event_priv | enum('N','Y') | NO | | N | | | Trigger_priv | enum('N','Y') | NO | | N | | | Create_tablespace_priv | enum('N','Y') | NO | | N | | | ssl_type | enum('','ANY','X509','SPECIFIED') | NO | | | | | ssl_cipher | blob | NO | | NULL | | | x509_issuer | blob | NO | | NULL | | | x509_subject | blob | NO | | NULL | | | max_questions | int(11) unsigned | NO | | 0 | | | max_updates | int(11) unsigned | NO | | 0 | | | max_connections | int(11) unsigned | NO | | 0 | | | max_user_connections | int(11) unsigned | NO | | 0 | | | plugin | char(64) | NO | | mysql_native_password | | | authentication_string | text | YES | | NULL | | | password_expired | enum('N','Y') | NO | | N | | | password_last_changed | timestamp | YES | | NULL | | | password_lifetime | smallint(5) unsigned | YES | | NULL | | | account_locked | enum('N','Y') | NO | | N | | +------------------------+-----------------------------------+------+-----+-----------------------+-------+ 45 rows in set (0.00 sec) # 通过查看表结构发现没有传统的Password字段,是authentication_string字段代表Password mysql> update mysql.user set authentication_string = password('sorry') where user = 'root'; Query OK, 1 row affected, 1 warning (0.00 sec) Rows matched: 1 Changed: 1 Warnings: 1 # 修改密码成功后 ,需要删除MySQL登录的权限验证 [root@hadoop1 ~]# vi /etc/my.cnf # skip-grant-tables # 跳过MySQL的权限机制 就不需要输入密码 进去 更改密码 再将本命令删除 或 注释 #设置密码不过期 default_password_lifetime=0 [root@hadoop1 ~]# systemctl restart mysqld
再次登录MySQL
[root@hadoop1 etc]# mysql ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: NO) [root@hadoop1 etc]# mysql -uroot -psorry mysql: [Warning] Using a password on the command line interface can be insecure. Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 3 Server version: 5.7.37 Copyright (c) 2000, 2022, Oracle and/or its affiliates. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. # 1. 修改全局验证策略 mysql> set global validate_password_policy=0; # 2. 修改全局验证密码的最小长度 mysql> set global validate_password_length=4; # 3. 修改USER 表中的 USER 函数 密码设置为 sorry mysql> ALTER USER USER() IDENTIFIED BY 'sorry';
Hive配置
- 修改hive-env.sh文件
[root@hadoop1 apache-hive-1.2.2-bin]# ll 总用量 84 drwxr-xr-x. 4 root root 156 4月 23 15:15 bin drwxr-xr-x. 2 root root 212 4月 23 15:14 conf -rw-r--r--. 1 root root 21108 4月 23 15:20 derby.log drwxr-xr-x. 4 root root 34 4月 23 15:14 examples drwxr-xr-x. 7 root root 68 4月 23 15:14 hcatalog drwxr-xr-x. 4 root root 8192 4月 23 15:14 lib -rw-r--r--. 1 root root 24754 3月 31 2017 LICENSE drwxr-xr-x. 5 root root 133 4月 23 15:20 metastore_db -rw-r--r--. 1 root root 397 3月 31 2017 NOTICE -rw-r--r--. 1 root root 4374 4月 1 2017 README.txt -rw-r--r--. 1 root root 4255 4月 1 2017 RELEASE_NOTES.txt drwxr-xr-x. 3 root root 23 4月 23 15:14 scripts [root@hadoop1 apache-hive-1.2.2-bin]# cd conf/ [root@hadoop1 conf]# ll 总用量 188 -rw-r--r--. 1 root root 1139 3月 31 2017 beeline-log4j.properties.template -rw-r--r--. 1 root root 168715 4月 1 2017 hive-default.xml.template -rw-r--r--. 1 root root 2378 3月 31 2017 hive-env.sh.template -rw-r--r--. 1 root root 2662 3月 31 2017 hive-exec-log4j.properties.template -rw-r--r--. 1 root root 3050 3月 31 2017 hive-log4j.properties.template -rw-r--r--. 1 root root 1593 4月 1 2017 ivysettings.xml [root@hadoop1 conf]# # 将 hive-env.sh.template 文件 拷贝一份 并更改名称为hive-env.sh [root@hadoop1 conf]# cp hive-env.sh.template hive-env.sh [root@hadoop1 conf]# ll 总用量 192 -rw-r--r--. 1 root root 1139 3月 31 2017 beeline-log4j.properties.template -rw-r--r--. 1 root root 168715 4月 1 2017 hive-default.xml.template -rw-r--r--. 1 root root 2378 4月 23 17:18 hive-env.sh -rw-r--r--. 1 root root 2378 3月 31 2017 hive-env.sh.template -rw-r--r--. 1 root root 2662 3月 31 2017 hive-exec-log4j.properties.template -rw-r--r--. 1 root root 3050 3月 31 2017 hive-log4j.properties.template -rw-r--r--. 1 root root 1593 4月 1 2017 ivysettings.xml
- 打开hive-env.sh文件,修改hadoop的环节变量
[root@hadoop1 conf]# vim hive-env.sh # Licensed to the Apache Software Foundation (ASF) under one # or more contributor license agreements. See the NOTICE file # distributed with this work for additional information # regarding copyright ownership. The ASF licenses this file # to you under the Apache License, Version 2.0 (the # "License"); you may not use this file except in compliance # with the License. You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. # Set Hive and Hadoop environment variables here. These variables can be used # to control the execution of Hive. It should be used by admins to configure # the Hive installation (so that users do not have to set environment variables # or set command line parameters to get correct behavior). # # The hive service being invoked (CLI/HWI etc.) is available via the environment # variable SERVICE # Hive Client memory usage can be an issue if a large number of clients # are running at the same time. The flags below have been useful in # reducing memory usage: # # if [ "$SERVICE" = "cli" ]; then # if [ -z "$DEBUG" ]; then # export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=12 -Xms10m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:+UseParNewGC -XX:-UseGCOverheadLimit" # else # export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=12 -Xms10m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:-UseGCOverheadLimit" # fi # fi # The heap size of the jvm stared by hive shell script can be controlled via: # # export HADOOP_HEAPSIZE=1024 # # Larger heap size may be required when running queries over large number of files or partitions. # By default hive shell scripts use a heap size of 256 (MB). Larger heap size would also be # appropriate for hive server (hwi etc). # Set HADOOP_HOME to point to a specific hadoop install directory # HADOOP_HOME=${bin}/../../hadoop # 添加Hadoop的环境变量 export HADOOP_HOME=/usr/local/hadoop/hadoop-2.7.4 #Hive Configuration Directory can be controlled by: # export HIVE_CONF_DIR= # Folder containing extra ibraries required for hive compilation/execution can be controlled by: # export HIVE_AUX_JARS_PATH=
- 创建hive-site.xml文件,添加相关配置
[root@hadoop1 conf]# pwd /usr/local/hive/apache-hive-1.2.2-bin/conf [root@hadoop1 conf]# ll 总用量 192 -rw-r--r--. 1 root root 1139 3月 31 2017 beeline-log4j.properties.template -rw-r--r--. 1 root root 168715 4月 1 2017 hive-default.xml.template -rw-r--r--. 1 root root 2458 4月 23 17:22 hive-env.sh -rw-r--r--. 1 root root 2378 3月 31 2017 hive-env.sh.template -rw-r--r--. 1 root root 2662 3月 31 2017 hive-exec-log4j.properties.template -rw-r--r--. 1 root root 3050 3月 31 2017 hive-log4j.properties.template -rw-r--r--. 1 root root 1593 4月 1 2017 ivysettings.xml # 通过查询,发现hive-site.xml文件不存在!
- hive-site.xml
<?xml version="1.0" encoding="UTF-8"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> <configuration> <!--MySQL链接协议--> <property> <name>javax.jdo.option.ConnectionURL</name> <value>jdbc:mysql://127.0.0.1:3306/hive?createDatabaseIFNotExist=true</value> </property> <!--MySQL驱动 jar需要单独上传--> <property> <name>javax.jdo.option.ConnectionDriverName</name> <value>com.mysql.jdbc.Driver</value> </property> <!--MySQL用户名--> <property> <name>javax.jdo.option.ConnectionUserName</name> <value>root</value> </property> <!--MySQL密码--> <property> <name>javax.jdo.option.ConnectionPassword</name> <value>sorry</value> </property> </configuration>
杨校老师课题之Hive数据仓库搭建2:https://developer.aliyun.com/article/1543267