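Installing Hive, a Hadoop-Based Data Warehouse
1. Install Hive
1.1 Download the Hive release
Apache official site: https://www.apache.org/dyn/closer.cgi/hive/
Tsinghua University mirror: https://mirrors.tuna.tsinghua.edu.cn/apache/hive/
On Ubuntu, download the binary package with wget: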
wget https://mirrors.tuna.tsinghua.edu.cn/apache/hive/hive-3.1.3/apache-hive-3.1.3-bin.tar.gz
The download appeared to fail (slow network), so in the end I uploaded the archive with Xshell instead.
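If the download stalls as it did here, a resumable retry is one alternative to uploading the file by hand. The following is only a minimal sketch: `hive_url` and `fetch_hive` are hypothetical helper names, and the URL layout is assumed from the wget command above. `-c` resumes a partial file and `--tries` limits the number of retries.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.
MIRROR="https://mirrors.tuna.tsinghua.edu.cn/apache/hive"

# Build the download URL for a given Hive version
# (layout assumed from the wget command above).
hive_url() {
  printf '%s/hive-%s/apache-hive-%s-bin.tar.gz\n' "$MIRROR" "$1" "$1"
}

# Resume a partial download (-c) and retry up to 5 times.
fetch_hive() {
  wget -c --tries=5 "$(hive_url "$1")"
}

# Usage: fetch_hive 3.1.3
```

Nothing is downloaded until `fetch_hive` is called, so the URL can be checked first with `hive_url 3.1.3`.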
1.2 Extract and rename
sudo tar -zxvf ./apache-hive-3.1.3-bin.tar.gz -C /usr/local   # extract into /usr/local
cd /usr/local
sudo mv apache-hive-3.1.3-bin hive   # rename to hive
1.3 Change file ownership
sudo chown -R hadoop:hadoop /usr/local/hive
Note: hadoop:hadoop above is the owner and the group, in that order. If you are logged in to Linux as user_name, replace hadoop with user_name.
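1.4 Configure environment variables
For convenience, add the hive command to PATH. Open .bashrc in the vim editor:
vi ~/.bashrc
Add the following lines:
export HIVE_HOME=/usr/local/hive
export PATH=$PATH:$HIVE_HOME/bin
export HADOOP_HOME=/usr/local/hadoop
HADOOP_HOME must be set to the Hadoop installation path on your system; here Hadoop is installed in /usr/local/hadoop.
After saving and exiting, run the following command so that the settings take effect immediately:
source ~/.bashrc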
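Editing .bashrc by hand makes it easy to add the same export twice when the steps are repeated. As a minimal sketch, the lines could instead be appended only when missing; `add_line_once` is a hypothetical helper, not part of Hive or Hadoop.

```shell
#!/bin/sh
# Append a line to a file only if that exact line is not already present.
add_line_once() {
  file=$1
  line=$2
  touch "$file"
  # -x: match the whole line, -F: treat the pattern as a fixed string
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage (single quotes keep $PATH and $HIVE_HOME unexpanded in the file):
# add_line_once ~/.bashrc 'export HIVE_HOME=/usr/local/hive'
# add_line_once ~/.bashrc 'export PATH=$PATH:$HIVE_HOME/bin'
# add_line_once ~/.bashrc 'export HADOOP_HOME=/usr/local/hadoop'
```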
1.5 Configure hive-site.xml
Set up hive-site.xml under /usr/local/hive/conf. First run the following commands:
cd /usr/local/hive/conf
sudo mv hive-default.xml.template hive-default.xml
These rename hive-default.xml.template to hive-default.xml.
Then create a new configuration file, hive-site.xml, with the vim editor:
cd /usr/local/hive/conf
sudo vi hive-site.xml
Add the following configuration to hive-site.xml:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
    <description>username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hive</value>
    <description>password to use against metastore database</description>
  </property>
</configuration>
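A typo in hive-site.xml usually only shows up later as a metastore connection failure, so it can help to read a property back after saving. The following is a rough sketch under two assumptions: `get_hive_prop` is a hypothetical helper, and each `<name>` and `<value>` element sits on its own line. It is a line-based lookup, not a real XML parser.

```shell
#!/bin/sh
# Print the <value> that follows a given <name> in a Hadoop-style XML config.
# $1 = path to the config file, $2 = property name
get_hive_prop() {
  awk -v want="$2" '
    /<name>/ {
      n = $0
      sub(/.*<name>/, "", n); sub(/<\/name>.*/, "", n)
      name = n                      # remember the most recent property name
    }
    /<value>/ && name == want {
      v = $0
      sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
      print v
      exit
    }
  ' "$1"
}

# Usage:
# get_hive_prop /usr/local/hive/conf/hive-site.xml javax.jdo.option.ConnectionURL
```

If the property is absent, the function prints nothing.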
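2. Install and configure MySQL
Here we store Hive's metadata in a MySQL database rather than in the Derby database bundled with Hive.
For installing MySQL on Ubuntu, see: Installing MySQL on Ubuntu and Common Operations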
2.1 Download the MySQL JDBC package
Download page: https://dev.mysql.com/downloads/connector/j/
Then upload the archive to the server with Xshell.
2.2 Extract and copy
tar -zxvf mysql-connector-j-8.0.31.tar.gz
Copy mysql-connector-j-8.0.31.jar into the /usr/local/hive/lib directory:
cd 下载   # the Downloads directory (named 下载 on a Chinese-locale Ubuntu)
cd mysql-connector-j-8.0.31
sudo cp mysql-connector-j-8.0.31.jar /usr/local/hive/lib
2.3 Start MySQL and log in to the mysql shell
sudo service mysql start   # start the MySQL service
mysql -u root -p   # log in to the mysql shell
2.4 Create the hive database
create database hive;
This hive database corresponds to the hive in localhost:3306/hive in hive-site.xml and stores Hive's metadata.
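2.5 Allow hive to connect to MySQL
grant all on *.* to hive@localhost identified by 'hive';   # grant all privileges on all tables of all databases to the hive user; 'hive' is the connection password set in hive-site.xml
flush privileges;   # reload the MySQL privilege tables
The grant statement fails with an error. The likely cause is that MySQL 8.0 no longer accepts IDENTIFIED BY inside GRANT, so the user has to be created first. Reference blog post: grant all on . to hive@localhost identified by ‘hive’; ERROR 1064 (42000): You have an error in yo
Use the following statements instead:
create user 'hive'@'localhost' identified by 'hive';
grant all on *.* to 'hive'@'localhost';
flush privileges;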
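Granting on `*.*` gives the hive user rights over every database. As an alternative sketch (not what was done above), the grant could be limited to the metastore database. `hive_grant_sql` is a hypothetical helper that only prints SQL; it does no escaping, so it assumes simple user, password and database names.

```shell
#!/bin/sh
# Print SQL that creates a metastore user and grants rights on one database only.
# $1 = user, $2 = password, $3 = database (illustrative values; no escaping is done)
hive_grant_sql() {
  cat <<EOF
CREATE USER IF NOT EXISTS '$1'@'localhost' IDENTIFIED BY '$2';
GRANT ALL PRIVILEGES ON $3.* TO '$1'@'localhost';
FLUSH PRIVILEGES;
EOF
}

# Usage: hive_grant_sql hive hive hive | mysql -u root -p
```

Printing the statements first makes it possible to review them before piping them into the mysql client.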
2.6 Start hadoop
Before starting hive, start the hadoop cluster:
cd /usr/local/hadoop
./sbin/start-all.sh
jps   # list the Java processes (6 is normal)
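Counting processes by eye is easy to get wrong, so the jps output can be checked against a list of expected names. This is a sketch only: `missing_daemons` is a hypothetical helper, and the five daemon names are an assumption for a single-node setup (together with Jps itself they would account for the 6 processes mentioned above).

```shell
#!/bin/sh
# Read `jps` output on stdin and print each expected daemon that is absent.
# The expected list is an assumption for a single-node setup.
missing_daemons() {
  awk '
    BEGIN {
      n = split("NameNode DataNode SecondaryNameNode ResourceManager NodeManager", want, " ")
    }
    { seen[$2] = 1 }    # each jps line is "<pid> <name>"
    END {
      for (i = 1; i <= n; i++)
        if (!(want[i] in seen)) print want[i]
    }
  '
}

# Usage: jps | missing_daemons    (no output means all five were found)
```

Matching on the whole second field avoids counting SecondaryNameNode as NameNode.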
2.7 Start hive
cd /usr/local/hive
./bin/hive
If hive does not start, try initializing the metastore schema:
./bin/schematool -dbType mysql -initSchema
That still did not work! See Bug1.
2.8 Exit hive
exit;
3. Bug1 (resolved)
Reference blog post: Hive initialization error: Exception in thread “main“ java.lang.NoSuchMethodError: com.google.common.base.
Cause: Hadoop and Hive ship different versions of guava.jar.
Fix:
(1) Delete the guava.jar that ships with Hive:
cd /usr/local/hive/lib
sudo rm guava-19.0.jar
(2) Copy Hadoop's guava.jar into Hive:
cd /usr/local/hadoop/share/hadoop/common/lib   # go to Hadoop's lib directory
cp guava-27.0-jre.jar /usr/local/hive/lib   # copy it into Hive
(3) Initialize the Hive schema:
cd /usr/local/hive
./bin/schematool -dbType mysql -initSchema
(4) Start hive again:
cd /usr/local/hive
./bin/hive
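The jar names above (guava-19.0.jar and guava-27.0-jre.jar) depend on the installed Hive and Hadoop versions, so it is worth checking what is actually present before deleting anything. A minimal sketch, with `guava_version` as a hypothetical helper:

```shell
#!/bin/sh
# Print the version part of the first guava-<version>.jar found in a directory.
# Returns non-zero if the directory holds no guava jar.
guava_version() {
  for jar in "$1"/guava-*.jar; do
    [ -e "$jar" ] || continue       # the glob matched nothing
    name=${jar##*/}                 # strip the directory part
    name=${name#guava-}             # strip the "guava-" prefix
    printf '%s\n' "${name%.jar}"    # strip the ".jar" suffix
    return 0
  done
  return 1
}

# Usage:
# guava_version /usr/local/hive/lib
# guava_version /usr/local/hadoop/share/hadoop/common/lib
```

If the two commands print different versions, the mismatch described in this section applies.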
4. Bug2 (unresolved)
Starting either hadoop or hive reports the following error:
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
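SLF4J prints this message when it finds no logger binding on the classpath. As a first diagnostic step (not a fix), the SLF4J-related jars in the Hive and Hadoop directories can be listed. This is a sketch only: `list_slf4j_jars` is a hypothetical helper, and the two file-name patterns are assumptions about how such jars are typically named.

```shell
#!/bin/sh
# List SLF4J-related jars under the given directories (searched recursively).
# The name patterns are assumptions, not an exhaustive list.
list_slf4j_jars() {
  find "$@" -type f \( -name 'slf4j-*.jar' -o -name 'log4j-slf4j-impl-*.jar' \) 2>/dev/null | sort
}

# Usage:
# list_slf4j_jars /usr/local/hive/lib /usr/local/hadoop/share/hadoop
```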
5. Bug3 (unresolved)
Starting hive prints the following warning, repeated many times:
WARN DataNucleus.MetaData: Metadata has jdbc-type of null yet this is not valid. Ignored