mac os 下安装hadoop-2.7.3+hive-2.1.1+sqoop-1.99.3

本文涉及的产品
RDS MySQL Serverless 基础系列,0.5-2RCU 50GB
云数据库 RDS MySQL,集群系列 2核4GB
推荐场景:
搭建个人博客
RDS MySQL Serverless 高可用系列,价值2615元额度,1个月
简介: hadoop+hive+sqoop安装与使用

hadoop 安装

安装jdk

vim ~/.bash_profile

export JAVA_HOME="YOUR_JAVA_HOME"
export PATH=$PATH:$JAVA_HOME/bin

配置完成后,运行

java -version
--------------
java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)

ssh免密登入

ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

ssh localhost # 验证

配置hadoop

下载hadoop,解压到指定目录,这里是/opt
配置系统变量

vim ~/.bash_profile

export HADOOP_HOME=/opt/hadoop-2.7.3
export HADOOP_PREFIX=$HADOOP_HOME
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

修改 /etc/hadoop/hadoop-env.sh

export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_121.jdk/Contents/Home
export HADOOP_CONF_DIR=/opt/hadoop-2.7.3/etc/hadoop
export HADOOP_OPTS="$HADOOP_OPTS -Djava.security.krb5.realm=OX.AC.UK -Djava.security.krb5.kdc=kdc0.ox.ac.uk:kdc1.ox.ac.uk"

修改/etc/hadoop/core-site.xml

<property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
    <description>The name of the default file system.</description>
  </property>

  <property>
    <name>hadoop.tmp.dir</name>
    <value>/Users/micmiu/tmp/hadoop</value>
    <description>A base for other temporary directories.</description>
  </property>

  <property>
    <name>io.native.lib.available</name>
    <value>false</value>
    <description>default value is true:Should native hadoop libraries, if present, be used.</description>
  </property>

修改hdfs-site.xml

<property>
        <name>dfs.replication</name>
        <value>1</value>
        <!--如果是单节点配置为1,如果是集群根据实际集群数量配置 -->
</property>

修改yarn-site.xml

<property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>

    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

修改mapred-site.xml

cp mapred-site.xml.template mapred-site.xml.
<property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
        <final>true</final>
</property>

格式化namenode

hadoop namenode -format

启动hdfs和yarn

start-dfs.sh
start-yarn.sh

查看守护进程是否开启

jps

6917 DataNode
6838 NameNode
2810 Launcher
7130 ResourceManager
7019 SecondaryNameNode
7772 Jps
7215 NodeManager

wordcount示例

hdfs dfs -mkdir -p /user/jjzhu/wordcount/in
hdfs dfs -put xxxxx.txt /user/jjzhu/wordcount/in
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /user/jjzhu/wordcount/in /user/jjzhu/wordcount/out

运行过程

17/04/07 13:04:10 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
17/04/07 13:04:10 INFO input.FileInputFormat: Total input paths to process : 1
17/04/07 13:04:10 INFO mapreduce.JobSubmitter: number of splits:1
17/04/07 13:04:11 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1491532908338_0004
17/04/07 13:04:11 INFO impl.YarnClientImpl: Submitted application application_1491532908338_0004
17/04/07 13:04:11 INFO mapreduce.Job: The url to track the job: http://jjzhu:8088/proxy/application_1491532908338_0004/
17/04/07 13:04:11 INFO mapreduce.Job: Running job: job_1491532908338_0004
17/04/07 13:04:18 INFO mapreduce.Job: Job job_1491532908338_0004 running in uber mode : false
17/04/07 13:04:18 INFO mapreduce.Job:  map 0% reduce 0%
17/04/07 13:04:23 INFO mapreduce.Job:  map 100% reduce 0%
17/04/07 13:04:29 INFO mapreduce.Job:  map 100% reduce 100%
17/04/07 13:04:29 INFO mapreduce.Job: Job job_1491532908338_0004 completed successfully
17/04/07 13:04:29 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=1141
        FILE: Number of bytes written=239913
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=869
        HDFS: Number of bytes written=779
        HDFS: Number of read operations=6
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters 
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=2859
        Total time spent by all reduces in occupied slots (ms)=2527
        Total time spent by all map tasks (ms)=2859
        Total time spent by all reduce tasks (ms)=2527
        Total vcore-milliseconds taken by all map tasks=2859
        Total vcore-milliseconds taken by all reduce tasks=2527
        Total megabyte-milliseconds taken by all map tasks=2927616
        Total megabyte-milliseconds taken by all reduce tasks=2587648
    Map-Reduce Framework
        Map input records=1
        Map output records=118
        Map output bytes=1219
        Map output materialized bytes=1141
        Input split bytes=122
        Combine input records=118
        Combine output records=89
        Reduce input groups=89
        Reduce shuffle bytes=1141
        Reduce input records=89
        Reduce output records=89
        Spilled Records=178
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=103
        CPU time spent (ms)=0
        Physical memory (bytes) snapshot=0
        Virtual memory (bytes) snapshot=0
        Total committed heap usage (bytes)=329252864
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters 
        Bytes Read=747
    File Output Format Counters 
        Bytes Written=779

查看结果

hdfs dfs -ls /user/jjzhu/wordcount/out

-rw-r--r--   1 didi supergroup          0 2017-04-07 13:04 /user/jjzhu/wordcount/out/_SUCCESS
-rw-r--r--   1 didi supergroup        779 2017-04-07 13:04 /user/jjzhu/wordcount/out/part-r-00000

hdfs dfs -cat /user/jjzhu/wordcount/out/part-r-00000

A    1
Other    1
Others    1
Some    2
There    1
a    1
access    2
access);    1
according    1
adding    1
allowing    1
......

关闭hadoop

stop-hdfs.sh
stop-yarn.sh

安装hive

下载解压配置环境变量
export HIVE_HOME=/opt/hive-2.1.1
export PATH=$HIVE_HOME/bin:$PATH

配置hive

cd /opt/hive/conf
cp hive-env.sh.template hive-env.sh
cp hive-default.xml.template hive-site.xml

vim hive-env.sh
HADOOP_HOME=/opt/hadoop-2.7.3
export HIVE_CONF_DIR=/opt/hive-2.1.1/conf
export HIVE_AUX_JARS_PATH=/opt/hive-2.1.1//lib

下载mysql-connector-xx.xx.xx.jar 到lib下

vim hive-site.xml
将所有${system:java.io.tmpdir} 和 ${system:user.name}替换
并配置mysql数据库连接信息

<property>
 <name>javax.jdo.option.ConnectionURL</name>
 <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&amp;characterEncoding=UTF-8&amp;useSSL=false</value>
</property>
<property>
 <name>javax.jdo.option.ConnectionDriverName</name>
 <value>com.mysql.jdbc.Driver</value>
</property>
<property>
 <name>javax.jdo.option.ConnectionUserName</name>
 <value>root</value>
</property>
<property>
 <name>javax.jdo.option.ConnectionPassword</name>
 <value>123456</value>
</property>

为hive创建HDFS目录

hdfs dfs -mkdir -p  /usr/hive/warehouse

hdfs dfs -mkdir -p /usr/hive/tmp

hdfs dfs -mkdir -p /usr/hive/log

hdfs dfs -chmod -R  777 /usr/hive

初始化数据库

./bin/schematool -initSchema -dbType mysql

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| hive               |
| mysql              |
| performance_schema |
| sys                |
+--------------------+
mysql> use hive;
Database changed
mysql> show tables;
+---------------------------+
| Tables_in_hive            |
+---------------------------+
| AUX_TABLE                 |
| BUCKETING_COLS            |
| SORT_COLS                 |
| TABLE_PARAMS              |
| TAB_COL_STATS             |
| TBLS                      |
| TBL_COL_PRIVS             |
| TBL_PRIVS                 |
| TXNS                      |
| TXN_COMPONENTS            |
| TYPES                     |
| TYPE_FIELDS               |
| VERSION                   |
| WRITE_SET                 |
+---------------------------+

启动hive

jjzhu:opt didi$ hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hive-2.1.1/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Logging initialized using configuration in jar:file:/opt/hive-2.1.1/lib/hive-common-2.1.1.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive> 

安装sqoop

下载解压配置环境变量

export $SQOOP_HOME=/opt/sqoop-1.99.7
export SQOOP_SERVER_EXTRA_LIB=$SQOOP_HOME/extra  
export PATH=$SQOOP_HOME/bin:$PATH

修改sqoop配置

在conf目录下的两个主要配置文件sqoop.properties和sqoop_bootstrap.properties
主要修改sqoop.properties

org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/opt/hadoop-2.7.3/etc/hadoop  
  
org.apache.sqoop.security.authentication.type=SIMPLE  
org.apache.sqoop.security.authentication.handler=org.apache.sqoop.security.authentication.SimpleAuthenticationHandler  
org.apache.sqoop.security.authentication.anonymous=true  

验证配置是否有效

jzhu:bin didi$ ./sqoop2-tool verify
Setting conf dir: /opt/sqoop-1.99.7/bin/../conf
Sqoop home directory: /opt/sqoop-1.99.7
Sqoop tool executor:
    Version: 1.99.7
    Revision: 435d5e61b922a32d7bce567fe5fb1a9c0d9b1bbb
    Compiled on Tue Jul 19 16:08:27 PDT 2016 by abefine
Running tool: class org.apache.sqoop.tools.tool.VerifyTool
0    [main] INFO  org.apache.sqoop.core.SqoopServer  - Initializing Sqoop server.
12   [main] INFO  org.apache.sqoop.core.PropertiesConfigurationProvider  - Starting config file poller thread
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hive-2.1.1/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
Verification was successful.
Tool class org.apache.sqoop.tools.tool.VerifyTool has finished correctly.
jjzhu:bin didi$ 

开启服务器

./bin/sqoop2-server start

jps

9505 SqoopJettyServer
....
相关实践学习
如何在云端创建MySQL数据库
开始实验后,系统会自动创建一台自建MySQL的 源数据库 ECS 实例和一台 目标数据库 RDS。
全面了解阿里云能为你做什么
阿里云在全球各地部署高效节能的绿色数据中心,利用清洁计算为万物互联的新世界提供源源不断的能源动力,目前开服的区域包括中国(华北、华东、华南、香港)、新加坡、美国(美东、美西)、欧洲、中东、澳大利亚、日本。目前阿里云的产品涵盖弹性计算、数据库、存储与CDN、分析与搜索、云通信、网络、管理与监控、应用服务、互联网中间件、移动服务、视频服务等。通过本课程,来了解阿里云能够为你的业务带来哪些帮助 &nbsp; &nbsp; 相关的阿里云产品:云服务器ECS 云服务器 ECS(Elastic Compute Service)是一种弹性可伸缩的计算服务,助您降低 IT 成本,提升运维效率,使您更专注于核心业务创新。产品详情: https://www.aliyun.com/product/ecs
目录
相关文章
|
1月前
|
NoSQL 数据可视化 Redis
Mac安装Redis
Mac安装Redis
35 3
|
1月前
|
数据安全/隐私保护 iOS开发 MacOS
Mac安装Navicat Premium 16.3.5
Mac安装Navicat Premium 16.3.5
90 3
|
15天前
|
Windows
Windows操作系统部署安装Kerberos客户端
详细介绍了在Windows操作系统上部署安装Kerberos客户端的完整过程,包括下载安装包、安装步骤、自定义安装路径、修改环境变量、配置hosts文件和Kerberos配置文件,以及安装后的验证步骤。
29 3
Windows操作系统部署安装Kerberos客户端
|
16天前
|
Ubuntu 网络安全 开发工具
Ubuntu19.04的安装过程详解以及操作系统初始化配置
本文详细介绍了Ubuntu 19.04操作系统的安装过程、初始化配置、网络设置、软件源配置、SSH远程登录以及终端显示设置。
37 1
Ubuntu19.04的安装过程详解以及操作系统初始化配置
|
13天前
|
Web App开发 开发工具 Android开发
【Flutter】Flutter安装和配置(mac)
【Flutter】Flutter安装和配置(mac)
|
29天前
|
编解码 Linux 虚拟化
超详细VMware虚拟机安装Win10操作系统过程图解
这篇文章提供了一个详细的VMware虚拟机安装Windows 10操作系统的图解教程,包括了从创建虚拟机到安装操作系统的全过程,以及安装后的一些基本设置,如屏幕分辨率调整等。作者还提到了后续会分享关于磁盘分区的创建过程。
超详细VMware虚拟机安装Win10操作系统过程图解
|
17天前
|
Shell 数据安全/隐私保护
Mac上HomeBrew安装及换源教程
【8月更文挑战第30天】这是在 Mac 上安装及更换 Homebrew 源的教程。首先通过终端执行命令 `/bin/bash -c &quot;\$\(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh\)` 安装 Homebrew,并使用 `brew --version` 检查是否安装成功。接着可更换软件源以提高下载速度,例如设置中科大为源,并更新相关设置。这将有助于提升 Homebrew 的使用体验。
155 9
|
1月前
|
应用服务中间件 PHP nginx
Mac安装Nginx
Mac安装Nginx
22 2
Mac安装Nginx
|
1月前
|
Ubuntu 安全 iOS开发
Kylin操作系统安装及使用指南
Kylin操作系统安装及使用指南
|
1月前
|
网络安全 开发工具 git
Mac安装Git
Mac安装Git
23 2