Installing hadoop-2.7.3 + hive-2.1.1 + sqoop-1.99.7 on macOS

Summary: installing and using Hadoop + Hive + Sqoop

Hadoop installation

Install the JDK

vim ~/.bash_profile

export JAVA_HOME="YOUR_JAVA_HOME"
export PATH=$PATH:$JAVA_HOME/bin

After configuring, reload the profile (source ~/.bash_profile) and check the Java version:

java -version
--------------
java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)

Passwordless SSH login

ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

ssh localhost # verify
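If ssh localhost still prompts for a password, two common causes on macOS are Remote Login being switched off and overly loose key-file permissions. A quick fix sketch:

sudo systemsetup -setremotelogin on   # enable the built-in sshd (same as System Preferences > Sharing > Remote Login)
chmod 700 ~/.ssh                      # sshd ignores keys whose permissions are too open
chmod 600 ~/.ssh/authorized_keys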

Configure Hadoop

Download Hadoop and unpack it to a directory of your choice (here /opt), then set the environment variables:

vim ~/.bash_profile

export HADOOP_HOME=/opt/hadoop-2.7.3
export HADOOP_PREFIX=$HADOOP_HOME
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
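Reload the profile and sanity-check that the Hadoop binaries are on the PATH:

source ~/.bash_profile
hadoop version   # should print Hadoop 2.7.3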

Edit etc/hadoop/hadoop-env.sh under the Hadoop install directory (the krb5 options below are the usual workaround for the "Unable to load realm info from SCDynamicStore" warning on macOS):

export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_121.jdk/Contents/Home
export HADOOP_CONF_DIR=/opt/hadoop-2.7.3/etc/hadoop
export HADOOP_OPTS="$HADOOP_OPTS -Djava.security.krb5.realm=OX.AC.UK -Djava.security.krb5.kdc=kdc0.ox.ac.uk:kdc1.ox.ac.uk"

Edit etc/hadoop/core-site.xml (adjust hadoop.tmp.dir below to a directory of your own):

<property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
    <description>The name of the default file system.</description>
</property>

<property>
    <name>hadoop.tmp.dir</name>
    <value>/Users/micmiu/tmp/hadoop</value>
    <description>A base for other temporary directories.</description>
</property>

<property>
    <name>io.native.lib.available</name>
    <value>false</value>
    <description>default value is true: should native hadoop libraries, if present, be used.</description>
</property>

Edit hdfs-site.xml:

<property>
    <name>dfs.replication</name>
    <value>1</value>
    <!-- 1 for a single node; for a cluster, set according to the actual number of nodes -->
</property>

Edit yarn-site.xml:

<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>

<property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

Edit mapred-site.xml (create it from the template first):

cp mapred-site.xml.template mapred-site.xml

<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
    <final>true</final>
</property>

Format the NameNode

hdfs namenode -format   # the older "hadoop namenode -format" also works but is deprecated in 2.x

Start HDFS and YARN

start-dfs.sh
start-yarn.sh
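Besides jps, the web UIs are a convenient health check (ports below are the Hadoop 2.x defaults):

open http://localhost:50070   # HDFS NameNode UI
open http://localhost:8088    # YARN ResourceManager UI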

Check that the daemons are running:

jps

6917 DataNode
6838 NameNode
2810 Launcher
7130 ResourceManager
7019 SecondaryNameNode
7772 Jps
7215 NodeManager

WordCount example

hdfs dfs -mkdir -p /user/jjzhu/wordcount/in
hdfs dfs -put xxxxx.txt /user/jjzhu/wordcount/in
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /user/jjzhu/wordcount/in /user/jjzhu/wordcount/out
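xxxxx.txt above stands in for whatever text file you want to count; to try the job end to end you can generate a small sample first (the file name here is illustrative):

echo "hello hadoop hello hive hello sqoop" > sample.txt
hdfs dfs -put sample.txt /user/jjzhu/wordcount/in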

Job output:

17/04/07 13:04:10 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
17/04/07 13:04:10 INFO input.FileInputFormat: Total input paths to process : 1
17/04/07 13:04:10 INFO mapreduce.JobSubmitter: number of splits:1
17/04/07 13:04:11 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1491532908338_0004
17/04/07 13:04:11 INFO impl.YarnClientImpl: Submitted application application_1491532908338_0004
17/04/07 13:04:11 INFO mapreduce.Job: The url to track the job: http://jjzhu:8088/proxy/application_1491532908338_0004/
17/04/07 13:04:11 INFO mapreduce.Job: Running job: job_1491532908338_0004
17/04/07 13:04:18 INFO mapreduce.Job: Job job_1491532908338_0004 running in uber mode : false
17/04/07 13:04:18 INFO mapreduce.Job:  map 0% reduce 0%
17/04/07 13:04:23 INFO mapreduce.Job:  map 100% reduce 0%
17/04/07 13:04:29 INFO mapreduce.Job:  map 100% reduce 100%
17/04/07 13:04:29 INFO mapreduce.Job: Job job_1491532908338_0004 completed successfully
17/04/07 13:04:29 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=1141
        FILE: Number of bytes written=239913
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=869
        HDFS: Number of bytes written=779
        HDFS: Number of read operations=6
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters 
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=2859
        Total time spent by all reduces in occupied slots (ms)=2527
        Total time spent by all map tasks (ms)=2859
        Total time spent by all reduce tasks (ms)=2527
        Total vcore-milliseconds taken by all map tasks=2859
        Total vcore-milliseconds taken by all reduce tasks=2527
        Total megabyte-milliseconds taken by all map tasks=2927616
        Total megabyte-milliseconds taken by all reduce tasks=2587648
    Map-Reduce Framework
        Map input records=1
        Map output records=118
        Map output bytes=1219
        Map output materialized bytes=1141
        Input split bytes=122
        Combine input records=118
        Combine output records=89
        Reduce input groups=89
        Reduce shuffle bytes=1141
        Reduce input records=89
        Reduce output records=89
        Spilled Records=178
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=103
        CPU time spent (ms)=0
        Physical memory (bytes) snapshot=0
        Virtual memory (bytes) snapshot=0
        Total committed heap usage (bytes)=329252864
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters 
        Bytes Read=747
    File Output Format Counters 
        Bytes Written=779

View the results:

hdfs dfs -ls /user/jjzhu/wordcount/out

-rw-r--r--   1 didi supergroup          0 2017-04-07 13:04 /user/jjzhu/wordcount/out/_SUCCESS
-rw-r--r--   1 didi supergroup        779 2017-04-07 13:04 /user/jjzhu/wordcount/out/part-r-00000

hdfs dfs -cat /user/jjzhu/wordcount/out/part-r-00000

A    1
Other    1
Others    1
Some    2
There    1
a    1
access    2
access);    1
according    1
adding    1
allowing    1
......

Stop Hadoop

stop-dfs.sh
stop-yarn.sh

Install Hive

Download the tarball, unpack it, and set the environment variables:
export HIVE_HOME=/opt/hive-2.1.1
export PATH=$HIVE_HOME/bin:$PATH

Configure Hive

cd /opt/hive-2.1.1/conf
cp hive-env.sh.template hive-env.sh
cp hive-default.xml.template hive-site.xml

vim hive-env.sh
HADOOP_HOME=/opt/hadoop-2.7.3
export HIVE_CONF_DIR=/opt/hive-2.1.1/conf
export HIVE_AUX_JARS_PATH=/opt/hive-2.1.1/lib

Download the MySQL JDBC driver (mysql-connector-java-xx.xx.xx.jar) into $HIVE_HOME/lib

vim hive-site.xml
Replace every occurrence of ${system:java.io.tmpdir} and ${system:user.name} with concrete values (a sed sketch follows the XML below), and configure the MySQL connection information:

<property>
 <name>javax.jdo.option.ConnectionURL</name>
 <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&amp;characterEncoding=UTF-8&amp;useSSL=false</value>
</property>
<property>
 <name>javax.jdo.option.ConnectionDriverName</name>
 <value>com.mysql.jdbc.Driver</value>
</property>
<property>
 <name>javax.jdo.option.ConnectionUserName</name>
 <value>root</value>
</property>
<property>
 <name>javax.jdo.option.ConnectionPassword</name>
 <value>123456</value>
</property>
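hive-site.xml references ${system:java.io.tmpdir} and ${system:user.name} in a number of properties. A minimal sketch of the substitution, assuming /tmp/hive as the scratch base and the current login user (macOS/BSD sed needs -i ''):

sed -i '' \
  -e "s#\${system:java.io.tmpdir}#/tmp/hive#g" \
  -e "s#\${system:user.name}#$(whoami)#g" hive-site.xml

Before initializing the schema it is also worth confirming that MySQL is running and the account configured above can log in:

mysql -uroot -p123456 -h localhost -e 'SELECT VERSION();'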

Create the HDFS directories for Hive (these back the warehouse, temp, and log paths referenced by your hive-site.xml):

hdfs dfs -mkdir -p /usr/hive/warehouse
hdfs dfs -mkdir -p /usr/hive/tmp
hdfs dfs -mkdir -p /usr/hive/log
hdfs dfs -chmod -R 777 /usr/hive

Initialize the metastore database

./bin/schematool -initSchema -dbType mysql
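If initialization succeeds, schematool can report the schema version it just recorded (a quick check, assuming the same -dbType):

./bin/schematool -info -dbType mysql

You can also inspect the result directly in MySQL: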

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| hive               |
| mysql              |
| performance_schema |
| sys                |
+--------------------+
mysql> use hive;
Database changed
mysql> show tables;
+---------------------------+
| Tables_in_hive            |
+---------------------------+
| AUX_TABLE                 |
| BUCKETING_COLS            |
| SORT_COLS                 |
| TABLE_PARAMS              |
| TAB_COL_STATS             |
| TBLS                      |
| TBL_COL_PRIVS             |
| TBL_PRIVS                 |
| TXNS                      |
| TXN_COMPONENTS            |
| TYPES                     |
| TYPE_FIELDS               |
| VERSION                   |
| WRITE_SET                 |
+---------------------------+

Start Hive

jjzhu:opt didi$ hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hive-2.1.1/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Logging initialized using configuration in jar:file:/opt/hive-2.1.1/lib/hive-common-2.1.1.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive> 
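For a quick non-interactive smoke test you can also run a statement straight from the shell:

hive -e 'show databases;'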

Install Sqoop

Download the tarball, unpack it, and set the environment variables:

export SQOOP_HOME=/opt/sqoop-1.99.7
export SQOOP_SERVER_EXTRA_LIB=$SQOOP_HOME/extra
export PATH=$SQOOP_HOME/bin:$PATH

Edit the Sqoop configuration

The two main configuration files live under conf/: sqoop.properties and sqoop_bootstrap.properties. Most changes go in sqoop.properties:

org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/opt/hadoop-2.7.3/etc/hadoop

org.apache.sqoop.security.authentication.type=SIMPLE
org.apache.sqoop.security.authentication.handler=org.apache.sqoop.security.authentication.SimpleAuthenticationHandler
org.apache.sqoop.security.authentication.anonymous=true
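Some 1.99.7 tarballs also ship sqoop.properties with @LOGDIR@ and @BASEDIR@ placeholders; if yours does, point them at real directories before starting the server (paths illustrative):

sed -i '' -e "s#@LOGDIR@#/opt/sqoop-1.99.7/logs#g" -e "s#@BASEDIR@#/opt/sqoop-1.99.7#g" conf/sqoop.properties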

Verify the configuration:

jjzhu:bin didi$ ./sqoop2-tool verify
Setting conf dir: /opt/sqoop-1.99.7/bin/../conf
Sqoop home directory: /opt/sqoop-1.99.7
Sqoop tool executor:
    Version: 1.99.7
    Revision: 435d5e61b922a32d7bce567fe5fb1a9c0d9b1bbb
    Compiled on Tue Jul 19 16:08:27 PDT 2016 by abefine
Running tool: class org.apache.sqoop.tools.tool.VerifyTool
0    [main] INFO  org.apache.sqoop.core.SqoopServer  - Initializing Sqoop server.
12   [main] INFO  org.apache.sqoop.core.PropertiesConfigurationProvider  - Starting config file poller thread
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hive-2.1.1/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
Verification was successful.
Tool class org.apache.sqoop.tools.tool.VerifyTool has finished correctly.
jjzhu:bin didi$ 

Start the server

./bin/sqoop2-server start

jps

9505 SqoopJettyServer
....
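With the server up you can connect from the bundled shell; a minimal session sketch (12000 is the default Sqoop 2 server port):

./bin/sqoop2-shell
sqoop:000> set server --host localhost --port 12000 --webapp sqoop
sqoop:000> show version --all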