Distributed installation (at least three hosts):
Required software:
CentOS7
hadoop-2.7.3.tar.gz
jdk-8u102-linux-x64.tar.gz

Preparation before installing:

  1. Edit the /etc/hosts file
    vim /etc/hosts
    Add:
    192.168.10.11 bigdata1
    192.168.10.12 bigdata2
    192.168.10.13 bigdata3
  2. Set up passwordless SSH login
    cd
    ssh-keygen -t rsa
    Press Enter at every prompt until it finishes

    ssh-copy-id -i ~/.ssh/id_rsa.pub bigdata1
    ssh-copy-id -i ~/.ssh/id_rsa.pub bigdata2
    ssh-copy-id -i ~/.ssh/id_rsa.pub bigdata3
  3. Synchronize the clocks
    Keep the hosts' clocks in sync with a cron job
    vim /etc/crontab
    0 0 * * * root ntpdate -s time.windows.com    (daily at midnight; a crontab entry needs all five time fields)

                Alternatively, deploy a dedicated NTP server for the cluster; that is beyond the scope of these notes.
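The per-node commands in steps 1–3 can be driven by a single loop. A minimal sketch, assuming the hostnames from /etc/hosts above; `DRY_RUN=1` (the default here) only prints each command, set `DRY_RUN=0` when running on the real hosts:

```shell
#!/bin/sh
# Push the public key to every node in one loop.
# DRY_RUN=1 prints the commands instead of executing them.
DRY_RUN=${DRY_RUN:-1}
NODES="bigdata1 bigdata2 bigdata3"
for node in $NODES; do
    cmd="ssh-copy-id -i $HOME/.ssh/id_rsa.pub $node"
    if [ "$DRY_RUN" = "1" ]; then echo "$cmd"; else $cmd; fi
done
```

The same loop skeleton works for any command that must run identically on all three hosts.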
    
    Configuration (all files are under hadoop-2.7.3/etc/hadoop/, edited on bigdata1):

        (*) hdfs-site.xml
    
            <!--Block replication factor; the default is 3-->
            <property>
              <name>dfs.replication</name>
              <value>2</value>
            </property>
    
            <!--Whether HDFS permission checking is enabled; default: true-->
            <!--
            <property>
              <name>dfs.permissions</name>
              <value>false</value>
            </property>
            -->
    
        core-site.xml
            <!--NameNode address-->
            <property>
              <name>fs.defaultFS</name>
              <value>hdfs://bigdata1:9000</value>
            </property> 
    
            <!--Directory where HDFS data is stored; the default is the Linux tmp directory-->
            <property>
              <name>hadoop.tmp.dir</name>
              <value>/root/training/hadoop-2.7.3/tmp</value>
            </property> 
    
        mapred-site.xml
            <!--MapReduce programs run on the Yarn container-->
            <property>
              <name>mapreduce.framework.name</name>
              <value>yarn</value>
            </property>     
    
        yarn-site.xml
            <!--ResourceManager address-->
            <property>
              <name>yarn.resourcemanager.hostname</name>
              <value>bigdata1</value>
            </property>     
    
            <!--How the NodeManager runs MR tasks-->
            <property>
              <name>yarn.nodemanager.aux-services</name>
              <value>mapreduce_shuffle</value>
            </property> 

        Also check, in the same directory:
        hadoop-env.sh: set JAVA_HOME to the JDK installation path
        slaves: one worker hostname per line (here bigdata2 and bigdata3), so that
                start-dfs.sh / start-yarn.sh know where to start DataNodes and NodeManagers
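The property blocks above can also be written from the shell with a here-document. A minimal sketch for core-site.xml; `HADOOP_CONF` defaults to a temp directory here, on the cluster it would be /root/training/hadoop-2.7.3/etc/hadoop:

```shell
#!/bin/sh
# Generate core-site.xml from the values used in these notes.
HADOOP_CONF=${HADOOP_CONF:-$(mktemp -d)}
cat > "$HADOOP_CONF/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!--NameNode address-->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://bigdata1:9000</value>
  </property>
  <!--Directory where HDFS data is stored-->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/root/training/hadoop-2.7.3/tmp</value>
  </property>
</configuration>
EOF
# Quick sanity check: two <property> blocks were written
grep -c '<property>' "$HADOOP_CONF/core-site.xml"
```

The same pattern applies to hdfs-site.xml, mapred-site.xml, and yarn-site.xml.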
    
        Format the NameNode (on bigdata1 only): hdfs namenode -format
                         Success log line: Storage directory /root/training/hadoop-2.7.3/tmp/dfs/name has been successfully formatted.
    
        scp -r /root/training/hadoop-2.7.3 bigdata2:/root/training/
        scp -r /root/training/hadoop-2.7.3 bigdata3:/root/training/
    
        Start the cluster: start-all.sh = start-dfs.sh + start-yarn.sh
    
                    Verify
    (*) Command line: hdfs dfsadmin -report
    (*) Web UIs: HDFS: http://192.168.10.11:50070/
                 Yarn: http://192.168.10.11:8088
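Another quick check is running `jps` on each node. A minimal sketch, assuming bigdata1 is the master and bigdata2/bigdata3 are workers (which daemons appear on each host depends on your slaves file):

```shell
#!/bin/sh
# Expected daemons after start-all.sh (typical layout):
#   bigdata1:           NameNode, SecondaryNameNode, ResourceManager
#   bigdata2, bigdata3: DataNode, NodeManager
for node in bigdata1 bigdata2 bigdata3; do
    echo "=== $node ==="
    ssh -o BatchMode=yes -o ConnectTimeout=2 "$node" jps 2>/dev/null \
        || echo "(no response from $node)"
done
```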
    
    (*) Demo: run a MapReduce example program
               example jar: /root/training/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar
               hadoop jar hadoop-mapreduce-examples-2.7.3.jar wordcount /input/data.txt /output/wc1204
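The demo end to end can be sketched as below. The sample text and its locally computed word counts are illustrative; the HDFS steps only run when the hadoop binaries are on PATH, and the paths match the example above:

```shell
#!/bin/sh
# Create a small input file and preview the expected word counts
# locally: tr splits on spaces, uniq -c tallies each word.
echo "hello hadoop hello yarn" > data.txt
tr ' ' '\n' < data.txt | sort | uniq -c

# On the cluster: upload the file, run wordcount, print the result.
if command -v hdfs >/dev/null 2>&1; then
    hdfs dfs -mkdir -p /input
    hdfs dfs -put -f data.txt /input/data.txt
    hadoop jar /root/training/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar \
        wordcount /input/data.txt /output/wc1204
    hdfs dfs -cat /output/wc1204/part-r-00000
fi
```

Note that the job fails if /output/wc1204 already exists in HDFS; pick a fresh output directory for each run.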