Hadoop HDFS configuration (version 2.7)
hadoop-env.sh
export JAVA_HOME=/home/java/jdk1.8.0_45
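As an optional sanity check, confirm the path actually points at a working JDK:
$JAVA_HOME/bin/java -version   # should report 1.8.0_45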
hdfs-site.xml
<property>
  <name>dfs.nameservices</name>
  <value>guanjian</value>
</property>
<property>
  <name>dfs.ha.namenodes.guanjian</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.guanjian.nn1</name>
  <value>host1:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.guanjian.nn2</name>
  <value>host2:8020</value>
</property>
<property>
  <name>dfs.namenode.http-address.guanjian.nn1</name>
  <value>host1:50070</value>
</property>
<property>
  <name>dfs.namenode.http-address.guanjian.nn2</name>
  <value>host2:50070</value>
</property>
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://host1:8485;host2:8485/guanjian</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.guanjian</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/root/.ssh/id_dsa</value>
</property>
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/opt/jn/data</value>
</property>
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
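After copying hdfs-site.xml to every node, a quick way to confirm the HA settings are being picked up is hdfs getconf, run from the bin directory (expected output in the comments):
./hdfs getconf -confKey dfs.nameservices            # guanjian
./hdfs getconf -confKey dfs.ha.namenodes.guanjian   # nn1,nn2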
core-site.xml
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://guanjian</value>
</property>
<property>
  <name>ha.zookeeper.quorum</name>
  <value>192.168.5.129:2181</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/opt/hadoop2</value>
</property>
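Because fs.defaultFS points at the logical nameservice rather than a single host, clients never need to know which NameNode is currently active. Once the cluster is up, these two commands are equivalent:
./hdfs dfs -ls hdfs://guanjian/
./hdfs dfs -ls /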
slaves
host1
host2
In /etc/hosts, host1 and host2 are mapped as follows:
192.168.5.129 host1
192.168.5.182 host2
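A quick check that the names resolve as intended:
getent hosts host1   # 192.168.5.129
getent hosts host2   # 192.168.5.182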
Manually create the two directories:
mkdir -p /opt/jn/data
mkdir /opt/hadoop2
Start the JournalNode, from the sbin directory:
./hadoop-daemon.sh start journalnode
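Run this on every JournalNode host, then verify the daemon is actually up before formatting anything:
jps | grep JournalNode   # one JournalNode process per host, listening on 8485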
Format the NameNode, from the bin directory:
./hdfs namenode -format
Start the NameNode on the same machine, from sbin:
./hadoop-daemon.sh start namenode
On the machine that was not formatted, from bin:
./hdfs namenode -bootstrapStandby
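-bootstrapStandby copies the freshly formatted namespace (fsimage) from the first NameNode instead of formatting again. Assuming the default dfs.namenode.name.dir under hadoop.tmp.dir, the copied metadata should appear at:
ls /opt/hadoop2/dfs/name/current   # fsimage_*, seen_txid, VERSION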
Stop all the DFS daemons, from sbin:
./stop-dfs.sh
Format the ZKFC state in ZooKeeper, from bin:
./hdfs zkfc -formatZK
Connect with the ZooKeeper CLI and check:
WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper, hadoop-ha, guanjian]
We can see that a hadoop-ha znode has been added.
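Listing it should show one child per HA nameservice, here guanjian:
[zk: localhost:2181(CONNECTED) 1] ls /hadoop-ha
[guanjian]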
Start all of HDFS in one go, from sbin:
./start-dfs.sh
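Besides the web UIs below, the active/standby roles can be checked from the shell, in bin (which NameNode wins the initial election may vary):
./hdfs haadmin -getServiceState nn1   # standby
./hdfs haadmin -getServiceState nn2   # active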
Visit 192.168.5.182:50070 (active)
Visit 192.168.5.129:50070 (standby)
Create a directory, from bin:
./hdfs dfs -mkdir -p /usr/file
Upload a file, from bin:
./hdfs dfs -put /home/soft/jdk-8u45-linux-x64.tar.gz /usr/file
Clicking jdk-XXX.tar.gz in the NameNode web UI shows that it has 2 blocks (one block is 128 MB).
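fsck confirms the block layout from the shell, in bin (the tarball is larger than one 128 MB block, so it is split into two):
./hdfs fsck /usr/file/jdk-8u45-linux-x64.tar.gz -files -blocks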
Spark configuration (version 2.2.0)
spark-env.sh
export JAVA_HOME=/home/java/jdk1.8.0_45
export SPARK_MASTER_HOST=192.168.5.182
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=192.168.5.129:2181 -Dspark.deploy.zookeeper.dir=/spark"
export SPARK_MASTER_PORT=7077
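With recoveryMode=ZOOKEEPER, jobs should be submitted with both masters in the URL so the driver can follow a failover. A sketch of a submission, from bin (SparkPi and the examples jar ship with the 2.2.0 binary distribution):
./spark-submit --master spark://host1:7077,host2:7077 \
  --class org.apache.spark.examples.SparkPi \
  ../examples/jars/spark-examples_2.11-2.2.0.jar 100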
slaves
host1
host2
Change the master web UI port by editing start-master.sh, under sbin:
start-master.sh
if [ "$SPARK_MASTER_WEBUI_PORT" = "" ]; then
SPARK_MASTER_WEBUI_PORT=8091 //原始端口8080,容易与其他冲突
fi
Start everything on one of the nodes, e.g. on host2, from sbin:
./start-all.sh
On the other node, host1, start just the master, from sbin:
./start-master.sh
host2: ALIVE
host1: STANDBY
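To exercise the failover, stop the ALIVE master on host2 and watch host1 take over (the switch takes a few seconds, depending on the ZooKeeper session timeout):
./stop-master.sh   # on host2, in sbin; then reload host1's web UI: STANDBY -> ALIVE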
Check in ZooKeeper again; a spark znode has been added:
WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper, spark, hadoop-ha, guanjian]