Spark 3.x on YARN: Installation and Configuration
I. Extract
1. Extract the Spark package into /opt/module:
tar -zxvf /opt/software/spark-3.1.1-bin-hadoop3.2.tgz -C /opt/module/
2. Rename the directory (optional):
mv /opt/module/spark-3.1.1-bin-hadoop3.2/ /opt/module/spark-3.1.1-yarn
II. Configuration
1. Environment variables
vi /etc/profile
Append:
#SPARK_HOME
export SPARK_HOME=/opt/module/spark-3.1.1-yarn
export PATH=$PATH:$SPARK_HOME/bin
Apply the change: source /etc/profile
Then, from any directory (e.g. /opt), run: spark-submit --version
and check that the version banner is printed.
III. Task
Complete the on-YARN configuration.
- In $SPARK_HOME/conf, copy spark-defaults.conf.template to spark-defaults.conf:
cp spark-defaults.conf.template spark-defaults.conf
Append:
spark.eventLog.enabled true
spark.eventLog.dir hdfs://master:9000/directory
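The directory named by spark.eventLog.dir must already exist in HDFS, or the first submission will fail at startup. A minimal preparation step, assuming HDFS is up and master:9000 matches fs.defaultFS in core-site.xml:

```shell
# Create the event-log directory once, before the first job.
# hdfs://master:9000/directory is the value configured above; adjust to your cluster.
hdfs dfs -mkdir -p hdfs://master:9000/directory
```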
- In $SPARK_HOME/conf, copy spark-env.sh.template to spark-env.sh:
cp spark-env.sh.template spark-env.sh
Append:
export JAVA_HOME=/opt/module/jdk1.8.0_212
export HADOOP_CONF_DIR=/opt/module/hadoop-3.1.3/etc/hadoop
export YARN_CONF_DIR=/opt/module/hadoop-3.1.3/etc/hadoop
- In $SPARK_HOME/conf, copy workers.template to workers (this file is read by Spark's standalone start scripts; it is harmless, though unused, in YARN mode):
cp workers.template workers
Append:
master
slave1
slave2
- Edit Hadoop's yarn-site.xml to disable the NodeManager memory checks, which would otherwise kill Spark containers that briefly exceed their physical or virtual memory allowance (common on small test clusters):
vi /opt/module/hadoop-3.1.3/etc/hadoop/yarn-site.xml
Append:
<property>
<name>yarn.nodemanager.pmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
- Distribute:
Copy /etc/profile, the Spark installation, and yarn-site.xml to slave1 and slave2, then run source /etc/profile on each node:
scp -r /etc/profile root@slave1:/etc/profile
scp -r /etc/profile root@slave2:/etc/profile
scp -r /opt/module/spark-3.1.1-yarn/ root@slave1:/opt/module/
scp -r /opt/module/spark-3.1.1-yarn/ root@slave2:/opt/module/
scp -r /opt/module/hadoop-3.1.3/etc/hadoop/yarn-site.xml root@slave1:/opt/module/hadoop-3.1.3/etc/hadoop/
scp -r /opt/module/hadoop-3.1.3/etc/hadoop/yarn-site.xml root@slave2:/opt/module/hadoop-3.1.3/etc/hadoop/
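The six scp commands above can also be written as a loop (same host names; the copied /etc/profile takes effect on each slave at its next login, or after running source /etc/profile there):

```shell
# Push the profile, the Spark installation, and the patched yarn-site.xml
# to both slaves in one pass.
for host in slave1 slave2; do
  scp /etc/profile "root@${host}:/etc/profile"
  scp -r /opt/module/spark-3.1.1-yarn/ "root@${host}:/opt/module/"
  scp /opt/module/hadoop-3.1.3/etc/hadoop/yarn-site.xml "root@${host}:/opt/module/hadoop-3.1.3/etc/hadoop/"
done
```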
- Submit $SPARK_HOME/examples/jars/spark-examples_2.12-3.1.1.jar in spark-on-YARN mode; the main class is org.apache.spark.examples.SparkPi.
Run: spark-submit --master yarn --class org.apache.spark.examples.SparkPi $SPARK_HOME/examples/jars/spark-examples_2.12-3.1.1.jar
Result:
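In the default client deploy mode, SparkPi prints its estimate to the driver's stdout; one way to pick it out of the log noise (the value varies slightly between runs):

```shell
# Submit the example and show only the result line, e.g. "Pi is roughly 3.14...".
spark-submit --master yarn --class org.apache.spark.examples.SparkPi \
  "$SPARK_HOME/examples/jars/spark-examples_2.12-3.1.1.jar" 2>/dev/null | grep "Pi is roughly"
```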