Introduction
Spark has supported running on Kubernetes since version 2.3. This article describes how to run Spark on Alibaba Cloud Container Service for Kubernetes (ACK).
Prerequisites
1. An Alibaba Cloud Container Service for Kubernetes cluster has been created. Purchase link: Kubernetes console. The cluster type used in this example is Managed Kubernetes.
2. The Spark image has been built. The Dockerfile used to build the Spark image in this example is:
# Base image
FROM registry.cn-beijing.aliyuncs.com/acs/spark:v2.4.0
# Maintainer
LABEL maintainer "guangcheng.zgc@alibaba-inc.com"
# Copy the jar into the specified directory
COPY ./spark-examples-0.0.1-SNAPSHOT.jar /opt/spark/examples/jars/
The registry address of the built image is: registry.cn-beijing.aliyuncs.com/bill_test/image_test:v2.4.0
3. In this example, the Spark job is started by submitting a YAML file through the Kubernetes client.
The following sections describe each step in detail.
Building the Spark image
Building the Spark image requires Docker installed locally. This example covers installation on macOS. On a Mac, run:
brew cask install docker
(On newer Homebrew versions the equivalent command is brew install --cask docker.)
After installation completes, running the docker command should print output like the following, which indicates a successful installation:
Usage: docker [OPTIONS] COMMAND
A self-sufficient runtime for containers
Options:
--config string Location of client config files (default "/Users/bill.zhou/.docker")
-D, --debug Enable debug mode
-H, --host list Daemon socket(s) to connect to
-l, --log-level string Set the logging level ("debug"|"info"|"warn"|"error"|"fatal") (default "info")
--tls Use TLS; implied by --tlsverify
--tlscacert string Trust certs signed only by this CA (default "/Users/bill.zhou/.docker/ca.pem")
--tlscert string Path to TLS certificate file (default "/Users/bill.zhou/.docker/cert.pem")
--tlskey string Path to TLS key file (default "/Users/bill.zhou/.docker/key.pem")
--tlsverify Use TLS and verify the remote
-v, --version Print version information and quit
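Optionally, run a quick smoke test to confirm the Docker daemon works end to end (this pulls a small public test image and prints a greeting):
# Run Docker's hello-world test image
docker run hello-world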
Building a Docker image requires a Dockerfile. The Dockerfile for this example is created as follows.
# Enter the working directory:
cd /Users/bill.zhou/dockertest
# Copy the test jar into this directory:
cp /Users/jars/spark-examples-0.0.1-SNAPSHOT.jar ./
# Create the Dockerfile
vi Dockerfile
Enter the following content in the Dockerfile:
# Base image
FROM registry.cn-beijing.aliyuncs.com/acs/spark:v2.4.0
# Maintainer
LABEL maintainer "guangcheng.zgc@alibaba-inc.com"
# Copy the jar into the specified directory
COPY ./spark-examples-0.0.1-SNAPSHOT.jar /opt/spark/examples/jars/
This example builds on an existing base image, registry.cn-beijing.aliyuncs.com/acs/spark:v2.4.0, and then adds our own test jar, spark-examples-0.0.1-SNAPSHOT.jar.
Once the Dockerfile is written, build the image with the following command:
docker build /Users/bill.zhou/dockertest/ -t registry.cn-beijing.aliyuncs.com/bill_test/image_test:v2.4.0
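After the build finishes, you can confirm the image exists locally before pushing it:
# List local images and filter for the one just built
docker images | grep bill_test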
After the build completes, push the image to the registry:
# Log in with your Alibaba Cloud account first
docker login --username=zhouguangcheng007@aliyun.com registry.cn-beijing.aliyuncs.com
# Push the image
docker push registry.cn-beijing.aliyuncs.com/bill_test/image_test:v2.4.0
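To verify the push, you can pull the image back, for example from the ECS instance used in the next section (log in there first if the repository is private):
# Pull the image from the registry to confirm it was published
docker pull registry.cn-beijing.aliyuncs.com/bill_test/image_test:v2.4.0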
With the image built and pushed, you can start using the Spark image.
Submitting the job to the Kubernetes cluster
In this example, the YAML file is submitted to Kubernetes with the kubectl client.
First, purchase an ECS instance (in the same VPC as the Kubernetes cluster) and install the kubectl client on it. For installation instructions, see the kubectl installation guide.
After installation, configure the cluster credentials; you can then access the Kubernetes cluster. To obtain the credentials, open the cluster's "Basic Information" page in the console, as shown in the following figure:
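A typical way to configure the credentials is to copy the kubeconfig content from that page into ~/.kube/config and then verify connectivity (a minimal sketch; the actual kubeconfig content comes from your own cluster's console page):
# Save the cluster credentials where kubectl looks by default
mkdir -p ~/.kube
vi ~/.kube/config    # paste the kubeconfig from the console
# Verify that kubectl can reach the cluster
kubectl get nodes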
Then submit the Spark job with the following steps:
## Install the CRDs
kubectl apply -f manifest/spark-operator-crds.yaml
## Install the operator's service account and RBAC policy
kubectl apply -f manifest/spark-operator-rbac.yaml
## Install the service account and RBAC policy for Spark jobs
kubectl apply -f manifest/spark-rbac.yaml
## Install spark-on-k8s-operator
kubectl apply -f manifest/spark-operator.yaml
## Submit the spark-pi job
kubectl apply -f examples/spark-pi.yaml
The latest versions of these files can be downloaded from the open source spark-on-k8s-operator project.
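Before looking at the raw pod logs, you can inspect the job's status through the CRD itself (the spark-pi name and spark-operator namespace match this example's manifests):
# Show the status and events of the submitted SparkApplication
kubectl describe sparkapplication spark-pi -n spark-operator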
After the job runs, you can check the logs with the following commands:
# List pods; -n specifies the namespace
kubectl get pod -n spark-operator
# View the driver pod's logs
kubectl logs spark-pi-driver -n spark-operator
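If the job is still running, you can stream the driver log instead:
# Follow the driver log as it is written
kubectl logs -f spark-pi-driver -n spark-operator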
Output like the following indicates a successful run:
2019-07-23 11:55:54 INFO SparkContext:54 - Created broadcast 0 from broadcast at DAGScheduler.scala:1161
2019-07-23 11:55:54 INFO DAGScheduler:54 - Submitting 2 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:17) (first 15 tasks are for partitions Vector(0, 1))
2019-07-23 11:55:54 INFO TaskSchedulerImpl:54 - Adding task set 0.0 with 2 tasks
2019-07-23 11:55:55 INFO TaskSetManager:54 - Starting task 0.0 in stage 0.0 (TID 0, 172.20.1.9, executor 1, partition 0, PROCESS_LOCAL, 7878 bytes)
2019-07-23 11:55:55 INFO TaskSetManager:54 - Starting task 1.0 in stage 0.0 (TID 1, 172.20.1.9, executor 1, partition 1, PROCESS_LOCAL, 7878 bytes)
2019-07-23 11:55:57 INFO BlockManagerInfo:54 - Added broadcast_0_piece0 in memory on 172.20.1.9:36662 (size: 1256.0 B, free: 117.0 MB)
2019-07-23 11:55:57 INFO TaskSetManager:54 - Finished task 1.0 in stage 0.0 (TID 1) in 2493 ms on 172.20.1.9 (executor 1) (1/2)
2019-07-23 11:55:57 INFO TaskSetManager:54 - Finished task 0.0 in stage 0.0 (TID 0) in 2789 ms on 172.20.1.9 (executor 1) (2/2)
2019-07-23 11:55:57 INFO TaskSchedulerImpl:54 - Removed TaskSet 0.0, whose tasks have all completed, from pool
2019-07-23 11:55:58 INFO DAGScheduler:54 - ResultStage 0 (reduce at SparkPi.scala:21) finished in 5.393 s
2019-07-23 11:55:58 INFO DAGScheduler:54 - Job 0 finished: reduce at SparkPi.scala:21, took 6.501405 s
**Pi is roughly 3.142955714778574**
2019-07-23 11:55:58 INFO AbstractConnector:318 - Stopped Spark@49096b06{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2019-07-23 11:55:58 INFO SparkUI:54 - Stopped Spark web UI at http://spark-test-1563882878789-driver-svc.spark-operator-t01.svc:4040
2019-07-23 11:55:58 INFO KubernetesClusterSchedulerBackend:54 - Shutting down all executors
2019-07-23 11:55:58 INFO KubernetesClusterSchedulerBackend$KubernetesDriverEndpoint:54 - Asking each executor to shut down
2019-07-23 11:55:58 WARN ExecutorPodsWatchSnapshotSource:87 - Kubernetes client has been closed (this is expected if the application is shutting down.)
2019-07-23 11:55:59 INFO MapOutputTrackerMasterEndpoint:54 - MapOutputTrackerMasterEndpoint stopped!
2019-07-23 11:55:59 INFO MemoryStore:54 - MemoryStore cleared
2019-07-23 11:55:59 INFO BlockManager:54 - BlockManager stopped
2019-07-23 11:55:59 INFO BlockManagerMaster:54 - BlockManagerMaster stopped
2019-07-23 11:55:59 INFO OutputCommitCoordinator$
Finally, let's look at the key parts of the spark-pi.yaml file.
apiVersion: "sparkoperator.k8s.io/v1beta1"
kind: SparkApplication
metadata:
  name: spark-pi
  namespace: spark-operator
spec:
  type: Scala
  mode: cluster
  image: "registry.cn-beijing.aliyuncs.com/acs/spark:v2.4.0"  # Registry path of the image to run.
  imagePullPolicy: Always
  mainClass: org.apache.spark.examples.SparkPi  # Entry class to run.
  mainApplicationFile: "local:///opt/spark/examples/jars/spark-examples_2.11-2.4.0.jar"  # Jar containing the entry class; this path is inside the image.
  sparkVersion: "2.4.0"
  restartPolicy:
    type: Never
  volumes:
    - name: "test-volume"
      hostPath:
        path: "/tmp"
        type: Directory
  # Resources for the Spark driver
  driver:
    cores: 0.1
    coreLimit: "200m"
    memory: "512m"
    labels:
      version: 2.4.0
    serviceAccount: spark
    volumeMounts:
      - name: "test-volume"
        mountPath: "/tmp"
  # Resources for the executors
  executor:
    cores: 1
    instances: 1
    memory: "512m"
    labels:
      version: 2.4.0
    volumeMounts:
      - name: "test-volume"
        mountPath: "/tmp"
The YAML file is the standard format for submitting resources to Kubernetes; this one defines the image to run and the resources for the Spark driver and executors.
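To run the custom image built earlier instead of the stock example, point the image, mainClass, and mainApplicationFile fields at your own registry path and jar. A minimal sketch (org.example.SparkTest is a hypothetical entry class; substitute the actual main class inside your jar):
  image: "registry.cn-beijing.aliyuncs.com/bill_test/image_test:v2.4.0"  # the image pushed earlier
  mainClass: org.example.SparkTest  # hypothetical: replace with your jar's main class
  mainApplicationFile: "local:///opt/spark/examples/jars/spark-examples-0.0.1-SNAPSHOT.jar"  # jar path inside the image, per the Dockerfile COPY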
Summary
For an introduction to Container Service for Kubernetes, see: Container Service for Kubernetes (ACK).
For more on Spark on Kubernetes, see:
Spark in action on Kubernetes - Playground setup and architecture overview
Spark in action on Kubernetes - How the Spark Operator works