Spark + OSS
Connecting Spark to OSS
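E-MapReduce currently provides MetaService, which lets users access OSS data sources from within the E-MapReduce environment without supplying an AccessKey (AK). The older approach of explicitly specifying the AK and Endpoint is still supported, but note that the OSS Endpoint should use the internal-network domain name. For the full list of endpoints, see OSS Endpoint.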
The following example shows how Spark reads data from OSS without an AK and writes the processed data back to OSS.
    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf().setAppName("Test OSS")
    val sc = new SparkContext(conf)

    // Read from OSS; no AK or Endpoint is needed when MetaService is available.
    val pathIn = "oss://bucket/path/to/read"
    val inputData = sc.textFile(pathIn)
    val cnt = inputData.count
    println(s"count: $cnt")

    // Process each line and write the result back to OSS.
    val outputPath = "oss://bucket/path/to/write"
    val outputData = inputData.map(e => s"$e has been processed.")
    outputData.saveAsTextFile(outputPath)
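For comparison, the older explicit style mentioned earlier could be sketched roughly as follows. This is only an illustration: the helper `OssUri.explicit` is hypothetical, and the URI layout `oss://<accessKeyId>:<accessKeySecret>@<bucket>.<endpoint>/<path>` is an assumption that is not stated on this page, so check it against the E-MapReduce documentation for your version before relying on it. The endpoint shown is an example of an internal-network domain name.

```scala
// Hypothetical helper (not part of any E-MapReduce or Spark API).
// ASSUMPTION: credentials and endpoint are embedded in the URI as
//   oss://<accessKeyId>:<accessKeySecret>@<bucket>.<endpoint>/<path>
object OssUri {
  def explicit(accessKeyId: String,
               accessKeySecret: String,
               bucket: String,
               endpoint: String,
               path: String): String = {
    // Drop a leading slash so the result never contains "//" before the path.
    val cleanPath = path.stripPrefix("/")
    s"oss://${accessKeyId}:${accessKeySecret}@${bucket}.${endpoint}/${cleanPath}"
  }
}

object OssUriDemo {
  def main(args: Array[String]): Unit = {
    // Placeholder credentials; use an internal-network endpoint inside E-MapReduce.
    val pathIn = OssUri.explicit(
      "myId", "mySecret", "bucket",
      "oss-cn-hangzhou-internal.aliyuncs.com", "/path/to/read")
    println(pathIn)
  }
}
```

The resulting string would be passed to `sc.textFile` in place of the plain `oss://bucket/path/to/read` path. Because this style places secrets in the path itself, the AK-free MetaService approach is usually the better choice when it is available.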
Appendix
For sample code, see: