CodeSample小助手 2019-12-30
可通过测试 teragen 和 terasort,来检测配置是否生效。
[hdfs@hdp-master ~]$ hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar teragen -Dmapred.map.tasks=100 10995116 oss://{bucket-name}/1G-input
18/10/28 21:32:38 INFO client.RMProxy: Connecting to ResourceManager at cdh-master/192.168.0.161:8050
18/10/28 21:32:38 INFO client.AHSProxy: Connecting to Application History server at cdh-master/192.168.0.161:10200
18/10/28 21:32:38 INFO aliyun.oss: [Server]Unable to execute HTTP request: Not Found
[ErrorCode]: NoSuchKey
[RequestId]: 5BD5BA7641FCE369BC1D052C
[HostId]: null
18/10/28 21:32:38 INFO aliyun.oss: [Server]Unable to execute HTTP request: Not Found
[ErrorCode]: NoSuchKey
[RequestId]: 5BD5BA7641FCE369BC1D052F
[HostId]: null
18/10/28 21:32:39 INFO terasort.TeraSort: Generating 10995116 using 100
18/10/28 21:32:39 INFO mapreduce.JobSubmitter: number of splits:100
18/10/28 21:32:39 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1540728986531_0005
18/10/28 21:32:39 INFO impl.YarnClientImpl: Submitted application application_1540728986531_0005
18/10/28 21:32:39 INFO mapreduce.Job: The url to track the job: http://cdh-master:8088/proxy/application_1540728986531_0005/
18/10/28 21:32:39 INFO mapreduce.Job: Running job: job_1540728986531_0005
18/10/28 21:32:49 INFO mapreduce.Job: Job job_1540728986531_0005 running in uber mode : false
18/10/28 21:32:49 INFO mapreduce.Job: map 0% reduce 0%
18/10/28 21:32:55 INFO mapreduce.Job: map 1% reduce 0%
18/10/28 21:32:57 INFO mapreduce.Job: map 2% reduce 0%
18/10/28 21:32:58 INFO mapreduce.Job: map 4% reduce 0%
...
18/10/28 21:34:40 INFO mapreduce.Job: map 99% reduce 0%
18/10/28 21:34:42 INFO mapreduce.Job: map 100% reduce 0%
18/10/28 21:35:15 INFO mapreduce.Job: Job job_1540728986531_0005 completed successfully
18/10/28 21:35:15 INFO mapreduce.Job: Counters: 36
...
[hdfs@hdp-master ~]$ hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar terasort -Dmapred.map.tasks=100 oss://{bucket-name}/1G-input oss://{bucket-name}/1G-output
18/10/28 21:39:00 INFO terasort.TeraSort: starting
...
18/10/28 21:39:02 INFO mapreduce.JobSubmitter: number of splits:100
18/10/28 21:39:02 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1540728986531_0006
18/10/28 21:39:02 INFO impl.YarnClientImpl: Submitted application application_1540728986531_0006
18/10/28 21:39:02 INFO mapreduce.Job: The url to track the job: http://cdh-master:8088/proxy/application_1540728986531_0006/
18/10/28 21:39:02 INFO mapreduce.Job: Running job: job_1540728986531_0006
18/10/28 21:39:09 INFO mapreduce.Job: Job job_1540728986531_0006 running in uber mode : false
18/10/28 21:39:09 INFO mapreduce.Job: map 0% reduce 0%
18/10/28 21:39:17 INFO mapreduce.Job: map 1% reduce 0%
18/10/28 21:39:19 INFO mapreduce.Job: map 2% reduce 0%
18/10/28 21:39:20 INFO mapreduce.Job: map 3% reduce 0%
...
18/10/28 21:42:50 INFO mapreduce.Job: map 100% reduce 75%
18/10/28 21:42:53 INFO mapreduce.Job: map 100% reduce 80%
18/10/28 21:42:56 INFO mapreduce.Job: map 100% reduce 86%
18/10/28 21:42:59 INFO mapreduce.Job: map 100% reduce 92%
18/10/28 21:43:02 INFO mapreduce.Job: map 100% reduce 98%
18/10/28 21:43:05 INFO mapreduce.Job: map 100% reduce 100%
^@18/10/28 21:43:56 INFO mapreduce.Job: Job job_1540728986531_0006 completed successfully
18/10/28 21:43:56 INFO mapreduce.Job: Counters: 54
...
测试成功,配置生效。
关于 Hadoop 更多内容,可参考:Hadoop 支持集成 OSS。
您也可以通过阿里云 EMR 访问 OSS。阿里云 EMR 基于开源生态,包括 Hadoop、Spark、Kafka、Flink、Storm 等组件,为您提供集群、作业、数据管理等服务的一站式企业大数据平台,并无缝支持 OSS。阿里云 EMR 与 OSS 紧密结合,针对开源生态访问 OSS,有多项技术优化,详情可参考 EMR 产品介绍。