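Study notes for the Developer Academy course "Big Data Real-Time Computing Framework Spark Quick Start: Spark textFile and Sorting-2". The notes follow the course closely so that readers can pick up the material quickly.
Course URL: https://developer.aliyun.com/learning/course/100/detail/1695
Spark textFile and Sorting-2
Contents:
I. Related code
II. Using Browse Directory
I. Related code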
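package com.shsxt.study.core;

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.PairFunction;

import scala.Tuple2;

public class GroupTopN {

    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("GroupTopN").setMaster("local");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Each line of score.txt is assumed to be "<key> <score>" separated by one space.
        JavaRDD<String> lines = sc.textFile("score.txt");
        JavaPairRDD<String, Integer> pairs = lines
                .mapToPair(new PairFunction<String, String, Integer>() {

                    private static final long serialVersionUID = 1L;

                    @Override
                    public Tuple2<String, Integer> call(String line)
                            throws Exception {
                        String[] arr = line.split(" ");
                        return new Tuple2<String, Integer>(arr[0], Integer
                                .valueOf(arr[1]));
                    }
                });
        // Gather all scores that share the same key.
        JavaPairRDD<String, Iterable<Integer>> groupedPairs = pairs
                .groupByKey();
        JavaPairRDD<String, Iterable<Integer>> top2score = groupedPairs
                .mapToPair(new PairFunction<Tuple2<String, Iterable<Integer>>, String, Iterable<Integer>>() {

                    private static final long serialVersionUID = 1L;

                    @Override
                    public Tuple2<String, Iterable<Integer>> call(
                            Tuple2<String, Iterable<Integer>> tuple)
                            throws Exception {
                        // Copy the grouped scores into a list so they can be sorted.
                        List<Integer> list = new ArrayList<Integer>();
                        Iterable<Integer> scores = tuple._2;
                        Iterator<Integer> it = scores.iterator();
                        while (it.hasNext()) {
                            Integer score = it.next();
                            list.add(score);
                        }
                        // Sort in descending order; compareTo avoids the overflow risk of -(o1 - o2).
                        Collections.sort(list, new Comparator<Integer>() {
                            @Override
                            public int compare(Integer o1, Integer o2) {
                                return o2.compareTo(o1);
                            }
                        });
                        // Keep at most two scores. Copying the subList view into a new
                        // ArrayList keeps the result serializable, and Math.min guards
                        // against keys that have fewer than two scores.
                        list = new ArrayList<Integer>(
                                list.subList(0, Math.min(2, list.size())));
                        return new Tuple2<String, Iterable<Integer>>(tuple._1,
                                list);
                    }
                });

        // The course listing stops at the return statement above. The lines below are an
        // assumed minimal ending: print each key with its top scores, then stop the context.
        for (Tuple2<String, Iterable<Integer>> t : top2score.collect()) {
            System.out.println(t._1 + ": " + t._2);
        }
        sc.close();
    }
}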
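
The group-then-take-top-2 logic in the listing above can be tried without a Spark installation. The following is a minimal sketch in plain Java, not the course's code: the class name `TopNSketch`, the method `topN`, and the sample lines are all hypothetical, and the input format ("key score" separated by a single space) is an assumption about what score.txt contains.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; mirrors mapToPair -> groupByKey -> sort -> subList without Spark.
class TopNSketch {

    // Groups "key score" lines by key and keeps the n highest scores for each key.
    static Map<String, List<Integer>> topN(List<String> lines, int n) {
        // Step 1: split each line and group the scores by key (the groupByKey equivalent).
        Map<String, List<Integer>> grouped = new LinkedHashMap<>();
        for (String line : lines) {
            String[] arr = line.split(" ");
            grouped.computeIfAbsent(arr[0], k -> new ArrayList<>())
                   .add(Integer.valueOf(arr[1]));
        }
        // Step 2: sort each group in descending order and keep at most n entries.
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            List<Integer> scores = new ArrayList<>(e.getValue());
            scores.sort(Collections.reverseOrder());
            int end = Math.min(n, scores.size());
            result.put(e.getKey(), new ArrayList<>(scores.subList(0, end)));
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for score.txt.
        List<String> lines = Arrays.asList(
                "class1 90", "class2 56", "class1 87",
                "class1 76", "class2 88", "class3 70");
        // prints {class1=[90, 87], class2=[88, 56], class3=[70]}
        System.out.println(topN(lines, 2));
    }
}
```

Note that `class3` has only one score, which is why the sketch bounds the sublist with `Math.min`; a fixed `subList(0, 2)` would throw an `IndexOutOfBoundsException` for such a key.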
II. Using Browse Directory