Implement a word-frequency count over the following List:
val list = List("hadoop spark hive ",""," hue spark hadoop hadoop","hue hive hive hive","spark hadoop hadoop")
Count how often each word occurs and sort the words by count in descending order. The expected result is:
hadoop-5 hive-4 spark-3 hue-2
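```scala
val list = List("hadoop spark hive ", "", " hue spark hadoop hadoop", "hue hive hive hive", "spark hadoop hadoop")

// Earlier attempt with a mutable Map, left commented out:
// var m = Map[String, Int]()
// readLine.trim.split(" ").foreach(i => if (m.contains(i)) m += (i -> (m(i) + 1)) else m += (i -> 1))
// val sorted = m.toSeq.sortWith(_._2 > _._2)
// sorted.foreach(println)

// Overall approach: 1. split, 2. group, 3. sort
val result = list
  .flatMap(x => x.split(" ")                // 1. split each string on spaces and flatten into one List
    .filter(x => x.trim.length != 0))       // 2. drop the empty strings left by blank entries and leading/trailing spaces
  .groupBy(x => x)                          // 3. group identical words together
  .map { case (word, group) => (word, group.size) } // 4. replace each group with its size
  .toList                                   // 5. convert the Map to a List
  .sortBy(-_._2)                            // 6. sort by count, descending

// 7. print each entry as word-count.
// The original chained .foreach onto the pipeline and assigned it to a val,
// so the val held Unit and the final println printed "()".
result.foreach { case (word, count) => println(s"$word-$count") }
```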
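The commented-out `Map`-based attempt can also be written without a mutable variable. The following is a minimal sketch of that idea using `foldLeft`; the object name `WordCountFold` and the helper `wordCounts` are hypothetical names chosen for illustration, and splitting on `\\s+` plus breaking ties alphabetically are assumptions that go beyond the original exercise.

```scala
object WordCountFold {
  // Hypothetical helper: counts words by folding the flattened word list into an immutable Map.
  def wordCounts(lines: List[String]): List[(String, Int)] =
    lines
      .flatMap(_.split("\\s+"))                 // split on runs of whitespace (assumption: any whitespace separates words)
      .filter(_.nonEmpty)                       // drop the empty tokens produced by blank or space-prefixed strings
      .foldLeft(Map.empty[String, Int]) { (acc, word) =>
        acc.updated(word, acc.getOrElse(word, 0) + 1) // increment the running count for this word
      }
      .toList
      .sortBy { case (word, count) => (-count, word) } // count descending, then word ascending for ties

  def main(args: Array[String]): Unit = {
    val list = List("hadoop spark hive ", "", " hue spark hadoop hadoop", "hue hive hive hive", "spark hadoop hadoop")
    // Join the entries with spaces to match the single-line format of the expected result.
    println(wordCounts(list).map { case (w, c) => s"$w-$c" }.mkString(" "))
  }
}
```

Running `main` prints `hadoop-5 hive-4 spark-3 hue-2`. Compared with `groupBy`, the fold never builds the intermediate lists of repeated words, which may matter for larger inputs.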