GroupBy - collection processing

Introduction:

Iterator and Iterable provide most of the useful methods for working with collections. fold, map and filter are probably the most common, but other very useful methods include grouped/groupBy, sliding, find, forall, foreach, and many more. In this topic I want to cover Iterable's groupBy method.

This method is available in Scala 2.8 and later. It is similar to partition in that it allows the collection to be divided (or partitioned). partition takes a function which returns a Boolean and splits the collection in two depending on the result. groupBy takes a function that returns an object and returns a Map whose keys are the function's return values and whose values are the elements that produced each key. This allows an arbitrary number of partitions to be made from the collection.
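To make the difference concrete, here is a minimal sketch that applies both methods to the same list. The object name PartitionVsGroupBy and the value names are only illustrative and are not part of any library.

```scala
object PartitionVsGroupBy {
  val numbers: List[Int] = (1 to 10).toList

  // partition: the predicate returns a Boolean, so the result is always
  // exactly two collections (elements that matched, elements that did not).
  val (evens, odds) = numbers.partition(_ % 2 == 0)

  // groupBy: the function may return any key, so there is one group
  // per distinct key (three here: remainders 0, 1 and 2).
  val byRemainder: scala.collection.Map[Int, List[Int]] = numbers.groupBy(_ % 3)
}
```

partition hands back a tuple of two lists, while groupBy hands back a Map that can be queried by key, for example PartitionVsGroupBy.byRemainder(0).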

Here is the method signature:

def groupBy[K](f : (A) => K) : Map[K, This]

A bit of context is required to understand the three type parameters A, K and This. This method is defined in TraversableLike, a supertype of the collections (I will briefly discuss this in the next topic). TraversableLike takes two type parameters: the type of the collection and the type contained in the collection. Therefore, in this method definition, 'This' refers to the collection type (List, for example) and A refers to the contained type (perhaps Int). Finally, K refers to the type returned by the function; its values are the keys of the groups formed by the method.
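As a small sketch of how these type parameters show up in practice, the same grouping function can be applied to two different collection types. The object and value names below are made up for illustration. In both cases A is String and K is Int, and only 'This' changes.

```scala
object GroupTypes {
  // This = List[String], so each group is a List[String]
  val fromList: scala.collection.Map[Int, List[String]] =
    List("a", "bb", "cc", "ddd").groupBy(_.length)

  // This = Set[String], so each group is a Set[String]
  val fromSet: scala.collection.Map[Int, Set[String]] =
    Set("a", "bb", "cc", "ddd").groupBy(_.length)
}
```

The type annotations are only there to make the resulting types visible; the compiler infers them without help.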

scala> val groups = (1 to 20).toList groupBy {
         case i if (i < 5) => "g1"
         case i if (i < 10) => "g2"
         case i if (i < 15) => "g3"
         case _ => "g4"
       }

groups: scala.collection.Map[java.lang.String,List[Int]] = Map(g1 -> List(1, 2, 3, 4), 
g2 -> List(5, 6, 7, 8, 9), g3 -> List(10, 11, 12, 13, 14), g4 -> List(15, 16, 17, 18, 19, 20))

scala> val mods = (1 to 20).toList groupBy ( _ % 4 )

mods: scala.collection.Map[Int,List[Int]] = Map(1 -> List(1, 5, 9, 13, 17), 2 -> List(2, 6, 10, 14, 18), 
3 -> List(3, 7, 11, 15, 19), 0 -> List(4, 8, 12, 16, 20))