GroupBy - collection processing

Introduction:

Iterator and Iterable provide most of the methods you will reach for when working with collections. fold, map and filter are probably the most common, but other very useful methods include grouped/groupBy, sliding, find, forall, foreach, and many more. This topic covers Iterable's groupBy method.

groupBy is available in Scala 2.8 and later. It is similar to partition in that it divides (partitions) a collection. partition takes a function that returns a Boolean and splits the collection in two depending on the result. groupBy takes a function that returns an object and returns a Map in which each key is a value returned by the function and each value is the collection of elements that produced that key. This allows the collection to be split into an arbitrary number of partitions.
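To make the contrast concrete, here is a minimal sketch of both methods applied to the same list. The object name PartitionVsGroupBy and the sample data are only illustrative and not part of the original example.

```scala
object PartitionVsGroupBy {
  val numbers = (1 to 10).toList

  // partition: a Boolean predicate always yields exactly two collections
  // (elements that satisfy the predicate, then elements that do not)
  val (evens, odds) = numbers.partition(_ % 2 == 0)

  // groupBy: the key function may return any value, so the number of
  // groups depends on how many distinct keys it produces (three here)
  val byRemainder = numbers.groupBy(_ % 3)
}
```

With partition the result is always a pair; with groupBy the result is a Map that can be queried by key, for example PartitionVsGroupBy.byRemainder(0).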

Here is the method signature:

def groupBy[K](f : (A) => K) : Map[K, This]

A bit of context is required to understand the three types A, K and This. The method is defined in TraversableLike, a trait that the collections inherit from (I will briefly discuss this in the next topic). TraversableLike takes two type parameters: the type of the collection and the type contained in the collection. Therefore, in this method definition, 'This' refers to the collection type (List, for example) and A refers to the contained type (perhaps Int). Finally, K is the type returned by the function; its values become the keys of the groups formed by the method.
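The following sketch shows how those types line up in practice. It assumes a released Scala version (2.8 or later) in which the result is an immutable Map, so the explicit annotations compile; the object name GroupByTypes and the sample strings are illustrative.

```scala
object GroupByTypes {
  // A = String, K = Int, collection type = List  =>  Map[Int, List[String]]
  val byLength: Map[Int, List[String]] =
    List("a", "bb", "cc", "ddd").groupBy(_.length)

  // Same key function on a Set: the groups are Sets, not Lists
  val setByLength: Map[Int, Set[String]] =
    Set("a", "bb", "cc", "ddd").groupBy(_.length)
}
```

The key type follows the function's return type, while the type of each group follows the collection that groupBy was called on.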

scala> val groups = (1 to 20).toList groupBy {
         case i if (i < 5) => "g1"
         case i if (i < 10) => "g2"
         case i if (i < 15) => "g3"
         case _ => "g4"
       }

groups: scala.collection.Map[java.lang.String,List[Int]] = Map(g1 -> List(1, 2, 3, 4), 
g2 -> List(5, 6, 7, 8, 9), g3 -> List(10, 11, 12, 13, 14), g4 -> List(15, 16, 17, 18, 19, 20))

scala> val mods = (1 to 20).toList groupBy ( _ % 4 )

mods: scala.collection.Map[Int,List[Int]] = Map(1 -> List(1, 5, 9, 13, 17), 2 -> List(2, 6, 10, 14, 18), 
3 -> List(3, 7, 11, 15, 19), 0 -> List(4, 8, 12, 16, 20))
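Once the groups exist, they can be used like any other Map. The sketch below rebuilds the modulo grouping from the example above and shows a few common lookups; the object name UsingGroups and the derived values are illustrative additions.

```scala
object UsingGroups {
  val mods = (1 to 20).toList.groupBy(_ % 4)

  // apply looks up a group by key (throws if the key is absent)
  val multiplesOfFour = mods(0)

  // get returns an Option, which is safer for keys that may not exist
  val missing = mods.get(7)

  // Size of each group, keyed the same way as the original map
  val sizes = mods.map { case (key, group) => key -> group.size }
}
```

Elements keep their original relative order inside each group, so multiplesOfFour lists 4, 8, 12, 16, 20 in that order.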