MongoDB分组怎么用?
官网:https://docs.mongodb.com/manual/reference/operator/aggregation-pipeline/
不指定字段分组
案例1:
db.getCollection('6117Decartes').aggregate( [ { $group : { _id : null, "count":{$sum:1} } } ], { allowDiskUse: true } );
要注意的是,字段可被统计才行,否则会报错,看个例子。
代码 InstitutionID 是非统计字段
db.getCollection('6117Decartes').aggregate( [ { $group : { _id : null, "InstitutionID":"$InstitutionID" } } ], { allowDiskUse: true } ); 正确的写法是: db.getCollection('6117Decartes').aggregate( [ { $group : { _id : null, "InstitutionID":{$max:"$InstitutionID"} } } ], { allowDiskUse: true } );
否则报错如下:
案例2:
db.sales.insert([ { "_id" : 1, "item" : "abc", "price" : 10, "quantity" : 2, "date" : ISODate("2014-03-01T08:00:00Z") }, { "_id" : 2, "item" : "jkl", "price" : 20, "quantity" : 1, "date" : ISODate("2014-03-01T09:00:00Z") }, { "_id" : 3, "item" : "xyz", "price" : 5, "quantity" : 10, "date" : ISODate("2014-03-15T09:00:00Z") }, { "_id" : 4, "item" : "xyz", "price" : 5, "quantity" : 20, "date" : ISODate("2014-04-04T11:21:39.736Z") }, { "_id" : 5, "item" : "abc", "price" : 10, "quantity" : 10, "date" : ISODate("2014-04-04T21:23:13.331Z") } ]);
不指定数据分组
db.sales.aggregate( [ { $group : { _id : null, minPrice:{$min:"$price"}, //最小值 maxPrice:{$max:"$price"}, //最大值 avgPrice:{ $avg: "$price" }, //平均值 totalPrice: { $sum: { $multiply: [ "$price", "$quantity" ] } }, //总价 $sum(price * quantity) lastItem:{$last:"$item"}, //取最后一条 count: { $sum: 1 } // 总条目数 } } ] );
多字段分组
db.getCollection('6117Decartes').aggregate( [ { $group : { _id : {"InstitutionID":"$InstitutionID","logtype":"$logtype"}, "总条目数":{$sum:1} } } ], { allowDiskUse: true } );
案例2:
db.sales.aggregate( [ { $group : { _id : {"商品":"$item", "价格":"$price"}, "总条目数":{$sum:1} } } ] );
将 $push 聚合分组指定到列结构到数组中
db.books.insert([ { "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 }, { "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 }, { "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 }, { "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 }, { "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 } ]); db.books.aggregate( [ { $group : { _id : "$author", books: { $push: "$title" } } } ] );
可能出现的问题:分组内存使用超过限制时错误
{ "message" : "Exceeded memory limit for $group, but didn't allow external sort. Pass allowDiskUse:true to opt in.", "stack" : "MongoError: Exceeded memory limit for $group, but didn't allow external sort. Pass allowDiskUse:true to opt in." + "at queryCallback (/tmp/.mount_nosqlbO7RhZG/app/resources/app.asar/node_modules/mongodb-core/lib/cursor.js:223:25)" + "at /tmp/.mount_nosqlbO7RhZG/app/resources/app.asar/node_modules/mongodb-core/lib/connection/pool.js:541:18" + "at _combinedTickCallback (internal/process/next_tick.js:131:7)" + "at process._tickCallback (internal/process/next_tick.js:180:9)", "name" : "MongoError", "ok" : 0, "errmsg" : "Exceeded memory limit for $group, but didn't allow external sort. Pass allowDiskUse:true to opt in.", "code" : 16945, "codeName" : "Location16945" }
怎么解决?
- id 字段是必须要的,如果不指定字段进行分组则用 null,表示不分组的统计;
- 分组内存使用限制是100M,默认情况下如果超过了限制100M则会出现错误。如果想对超过100M的大数据进行处理,可以使用 allowDiskUse 选项来进行分组时写到磁盘临时文件中处理。
db.getCollection('6117Decartes').aggregate( [ { $group : { _id : {"InstitutionID":"$InstitutionID","logtype":"$logtype"}, "总条目数":{$sum:1} } } ], { allowDiskUse: true } );