MongoDB · Feature · In-place update in MongoDB


There is a great new feature in the release note of MongoDB 3.5.12.

Faster In-place Updates in WiredTiger

This work brings improvements to in-place update workloads for users running the WiredTiger engine, especially for updates to large documents. Some workloads may see a reduction of up to 7x in disk utilization (from 24 MB/s to 3 MB/s) as well as a 20% improvement in throughput.

I assumed WiredTiger had implemented the delta-page feature introduced in the Bw-tree paper, that is, writing pages as deltas against previously written pages. After reading the source code, however, I found it is a totally different idea: in-place update only affects the in-memory and journal formats, and the on-disk layout of the data is unchanged.

I will explain the core of the in-place update implementation.

MongoDB introduced mutable BSON to describe a document update as an incremental (delta) update.

Mutable BSON provides classes to facilitate the manipulation of existing BSON objects or the construction of new BSON objects from scratch in an incremental fashion.

Suppose you have a very large document, say 1 MB:

{
   _id: ObjectId("59097118be4a61d87415cd15"),
   name: "ZhangYoudong",
   birthday: "xxxx",
   fightvalue: 100,
   xxx: .... // many other fields
}

If fightvalue is changed from 100 to 101, you can use a DamageEvent to describe the update: it just tells you the offset, size, and content (kept in another array) of the change.

struct DamageEvent {
    typedef uint32_t OffsetSizeType;
    // Offset of source data (in some buffer held elsewhere).
    OffsetSizeType sourceOffset;

    // Offset of target data (in some buffer held elsewhere).
    OffsetSizeType targetOffset;

    // Size of the damage region.
    size_t size;
};
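To make the three fields concrete, here is a minimal sketch of how a list of DamageEvents could be applied to a document buffer. The helper applyDamages is hypothetical (it is not a MongoDB function), and the document is modeled as a plain byte string rather than real BSON.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Same shape as the DamageEvent struct above.
struct DamageEvent {
    typedef uint32_t OffsetSizeType;
    OffsetSizeType sourceOffset;  // where the new bytes start in damageSource
    OffsetSizeType targetOffset;  // where they land in the document buffer
    size_t size;                  // number of bytes to overwrite
};

// Hypothetical helper (an assumption for illustration): copy each damaged
// region from damageSource into the document buffer, in order.
void applyDamages(std::string& doc, const char* damageSource,
                  const std::vector<DamageEvent>& damages) {
    for (const DamageEvent& d : damages) {
        // The damaged region must lie inside the existing document.
        assert(d.targetOffset + d.size <= doc.size());
        std::memcpy(&doc[d.targetOffset], damageSource + d.sourceOffset, d.size);
    }
}
```

Note that each event overwrites bytes in place, so the document size does not change; only the damaged regions and their offsets need to be carried around.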

So if you make many small changes to a document, you end up with an array of DamageEvents (a DamageVector). MongoDB added a new storage interface to support applying a DamageVector to a record.

bool WiredTigerRecordStore::updateWithDamagesSupported() const {
    return true;
}

StatusWith<RecordData> WiredTigerRecordStore::updateWithDamages(
    OperationContext* opCtx,
    const RecordId& id,
    const RecordData& oldRec,
    const char* damageSource,
    const mutablebson::DamageVector& damages) {
    // Body omitted in this excerpt.
}

WiredTiger added a new update type called WT_UPDATE_MODIFIED to support this. When a WT_UPDATE_MODIFIED update happens, WiredTiger first logs a change list, converted from the DamageVector, to the journal, and then keeps the change list in memory, associated with the original record.

When the record is read, WiredTiger first reads the original record, then applies every operation in the change list, and returns the final record to the client.
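The read path described above can be sketched roughly as follows. This is a simplified model, not WiredTiger's actual data structures: ModifyEntry, Record, and readRecord are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of one entry in the in-memory change list:
// replace `size` bytes at `offset` with `data`.
struct ModifyEntry {
    size_t offset;
    size_t size;
    std::string data;
};

// Hypothetical record: the original value plus the modifications
// made since, oldest first.
struct Record {
    std::string base;
    std::vector<ModifyEntry> changes;
};

// Rebuild the current value: start from the original record and
// apply every change in order. The stored base is left untouched.
std::string readRecord(const Record& rec) {
    std::string value = rec.base;
    for (const ModifyEntry& m : rec.changes) {
        assert(m.offset + m.size <= value.size());
        value.replace(m.offset, m.size, m.data);
    }
    return value;
}
```

In this model a write only appends a small entry to the change list, while a read pays the cost of replaying the list on top of the original record.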

So the core of in-place update is:

  1. WiredTiger supports delta updates in memory and in the journal, so the IO of writing the journal is greatly reduced for large documents.
  2. WiredTiger's on-disk data layout is unchanged, so the IO of writing data is not changed.
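
To get a feel for the first point, here is a back-of-the-envelope cost model. It is an assumption made for illustration and does not reflect WiredTiger's real log record format; in particular, headerBytes is an assumed per-change overhead covering the offset and size fields.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical cost model: a full-document log entry carries the
// whole new document.
size_t fullUpdateLogBytes(size_t docSize) {
    return docSize;
}

// A delta log entry carries, for each change, an assumed fixed header
// (offset and size) plus the changed bytes themselves.
size_t deltaLogBytes(const std::vector<size_t>& changeSizes, size_t headerBytes) {
    size_t total = 0;
    for (size_t s : changeSizes) {
        total += headerBytes + s;
    }
    return total;
}
```

Under these assumptions, changing 4 bytes of a 1 MB document with a 16-byte header costs 20 bytes of journal instead of 1,048,576.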