High-availability MongoDB Cluster Configuration Solutions

Summary: This post discusses high-availability cluster solutions in general and then walks through several MongoDB high-availability cluster configurations.

Introduction

High Availability (HA) refers to the improvement of system and app availability by minimizing the downtime caused by routine maintenance operations (planned) and sudden system crashes (unplanned).
This post first covers high-availability cluster solutions in general, and then describes several MongoDB high-availability cluster configurations.
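To put the idea of minimizing downtime into numbers, here is a minimal sketch that converts an availability target into the downtime it allows per year. It assumes availability is expressed as a percentage of uptime over a 365-day year; the function name `annual_downtime_minutes` is ours and purely illustrative.

```python
def annual_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime, in minutes, allowed by an availability target."""
    minutes_per_year = 365 * 24 * 60  # assumption: a non-leap year
    return minutes_per_year * (1 - availability_percent / 100)


for target in (99.0, 99.9, 99.99):
    print(f"{target}% availability -> {annual_downtime_minutes(target):.2f} minutes of downtime per year")
```

Each additional "nine" cuts the allowed downtime by a factor of ten, which is why planned maintenance has to be counted alongside unplanned crashes.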

High-availability Cluster Solutions

The high availability of computer systems has different manifestations at different levels:

(1) Network high availability
Network redundancy technology has improved steadily alongside the rapid development of network storage. High network availability is a key part of improving the availability of IT systems. Note that the high availability and the high reliability of a network are different properties. We achieve high availability of a network through equipment redundancy, that is, by deploying redundant network equipment such as redundant switches and routers.

(2) Server high availability
Primarily, we achieve the high availability of servers using server cluster software or high availability software.

(3) Storage high availability
Storage high availability refers to achieving high availability of storage using software or hardware technologies. The main technical indicators are the storage switching, data replication, and data snapshot features. When a storage device fails, another spare storage device can quickly take over to continue the storage service without interruptions.

MongoDB High-availability Cluster Configuration

"High-availability cluster" is often shortened to "HA cluster".
A cluster is a set of computers that together provide users with a set of network resources as a single whole.
The individual computer systems are the nodes of the cluster.
Building a high-availability cluster requires a sensible assignment of roles to the computers involved, as well as mechanisms for data recovery and consistency. The primary approaches are as follows:

(1) Master-slave Approach (Asymmetric)

While the master node is running, the slave node stays in a monitoring and standby state; when the master node fails, the slave node takes over all the tasks of the master node. After the master node recovers, the services switch back to it either automatically or manually, depending on the user's settings. Data consistency is ensured through a shared storage system.
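The failover and failback behavior described above can be sketched as a small state machine. This is a toy model under our own assumptions, not how any particular HA product is implemented; `Node`, `MasterSlavePair`, and `auto_failback` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    healthy: bool = True


class MasterSlavePair:
    """Toy asymmetric HA pair: the slave serves only while the master is down."""

    def __init__(self, master: Node, slave: Node, auto_failback: bool = True):
        self.master = master
        self.slave = slave
        self.auto_failback = auto_failback  # hypothetical user setting
        self.active = master

    def check(self) -> Node:
        """Run one monitoring cycle and return the node now serving requests."""
        if not self.master.healthy:
            # Failover: the standby takes over all of the master's tasks.
            self.active = self.slave
        elif self.active is self.slave and self.auto_failback:
            # Automatic failback: the recovered master resumes service.
            self.active = self.master
        return self.active

    def manual_failback(self) -> Node:
        """Switch back to the master on request, but only if it is healthy."""
        if self.master.healthy:
            self.active = self.master
        return self.active
```

With `auto_failback=False`, the slave keeps serving after the master recovers until an operator calls `manual_failback()`, which mirrors the manual option.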

(2) Dual-machine Running Approach (Dual Active Mode)

Two master nodes run their services at the same time and monitor each other's status. When either master node fails, the other immediately takes over all of its tasks so that services continue without interruption. Key data of the application service system is stored in a shared storage system.

(3) Cluster Running Approach (Multi-active mode)

Multiple master nodes work together, each running one or several services, and each service has one or more backup nodes defined for it. When any master node fails, the backup nodes take over the services that were running on it.
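As a rough illustration of the multi-active mode, the sketch below keeps a per-service backup node and moves services off a failed node. The service names, node names, and the `reassign_services` helper are all hypothetical, and each service is given a single backup to keep the example short.

```python
def reassign_services(assignments, backups, failed_node):
    """Return a new service-to-node map with failed_node's services on their backups."""
    return {
        service: (backups[service] if node == failed_node else node)
        for service, node in assignments.items()
    }


# Hypothetical layout: which node currently runs each service, and its backup.
assignments = {"orders": "node-a", "search": "node-b", "billing": "node-a"}
backups = {"orders": "node-b", "search": "node-c", "billing": "node-c"}

after_failure = reassign_services(assignments, backups, "node-a")
print(after_failure)
```

Services that were not running on the failed node stay where they are, so only the affected part of the workload moves.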

The practices of MongoDB cluster configuration also follow these schemes, mainly involving the master-slave structure, replica set and sharding approaches.

Master-slave Structure

[Figure 1: master-slave structure]

Generally, we use the master-slave architecture for backup or read-write splitting. It comes in two forms: one master with one slave, and one master with multiple slaves.

There are primarily two roles:

(1) Master

The master node can both read and write data. When data is modified, the changes are recorded in the oplog and synchronized to all the connected slave nodes.

(2) Slave

The slave node can only read data, but not write data. It automatically synchronizes data from the master node.
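The two roles above can be sketched with an in-memory operation log. This is a simplified model of log-based replication, not MongoDB's actual oplog format; the `Master` and `Slave` classes and their fields are assumptions made for the example.

```python
class Master:
    """Accepts writes and records each one in an ordered operation log."""

    def __init__(self):
        self.data = {}
        self.oplog = []  # ordered (key, value) write operations

    def write(self, key, value):
        self.data[key] = value
        self.oplog.append((key, value))


class Slave:
    """Read-only copy that replays the master's log entries it has not seen yet."""

    def __init__(self):
        self.data = {}
        self.applied = 0  # number of log entries already replayed

    def sync_from(self, master):
        for key, value in master.oplog[self.applied:]:
            self.data[key] = value
        self.applied = len(master.oplog)

    def read(self, key):
        return self.data.get(key)
```

Because the slave remembers how far it has replayed, each sync only applies the new entries, and replaying them in order leaves it with the same data as the master.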

We do not recommend the master-slave architecture for MongoDB, because there is no automatic failover when the master node fails. The replica set solution introduced later is the better choice. Master-slave replication is only worth considering when more than 50 replica nodes are needed, which is more than a replica set supports, and deployments rarely need that many nodes. Note also that master-slave replication was removed in MongoDB 4.0.

In addition, the MongoDB master-slave architecture does not support a chained structure: a slave node can only connect directly to the master node. By contrast, the Redis master-slave architecture supports chaining and allows a slave node to connect to another slave node, making it the slave of a slave.

Replica Set

The replica set mechanism of MongoDB has two main purposes:

• One is for data redundancy for failure recovery. When the hardware fails, or the node is down for other reasons, you can use a replica for recovery.

• The other purpose is read-write splitting. Read requests are routed to the replicas to reduce the read pressure on the primary node.
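Read-write splitting is usually configured on the client through the connection string. The sketch below only builds such a string with the standard library; `replicaSet` and `readPreference` are standard MongoDB connection string options, while the host names, the replica set name `rs0`, and the `build_mongodb_uri` helper are placeholders we made up.

```python
from urllib.parse import urlencode


def build_mongodb_uri(hosts, replica_set, read_preference="primary"):
    """Build a replica set connection string with the given read preference."""
    options = urlencode({"replicaSet": replica_set, "readPreference": read_preference})
    return f"mongodb://{','.join(hosts)}/?{options}"


# Hypothetical hosts; secondaryPreferred sends reads to secondaries when available.
uri = build_mongodb_uri(
    ["db1.example.net:27017", "db2.example.net:27017", "db3.example.net:27017"],
    "rs0",
    read_preference="secondaryPreferred",
)
print(uri)
```

Keep in mind that reads from a secondary may return slightly stale data, so this trade-off suits workloads that tolerate replication lag.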

1. Replica set with the primary and secondary nodes

[Figure 2: replica set with primary and secondary nodes]

The replica set is a set of MongoDB instances which share the same data content. It primarily involves three roles:

(a) Primary node

The primary node receives all the write requests and then synchronizes the changes to all the secondary nodes. A replica set can have only one primary node. When the primary fails, the remaining secondary nodes and the arbiter node elect a new primary. By default, the primary node also handles read requests. If you want read requests to go to a secondary node, you need to modify the connection configuration on the client.

(b) Secondary nodes

A secondary node maintains the same data set as the primary node. When the primary node fails, the secondary nodes participate in the election of a new primary node.

(c) Arbiter nodes

An arbiter node does not store data and cannot itself be elected primary, but it does vote in primary elections. Because it holds no data, an arbiter runs with minimal hardware resources and lowers the storage requirements of the replica set. However, in a production environment the arbiter should not be deployed on the same server as a data node.

Note: The number of voting members in a replica set should be odd, to avoid a tie in the vote when a primary node is elected.
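The reason behind the odd-number rule is simple majority arithmetic, which the following sketch works through. It assumes that a primary can only be elected while a strict majority of voting members is reachable; the function names are ours.

```python
def majority(voting_members: int) -> int:
    """Smallest number of votes that is more than half of the voting members."""
    return voting_members // 2 + 1


def fault_tolerance(voting_members: int) -> int:
    """How many voting members can fail while a majority is still reachable."""
    return voting_members - majority(voting_members)


for n in range(2, 8):
    print(f"{n} voting members: majority {majority(n)}, tolerates {fault_tolerance(n)} failure(s)")
```

Going from three to four voting members raises the required majority without raising the number of failures the set can survive, so the extra even member adds cost but no extra resilience.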

(d) Primary node election

The service will remain unaffected if the secondary node goes down, but if the primary node goes down, the system will initiate a re-election of the primary node:

[Figure 3: primary node election]
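To make the election idea concrete, here is a heavily simplified sketch. It is not MongoDB's actual election protocol: it assumes that an election succeeds only when a majority of members is reachable, that arbiters vote but cannot win, and that the reachable data node with the most recent write is chosen. `Member`, `last_applied`, and `elect_primary` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str
    last_applied: int = 0   # hypothetical counter of the last write applied
    arbiter: bool = False
    up: bool = True


def elect_primary(members):
    """Return the name of the new primary, or None if no election can succeed."""
    voters = [m for m in members if m.up]
    if len(voters) <= len(members) // 2:
        return None  # no majority of votes is reachable
    candidates = [m for m in voters if not m.arbiter]
    if not candidates:
        return None  # only arbiters are reachable, and they hold no data
    return max(candidates, key=lambda m: m.last_applied).name


members = [
    Member("node-a", last_applied=10, up=False),  # the failed primary
    Member("node-b", last_applied=9),
    Member("arbiter", arbiter=True),
]
print(elect_primary(members))
```

In this example the surviving data node and the arbiter together form a majority of the three members, so the data node becomes the new primary.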

2. Establish a replica set using the arbiter node

The replica set comprises an even number of data nodes, plus an arbiter node:

[Figure 4: replica set with an even number of data nodes plus an arbiter node]
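A replica set of this shape is described by a configuration document such as the one that `rs.initiate()` accepts in the MongoDB shell. The sketch below only builds that document as a Python dictionary and prints it as JSON; the `_id`, `members`, `host`, and `arbiterOnly` fields are standard, while the set name, the host names, and the `replica_set_config` helper are placeholders.

```python
import json


def replica_set_config(name, data_hosts, arbiter_host):
    """Build a replica set configuration with the data nodes plus one arbiter."""
    members = [{"_id": i, "host": host} for i, host in enumerate(data_hosts)]
    members.append({"_id": len(data_hosts), "host": arbiter_host, "arbiterOnly": True})
    return {"_id": name, "members": members}


# Hypothetical hosts: two data nodes and one arbiter give three voting members.
config = replica_set_config(
    "rs0",
    ["db1.example.net:27017", "db2.example.net:27017"],
    "arb.example.net:27017",
)
print(json.dumps(config, indent=2))
```

Two data nodes plus the arbiter give an odd number of voters while storing only two copies of the data.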

Sharding Technology

When the amount of data is relatively large, we need to split it into shards that run on different machines to reduce the CPU, memory, and I/O pressure on each one. Sharding here refers to database sharding technology.

MongoDB sharding is similar to the horizontal and vertical splitting used with MySQL. A database can scale in two main ways: vertical expansion and horizontal sharding.
Vertical expansion means scaling up a single server by adding more CPU, memory, or disk space.

Horizontal sharding splits the data across multiple servers, which together provide a single, uniform service as a cluster, as shown below:
[Figure 5: horizontal sharding across a cluster]
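One common way to spread data horizontally is to hash a key and use the result to pick a shard. The sketch below shows that general idea only; it is not MongoDB's hashed sharding, which uses its own hash function and chunk ranges. The `shard_for` function and the key format are assumptions for illustration.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index by hashing it, so keys spread across shards."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Count how 1,000 hypothetical user keys are distributed over three shards.
counts = [0, 0, 0]
for i in range(1000):
    counts[shard_for(f"user-{i}", 3)] += 1
print(counts)
```

The same key always maps to the same shard, so a router can locate a document without asking every machine. A plain modulo scheme like this one reshuffles most keys when the shard count changes, which is one reason real systems use range or chunk metadata instead.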

(1) MongoDB sharding architecture

[Figure 6: MongoDB sharding architecture]

(2) Roles in MongoDB sharding architecture
A. Data shards

A data shard stores a portion of the data. It can be a standalone MongoDB instance or a replica set.

In a production environment, each shard is generally a replica set, which prevents a single point of failure within the shard and keeps its data highly available and consistent. Among all the shards, one acts as the primary shard, which holds the unsharded collections:

[Figure 7: primary shard holding the unsharded collections]

B. Query routers

A router is a mongos instance. Clients connect directly to mongos, which routes read and write requests to the appropriate shard.

A sharded cluster can have one or more mongos instances; running several spreads the load of client requests.

C. Configuration servers

A configuration server saves the metadata of the cluster, including the routing rules for each shard.
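The interplay between the router and the configuration metadata can be sketched as a range lookup. This toy model assumes the metadata is a sorted list of chunk lower bounds mapped to shard names; the real config server schema is more involved, and `ConfigMetadata`, `Router`, and the shard names are hypothetical.

```python
import bisect


class ConfigMetadata:
    """Toy stand-in for config server metadata: chunk lower bound -> shard."""

    def __init__(self, chunks):
        # chunks: list of (inclusive_lower_bound, shard_name), sorted by bound
        self.bounds = [lower for lower, _ in chunks]
        self.shards = [shard for _, shard in chunks]

    def shard_for(self, shard_key):
        index = bisect.bisect_right(self.bounds, shard_key) - 1
        if index < 0:
            raise KeyError(f"no chunk covers shard key {shard_key!r}")
        return self.shards[index]


class Router:
    """Toy stand-in for mongos: looks up the target shard for each request."""

    def __init__(self, metadata):
        self.metadata = metadata

    def route(self, shard_key):
        return self.metadata.shard_for(shard_key)


router = Router(ConfigMetadata([(0, "shardA"), (1000, "shardB"), (2000, "shardC")]))
for key in (999, 1000, 5000):
    print(key, "->", router.route(key))
```

Keeping the routing table separate from the router is what lets several routers run side by side: they all consult the same metadata.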

Conclusion

To make sure that a production cluster has no single point of failure, it is important to build high availability into the sharded cluster. This post introduced the availability concerns of MongoDB deployments, highlighted potential failure scenarios and how each configuration handles them, and compared the distinguishing features of the high-availability cluster solutions.

It also explored the sharding technology and described the roles in the MongoDB sharding architecture: data shards, query routers, and configuration servers.
