Skip to content

MongoDB Replication (Replica Sets)

MongoDB's replication feature allows us to create multiple copies of data to improve availability and reliability. A replica set is a group of MongoDB servers that includes a primary server and multiple secondary servers. The primary server handles all write operations, while the secondary servers replicate data from the primary server.

Basic Concepts

Replica Set Structure

A replica set typically consists of three types of servers:

  1. Primary: Handles all write operations and syncs data to secondary servers.
  2. Secondary: Replicates data from the primary server and handles read operations.
  3. Arbiter: Does not store data but only participates in the election of a new primary server.

Data Synchronization

When the primary server receives a write operation, it records the operation in the operation log (Oplog). Secondary servers periodically replicate the operation log from the primary server and execute these operations, thereby achieving data synchronization.

Failover

When the primary server fails, the replica set automatically performs a failover to elect a new primary server. The election process involves arbiters to ensure the validity of the election result.

Configuring a Replica Set

Initializing a Replica Set

javascript
// Connect to the primary server
mongo --host primary.example.com

// Initialize the replica set
rs.initiate({
  _id: "myReplicaSet",
  members: [
    { _id: 0, host: "primary.example.com:27017" },
    { _id: 1, host: "secondary1.example.com:27017" },
    { _id: 2, host: "secondary2.example.com:27017" }
  ]
})

Adding Secondary Servers

javascript
// Add a secondary server
rs.add("secondary3.example.com:27017")

// Add an arbiter
rs.addArb("arbiter.example.com:27017")

Checking Replica Set Status

javascript
// Check the status of the replica set
rs.status()

Using a Replica Set

Connecting to a Replica Set

javascript
// Connect to the replica set
mongo --host myReplicaSet/primary.example.com,secondary1.example.com,secondary2.example.com

Read Operations

By default, all read operations are handled by the primary server. We can specify the read preference using the readPreference option.

javascript
// Read from the primary server (default)
db.collection.find().readPref("primary")

// Read from primary or secondary servers
db.collection.find().readPref("primaryPreferred")

// Read from secondary servers
db.collection.find().readPref("secondary")

// Read from secondary servers, or primary if no secondary is available
db.collection.find().readPref("secondaryPreferred")

// Read from the nearest server
db.collection.find().readPref("nearest")

Write Operations

All write operations are handled by the primary server. When the primary server processes a write operation, it records the operation in the operation log and syncs it to the secondary servers.

Disaster Recovery

Manual Failover

javascript
// Perform a manual failover
rs.stepDown(300)

Recovering the Primary Server

When the original primary server recovers, it will join the replica set as a secondary server.

Performance Considerations

  1. Network Latency: The performance of a replica set depends on network latency. If network latency is high, data synchronization and failover times will be longer.
  2. Server Resources: The performance of a replica set depends on the resources of the servers. We should ensure that the servers have sufficient memory, CPU, and disk space.
  3. Oplog Size: The size of the operation log affects the performance of data synchronization. We can adjust the size of the operation log using the oplogSizeMB option.

Common Issues

Replication Lag

Replication lag is the time difference between the secondary servers and the primary server. If replication lag is too large, it may cause data inconsistency.

Election Failures

Election failures can cause the replica set to fail to work properly. We should ensure that there are enough arbiters and secondary servers participating in the election.

Summary

MongoDB's replication feature allows us to create multiple copies of data to improve availability and reliability. A replica set is a group of MongoDB servers that includes a primary server and multiple secondary servers. By using replica sets, we can achieve high data availability and disaster recovery. At the same time, we need to pay attention to performance factors such as network latency, server resources, and operation log size to ensure the efficient operation of the replica set.

Content is for learning and research only.