Last month, I found myself tasked with setting up a production-grade MongoDB cluster for our growing application. After weeks of testing, tweaking, and occasionally pulling my hair out, I’ve got a rock-solid setup that I want to share. This isn’t just another tutorial - it’s my battle-tested approach that’s currently running in production.

Starting With the Basics: Understanding MongoDB Clusters

Before I dive into the setup, let me share what I learned about MongoDB clustering options. When I started this project, I had to make a crucial decision between replica sets and sharded clusters. Here’s what my research and experience taught me:

Replica Sets: The High-Availability Solution

Think of a replica set as your database’s insurance policy. It’s like having multiple copies of your data, each ready to step in if something goes wrong. Here’s what I love about replica sets:

  • It’s essentially MongoDB’s way of saying “I’ve got your back” - if your primary server fails, another one takes over automatically
  • Your application keeps running even if a server decides to take a vacation
  • You can spread your read operations across multiple servers (this saved us during high-traffic periods)
  • It’s significantly easier to manage compared to sharded clusters

Sharded Clusters: The Scale-Out Beast

Now, sharded clusters are a different animal altogether. Imagine splitting your data across multiple servers, each handling its own piece of the puzzle. Here’s when you might want to go this route:

  • Your data is growing faster than your biggest server can handle
  • You need to write data faster than a single server can manage
  • You want to spread your data across different geographical locations
  • You’re dealing with truly massive datasets (we’re talking terabytes)

I chose a replica set for our setup because our data size was manageable, but we needed rock-solid reliability. Let me walk you through exactly how I set it up.

My Production Setup Journey

The Hardware Foundation

I went with three Hetzner dedicated servers. Why Hetzner? Good price-to-performance ratio and reliable network - pretty crucial when you’re building a distributed system. Each server is identical, which makes management way simpler.

Setting Up the File Structure

First thing I did was create a clean workspace on each server:

mkdir -p ~/mongodb
cd ~/mongodb

Creating a Production-Grade Docker Setup

Here’s the docker-compose.yml I landed on after several iterations. I’ll explain the important bits:

services:
  mongo:
    image: mongo:7.0
    command: ["mongod", "--config", "/etc/mongod.conf", "--replSet", "rs0", "--bind_ip_all"]
    ports:
      - 27017:27017
    environment:
      MONGO_INITDB_ROOT_USERNAME: root
      MONGO_INITDB_ROOT_PASSWORD: your_secure_password
      MONGO_INITDB_DATABASE: admin
    volumes:
      - mongo_data:/data/db
      - mongo_config:/data/configdb
      - ./mongod.conf:/etc/mongod.conf:ro
      - ./mongodb-keyfile:/data/configdb/mongodb-keyfile:ro
      - mongo_logs:/var/log/mongodb
    user: "999:999"
    ulimits:
      nofile:
        soft: 1048576
        hard: 1048576
      nproc:
        soft: 1048576
        hard: 1048576
      memlock:
        soft: -1
        hard: -1
    deploy:
      resources:
        limits:
          cpus: '75'
          memory: 110G
        reservations:
          cpus: '75'
          memory: 110G
    sysctls:
      net.core.somaxconn: 65535
      net.ipv4.tcp_max_syn_backlog: 65535
      net.ipv4.tcp_fin_timeout: 30
      net.ipv4.tcp_keepalive_time: 300
      net.ipv4.tcp_keepalive_intvl: 30
      net.ipv4.tcp_keepalive_probes: 5
    networks:
      - mongodb_network

networks:
  mongodb_network:
    driver: bridge
    driver_opts:
      com.docker.network.driver.mtu: 9000

volumes:
  mongo_data:
  mongo_config:
  mongo_logs:

Those resource limits? They’re not random numbers. I spent time monitoring our application’s behavior and adjusted them based on real usage patterns.
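Before trusting those reservations, it's worth checking what the host can actually provide. A quick sanity-check sketch using standard Linux tools (the thresholds match my compose file; adjust them to yours):

```shell
# Compare host resources against the compose reservations (75 CPUs, 110G).
cores=$(nproc)
mem_gb=$(awk '/MemTotal/ {printf "%d", $2/1024/1024}' /proc/meminfo)
open_files=$(ulimit -n)

echo "cores=${cores} mem_gb=${mem_gb} open_files=${open_files}"

[ "$cores" -ge 75 ]   || echo "WARNING: fewer than 75 cores on this host"
[ "$mem_gb" -ge 110 ] || echo "WARNING: less than 110G RAM on this host"
```

If the host is smaller than the reservations, Docker will refuse to schedule the container (or the kernel OOM killer will visit later), so it's cheaper to catch the mismatch here.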

MongoDB Configuration That Actually Works

Here’s my mongod.conf file, battle-tested in production:

net:
  port: 27017
  bindIp: 0.0.0.0
  maxIncomingConnections: 300000

security:
  authorization: enabled
  keyFile: /data/configdb/mongodb-keyfile

replication:
  replSetName: rs0

storage:
  wiredTiger:
    engineConfig:
      cacheSizeGB: 100
      journalCompressor: zstd
    collectionConfig:
      blockCompressor: zstd

operationProfiling:
  mode: "off"   # quoted on purpose - bare off is parsed as a YAML boolean, which mongod rejects

setParameter:
  maxTransactionLockRequestTimeoutMillis: 5000
  transactionLifetimeLimitSeconds: 60

systemLog:
  destination: file
  path: "/var/log/mongodb/mongod.log"
  logAppend: true

processManagement:
  fork: false
  timeZoneInfo: /usr/share/zoneinfo

That 100GB cache size? It’s not arbitrary - it’s about 80% of our available RAM, which turned out to be the sweet spot for our workload. (MongoDB’s own default is more conservative: roughly 50% of RAM minus 1GB.)
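If you want to derive that number for your own hardware, the ratio is easy to compute from /proc/meminfo (a small sketch using the same 80% ratio as the config above):

```shell
# Compute ~80% of host RAM in GB, matching the cacheSizeGB ratio above.
total_kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
cache_gb=$(( total_kb * 80 / 100 / 1024 / 1024 ))
echo "suggested cacheSizeGB: ${cache_gb}"
```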

Security: Because Sleep is Nice

Security was a major concern. I generated a keyfile for internal authentication:

openssl rand -base64 756 > mongodb-keyfile
chmod 400 mongodb-keyfile

This keyfile acts like a shared secret between our MongoDB instances. Without it, random MongoDB instances can’t join our replica set. It’s like having a secret handshake. 🤝 Generate the keyfile once, then copy that same file to all three servers.
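Since the compose file mounts the keyfile read-only and runs mongod as uid 999, ownership and mode matter on every node. Here's the routine I use, sketched out (the scp targets are the DNS names from the next section; adjust user and paths to your environment):

```shell
# Copy the ONE generated keyfile to the other nodes (same file everywhere):
#   scp mongodb-keyfile root@mongo2.yourdomain.com:~/mongodb/
#   scp mongodb-keyfile root@mongo3.yourdomain.com:~/mongodb/
# And on every server, let the container user (999 in docker-compose.yml) read it:
#   sudo chown 999:999 mongodb-keyfile

# Sanity check before `docker compose up`: non-empty file with mode 400.
check_keyfile() {
  f="$1"
  [ -s "$f" ] || { echo "missing or empty: $f"; return 1; }
  mode=$(stat -c %a "$f")
  [ "$mode" = "400" ] || { echo "bad permissions ($mode) on $f"; return 1; }
  echo "keyfile OK"
}
```

Running `check_keyfile mongodb-keyfile` on each server takes seconds and saves a confusing "permission denied" loop in the mongod logs later.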

DNS Setup: Making It All Connect

Setting up proper DNS records is crucial for MongoDB cluster operation. I’ll walk through how to configure both A records and SRV records in Cloudflare (the process is similar for other DNS providers).

Setting up A Records

First, we need to create A records for each MongoDB node:

  1. Log into your DNS provider (Cloudflare in our case)
  2. Create the following A records:
    # Primary Node
    Type: A
    Name: mongo1
    Content: <Your-Server-1-IP>
    TTL: Auto
    Proxy status: DNS only
    
    # Secondary Node 1
    Type: A
    Name: mongo2
    Content: <Your-Server-2-IP>
    TTL: Auto
    Proxy status: DNS only
    
    # Secondary Node 2
    Type: A
    Name: mongo3
    Content: <Your-Server-3-IP>
    TTL: Auto
    Proxy status: DNS only
    

Important: Set Proxy status to “DNS only” (gray cloud in Cloudflare) rather than proxied. MongoDB nodes should connect directly to each other.

Setting up SRV Records

Next, create SRV records for automatic MongoDB discovery. The SRV records help MongoDB drivers automatically discover all nodes in your cluster:

  1. Create SRV records for each node:
    # Primary Node SRV Record
    Type: SRV
    Name: _mongodb._tcp.db
    Target: mongo1.yourdomain.com
    Priority: 0
    Weight: 5
    Port: 27017
    TTL: Auto
    
    # Secondary Node 1 SRV Record
    Type: SRV
    Name: _mongodb._tcp.db
    Target: mongo2.yourdomain.com
    Priority: 0
    Weight: 5
    Port: 27017
    TTL: Auto
    
    # Secondary Node 2 SRV Record
    Type: SRV
    Name: _mongodb._tcp.db
    Target: mongo3.yourdomain.com
    Priority: 0
    Weight: 5
    Port: 27017
    TTL: Auto
    

Record Details Explained

  • A Records:

    • Point directly to your server IP addresses
    • Enable direct node-to-node communication
    • Should be DNS-only (not proxied) for proper cluster operation
  • SRV Records:

    • _mongodb._tcp prefix is required for MongoDB service discovery
    • db subdomain can be whatever you choose
    • Priority of 0 means equal priority for all nodes
    • Weight of 5 ensures equal load distribution
    • Port 27017 is MongoDB’s default port

Testing DNS Configuration

After setting up your records, verify them using these commands:

# Test A records
dig mongo1.yourdomain.com
dig mongo2.yourdomain.com
dig mongo3.yourdomain.com

# Test SRV records
dig srv _mongodb._tcp.db.yourdomain.com

The SRV lookup should return all three nodes with their priorities and weights.

Bringing It All Together

After setting up each server, I connected to the first node with mongosh and initialized the replica set:

rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "mongo1.yourdomain.com:27017", priority: 1 },
    { _id: 1, host: "mongo2.yourdomain.com:27017", priority: 0.5 },
    { _id: 2, host: "mongo3.yourdomain.com:27017", priority: 0.5 }
  ]
})

The Connection String That Makes It Work

Here’s how our applications connect to this setup:

mongodb+srv://root:your_password@db.yourdomain.com/your_database?replicaSet=rs0&readPreference=secondaryPreferred&ssl=false&authSource=admin
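One gotcha with that string: if the password contains URI-reserved characters (@, :, /, ?, #, %), it has to be percent-encoded or the driver will mis-parse the URI. A small helper I keep around (assumes python3 on the box; the password shown is made up for illustration):

```shell
# Percent-encode a password for safe use inside a MongoDB connection string.
urlencode() {
  python3 -c 'import sys, urllib.parse; print(urllib.parse.quote(sys.argv[1], safe=""))' "$1"
}

PASS=$(urlencode 'p@ss/word!')   # hypothetical password -> p%40ss%2Fword%21
echo "mongodb+srv://root:${PASS}@db.yourdomain.com/your_database?replicaSet=rs0&readPreference=secondaryPreferred&ssl=false&authSource=admin"
```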

Real-World Performance Notes

After running this setup for a while, I’ve noticed:

  • Write operations consistently complete in under 50ms
  • Read operations from secondaries average around 20ms
  • Failovers, when they happen, complete in under 10 seconds
  • Our application hasn’t experienced any downtime due to database issues

Monitoring and Maintenance

I regularly check the replica set’s health using:

rs.status()

This command gives me everything I need to know about:

  • Each member’s state
  • Replication lag
  • Election status
  • Synchronization health
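The full rs.status() document is long, so I pipe it through a small summarizer instead of eyeballing it. This is a sketch, not battle-tested tooling: it assumes python3 on the host and the shape of mongosh's JSON.stringify() output, where each member's optimeDate serializes as an ISO timestamp:

```shell
# Summarize member state and replication lag from rs.status().
# Usage on any node:
#   docker compose exec mongo mongosh -u root -p your_secure_password --quiet \
#     --eval 'JSON.stringify(rs.status())' | summarize_rs
summarize_rs() {
  python3 -c '
import json, sys
from datetime import datetime

status = json.load(sys.stdin)

def ts(m):
    return datetime.fromisoformat(m["optimeDate"].replace("Z", "+00:00"))

primary = next(m for m in status["members"] if m["stateStr"] == "PRIMARY")
for m in status["members"]:
    lag = (ts(primary) - ts(m)).total_seconds()
    print(m["name"] + ": " + m["stateStr"] + " lag=" + str(int(lag)) + "s")
'
}
```

One line per member, with lag measured against the primary's optime - enough to spot a lagging secondary at a glance.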

Host System Optimization

Before deploying our MongoDB cluster, it’s crucial to optimize each host server for high-performance database operations. I’ve created a comprehensive optimization script that configures various system parameters for optimal MongoDB performance.

Script Overview

https://gist.github.com/polymatx/cc067da36aa9293d42839f4fe9f09673

What’s Next?

I’m currently exploring:

  • Automated backup strategies
  • Monitoring solutions
  • Performance optimization techniques
  • A sharding strategy for when we outgrow the replica set

I’ll share my findings in future posts as I implement and test these improvements.

Conclusion

Setting up a MongoDB cluster isn’t a one-size-fits-all process. What I’ve shared here is what worked for our specific needs - a balance of reliability, performance, and manageability. Your mileage may vary, but these principles should give you a solid foundation to build on.

Remember: the best database setup is the one that lets you sleep at night without worrying about data loss or downtime. This setup has given me that peace of mind, and I hope it helps you achieve the same.

Feel free to reach out if you have questions about any part of this setup. I’m always happy to help fellow developers avoid the pitfalls I encountered along the way.