Migrate Ops Manager to a new Kubernetes Cluster

Recently we wanted to deploy MongoDB Ops Manager and the MongoDB deployments in different data centers to improve disaster recovery. If they are deployed in the same data center and that data center fails, you cannot restore the backup data to a new cluster, because both Ops Manager and the deployments are unavailable.

Of course, we don't want to re-deploy the existing MongoDB deployments in Kubernetes. But how do we make the deployments send data to the new Ops Manager URL?

How Ops Manager Works

Previously I thought Ops Manager and the deployments were tightly coupled, which is totally wrong. Actually they are independent of each other and can be deployed independently.

Here we deploy both in Kubernetes clusters. However, it's also fine to deploy one in a VM and the other in Kubernetes. MongoDB deployments use the MongoDB Agent to send data to Ops Manager, and Ops Manager uses a groupId and an apiKey to verify the deployment's identity and keep track of its status.

Therefore, the problem becomes how to set the new Ops Manager URL, project ID and agent API key for the old MongoDB deployment.

Let's exec into a deployment pod and look at its processes:

I have no name!@shard-1-0:/$ ps aux|grep mongod
2000         51  6.1  0.0 1769520 55632 ?       Sl   09:32   0:08 /mongodb-automation/files/mongodb-mms-automation-agent -mmsGroupId 603dfcb99870377752adff58 -pidfilepath /mongodb-automation/mongodb-mms-automation-agent.pid -maxLogFileDurationHrs 24 -logLevel INFO -healthCheckFilePath /var/log/mongodb-mms-automation/agent-health-status.json -useLocalMongoDbTools -mmsBaseUrl http://ops-manager-svc.mongodb.svc.cluster.local:8080 -sslRequireValidMMSServerCertificates -mmsApiKey 603dfcb99870377752adff5ef7c231f11b5xxxxx -logFile /var/log/mongodb-mms-automation/automation-agent.log
2000         57  0.0  0.0  10752   860 ?        S    09:32   0:00 tail -F /var/log/mongodb-mms-automation/automation-agent-verbose.log
2000         60  0.0  0.0  10752   868 ?        S    09:32   0:00 tail -F /var/log/mongodb-mms-automation/automation-agent-stderr.log
2000         67  0.0  0.0  10752   844 ?        S    09:32   0:00 tail -F /var/log/mongodb-mms-automation/mongodb.log
2000         70  0.0  0.0  16532  1264 ?        S    09:32   0:00 jq --unbuffered --null-input -c --raw-input inputs | {"logType": "mongodb", "contents": .}
2000        201  2.6  0.1 2086868 107356 ?      Sl   09:32   0:02 /var/lib/mongodb-mms-automation/mongodb-linux-x86_64-4.4.3-ent/bin/mongod -f /data/automation-mongod.conf

How does the pod get mmsBaseUrl, mmsApiKey and mmsGroupId? The answer is that the MongoDB Kubernetes Operator sets environment variables on the pod. Explore the pod's YAML in the StatefulSet; it has the following section in spec.template.spec.containers:

env:
  - name: AGENT_API_KEY
    valueFrom:
      secretKeyRef:
        name: 603dfcb99870377752adff58-group-secret
        key: agentApiKey
  - name: AGENT_FLAGS
    value: '-logFile,/var/log/mongodb-mms-automation/automation-agent.log,'
  - name: BASE_URL
    value: 'http://ops-manager-svc.mongodb.svc.cluster.local:8080'
  - name: GROUP_ID
    value: 603dfcb99870377752adff58
  - name: LOG_LEVEL
  - name: SSL_REQUIRE_VALID_MMS_CERTIFICATES
    value: 'true'
  - name: USER_LOGIN
    value: admin
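
To double-check these values in your own cluster, you can read them back from the StatefulSet or from a running pod. This is only a quick sketch: the names shard-1, shard-1-0 and the namespace mongodb come from the example above, so adjust them to your deployment.

# Look at the env section of the shard's StatefulSet
$ kubectl get statefulset shard-1 -n mongodb -o yaml | grep -B 1 -A 3 BASE_URL

# Or read the resolved values straight from a running pod
$ kubectl exec shard-1-0 -n mongodb -- printenv BASE_URL GROUP_ID

# The agent API key is stored in the <groupId>-group-secret referenced above
$ kubectl get secret 603dfcb99870377752adff58-group-secret -n mongodb -o jsonpath='{.data.agentApiKey}' | base64 -d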

Steps to Migrate

Make Ops Manager Capable of Managing External Deployments (in the new cluster)

The tutorial configures the Ops Manager URL as the internal endpoint http://ops-manager-svc.mongodb.svc.cluster.local:8080, which is only accessible from within the same Kubernetes cluster. As a result, we must make the new Ops Manager publicly accessible first.

Refer to: Managing External MongoDB Deployments

Add the mms.centralUrl setting to spec.configuration in the new Ops Manager resource specification:

spec:
  configuration:
    mms.centralUrl: http://newopsmanager.com:8080/

Example Ops Manager YAML with resource configuration specified:

apiVersion: mongodb.com/v1
kind: MongoDBOpsManager
metadata:
  name: ops-manager
  namespace: mongodb
spec:
  # the version of Ops Manager distro to use
  version: 4.4.6
  configuration:
    mms.centralUrl: http://newopsmanager.com:8080/

  # the name of the secret containing admin user credentials.
  adminCredentials: ops-manager-admin-secret

  externalConnectivity:
    type: LoadBalancer

  # the Replica Set backing Ops Manager. 
  # appDB has the SCRAM-SHA authentication mode always enabled
  applicationDatabase:
    members: 3
    podSpec:
      cpu: "1"
      cpuRequests: "1"
      memory: "10Gi"
      memoryRequests: "10Gi"
      persistence:
        single:
          storage: 16Gi
      
  statefulSet:
    spec:
      template:
        spec:
          containers:
            - name: mongodb-ops-manager
              resources:
                requests:
                  cpu: "1"
                  memory: "15Gi"
                limits:
                  cpu: "1"
                  memory: "15Gi"
  backup:
    enabled: true
    headDB:
      storage: 16Gi
    statefulSet:
      spec:
        template:
          spec:
            containers:
              - name: mongodb-backup-daemon
                resources:
                  requests:
                    cpu: "1"
                    memory: "10Gi"
                  limits:
                    cpu: "1"
                    memory: "10Gi"

Edit Deployment Configuration (in the old cluster)

Now let's modify the MongoDB deployments in the old Kubernetes cluster.

Warning: Do this operation when traffic is low, because the deployment will be temporarily inaccessible for several minutes. Each pod of the cluster will be restarted to apply the new settings.

Create a new configmap new-ops-manager-connection in the old cluster:

$ kubectl create configmap new-ops-manager-connection --from-literal="baseUrl=http://newopsmanager.com:8080"  -n mongodb  
configmap/new-ops-manager-connection created
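
Besides baseUrl, the connection ConfigMap also accepts optional projectName and orgId keys if you want to control which project and organization the deployment is attached to; if they are omitted, the operator creates a project named after the resource. A quick way to verify the ConfigMap and, if needed, extend it (the orgId value below is a placeholder):

# Verify what the operator will read
$ kubectl get configmap new-ops-manager-connection -n mongodb -o yaml

# Optionally pin the project name and organization
$ kubectl patch configmap new-ops-manager-connection -n mongodb --type merge -p '{"data":{"projectName":"sharddb","orgId":"<new-org-id>"}}'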

Create a new Ops Manager admin secret new-ops-manager-admin-key in the old cluster; the key values must match the admin API key used in the new cluster:

$ kubectl create secret generic new-ops-manager-admin-key --from-literal="publicKey=xxx"  --from-literal="privateKey=xxx" -n mongodb  
secret/new-ops-manager-admin-key created
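
The publicKey/privateKey pair is the admin API key generated in the new Ops Manager (not the old one). As a quick sanity check before moving on, confirm the secret carries both keys:

# List the data keys stored in the secret
$ kubectl get secret new-ops-manager-admin-key -n mongodb -o jsonpath='{.data}'

# Decode one field to double-check its value
$ kubectl get secret new-ops-manager-admin-key -n mongodb -o jsonpath='{.data.publicKey}' | base64 -d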

Finally, modify the deployment's spec.opsManager and spec.credentials fields from:

opsManager:
  configMapRef:
    name: ops-manager-connection
credentials: ops-manager-admin-key

to:

opsManager:
  configMapRef:
    name: new-ops-manager-connection
credentials: new-ops-manager-admin-key

Apply the YAML and wait for the cluster to reconcile back to the Running state. The new Ops Manager will automatically create a new project to manage the deployment.
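
A sketch of this step, assuming the modified manifest is saved as sharddb.yaml; the resource name sharddb comes from the example at the end of this post, and "mdb" is the short name of the MongoDB custom resource.

# Apply the updated deployment in the old cluster
$ kubectl apply -f sharddb.yaml

# Watch the resource phase until it returns to Running
$ kubectl get mdb sharddb -n mongodb -w

# The pods are restarted one by one to pick up the new settings
$ kubectl get pods -n mongodb -w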

Tips:

  • The project ID will change after the migration (see the check after this list)
  • TLS will not be affected during the migration
  • If the deployment had auth enabled, after the migration Ops Manager will show the "AUTH" status as "Disabled". Don't worry: the existing users are kept in the deployment (online traffic will not be affected). We only need to make the new Ops Manager and the deployment status consistent. Import the users through "Security -> MongoDB Users", enable auth from the UI, and it's done. A short interruption may happen while the auth status transitions.
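
To confirm a restarted pod is now reporting to the new Ops Manager (and to see the new project ID mentioned above), re-check the agent's environment and process arguments; the pod name shard-1-0 again comes from the earlier example.

# The operator-injected env vars should now point at the new Ops Manager and the new project
$ kubectl exec shard-1-0 -n mongodb -- printenv BASE_URL GROUP_ID

# The running automation agent should carry the new -mmsBaseUrl flag
$ kubectl exec shard-1-0 -n mongodb -- ps aux | grep -o 'mmsBaseUrl [^ ]*'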

Example sharded cluster deployment YAML with resource configuration set via podTemplate:

apiVersion: mongodb.com/v1
kind: MongoDB
metadata:
  name: sharddb
  namespace: mongodb
spec:
  shardCount: 2
  mongodsPerShardCount: 3
  mongosCount: 2
  configServerCount: 3
  version: 4.4.3-ent
  setFeatureCompatibilityVersion: "4.4"
  type: ShardedCluster
  
  opsManager:
    configMapRef:
      name: new-ops-manager-connection
  credentials: new-ops-manager-admin-key
  exposedExternally: false
  
  shardPodSpec:
    podTemplate:
      spec:
        initContainers:
          - name: mongodb-enterprise-init-database
            resources:
              requests:
                cpu: "2"
                memory: "15Gi"
              limits:
                cpu: "4"
                memory: "15Gi"
        containers:
          - name: mongodb-enterprise-database
            resources:
              requests:
                cpu: "2"
                memory: "15Gi"
              limits:
                cpu: "4"
                memory: "15Gi"
    persistence:
      single:
        storage: 30Gi
        storageClass: managed-premium
  configSrvPodSpec:
    podTemplate:
      spec:
        initContainers:
          - name: mongodb-enterprise-init-database
            resources:
              requests:
                cpu: "2"
                memory: "8Gi"
              limits:
                cpu: "4"
                memory: "8Gi"
        containers:
          - name: mongodb-enterprise-database
            resources:
              requests:
                cpu: "2"
                memory: "8Gi"
              limits:
                cpu: "4"
                memory: "8Gi"
    persistence:
      single:
        storage: 5Gi
        storageClass: managed-premium
  mongosPodSpec:
    podTemplate:
      spec:
        initContainers:
          - name: mongodb-enterprise-init-database
            resources:
              requests:
                cpu: "2"
                memory: "10Gi"
              limits:
                cpu: "4"
                memory: "10Gi"
        containers:
          - name: mongodb-enterprise-database
            resources:
              requests:
                cpu: "2"
                memory: "10Gi"
              limits:
                cpu: "4"
                memory: "10Gi"

  security:
    tls:
      enabled: true
      ca: custom-ca