Migrate Ops Manager to a new Kubernetes Cluster
Recently we wanted to deploy MongoDB Ops Manager and the MongoDB deployments in different data centers to improve disaster recovery. If they are deployed in the same data center and that data center fails, you cannot restore the backup data to a new cluster, because both Ops Manager and the deployments are unavailable.
Of course, we don't want to re-deploy the existing MongoDB deployments in Kubernetes. But how do we make the deployments send data to the new Ops Manager URL?
How Ops Manager Works
Previously I thought Ops Manager and the deployments were tightly coupled, which turned out to be wrong. They are actually independent of each other and can be deployed independently.
Here we deploy both of them in Kubernetes clusters. However, it's also fine to deploy one in a VM and the other in Kubernetes. MongoDB deployments use the MongoDB Agent to send data to Ops Manager, and Ops Manager uses a groupId and an apiKey to verify the deployment's identity and keep track of its status.
Therefore, the problem becomes how to set the new Ops Manager URL, project ID, and agent key for the old MongoDB deployment.
Let's exec into a deployment pod and look at its processes:
I have no name!@shard-1-0:/$ ps aux|grep mongod
2000 51 6.1 0.0 1769520 55632 ? Sl 09:32 0:08 /mongodb-automation/files/mongodb-mms-automation-agent -mmsGroupId 603dfcb99870377752adff58 -pidfilepath /mongodb-automation/mongodb-mms-automation-agent.pid -maxLogFileDurationHrs 24 -logLevel INFO -healthCheckFilePath /var/log/mongodb-mms-automation/agent-health-status.json -useLocalMongoDbTools -mmsBaseUrl http://ops-manager-svc.mongodb.svc.cluster.local:8080 -sslRequireValidMMSServerCertificates -mmsApiKey 603dfcb99870377752adff5ef7c231f11b5xxxxx -logFile /var/log/mongodb-mms-automation/automation-agent.log
2000 57 0.0 0.0 10752 860 ? S 09:32 0:00 tail -F /var/log/mongodb-mms-automation/automation-agent-verbose.log
2000 60 0.0 0.0 10752 868 ? S 09:32 0:00 tail -F /var/log/mongodb-mms-automation/automation-agent-stderr.log
2000 67 0.0 0.0 10752 844 ? S 09:32 0:00 tail -F /var/log/mongodb-mms-automation/mongodb.log
2000 70 0.0 0.0 16532 1264 ? S 09:32 0:00 jq --unbuffered --null-input -c --raw-input inputs | {"logType": "mongodb", "contents": .}
2000 201 2.6 0.1 2086868 107356 ? Sl 09:32 0:02 /var/lib/mongodb-mms-automation/mongodb-linux-x86_64-4.4.3-ent/bin/mongod -f /data/automation-mongod.conf
How does the pod get mmsBaseUrl, mmsApiKey, and mmsGroupId? The answer is that the MongoDB Kubernetes Operator sets environment variables in the pod. If you explore the pod's YAML in the StatefulSet, it has the following section under spec.template.spec.containers:
env:
- name: AGENT_API_KEY
  valueFrom:
    secretKeyRef:
      name: 603dfcb99870377752adff58-group-secret
      key: agentApiKey
- name: AGENT_FLAGS
  value: '-logFile,/var/log/mongodb-mms-automation/automation-agent.log,'
- name: BASE_URL
  value: 'http://ops-manager-svc.mongodb.svc.cluster.local:8080'
- name: GROUP_ID
  value: 603dfcb99870377752adff58
- name: LOG_LEVEL
- name: SSL_REQUIRE_VALID_MMS_CERTIFICATES
  value: 'true'
- name: USER_LOGIN
  value: admin
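If you just want to check these values without exec'ing into a pod, you can read them from the StatefulSet and the operator-managed group secret directly. A minimal sketch, assuming the shard StatefulSet is named shard-1 in the mongodb namespace (adjust the names to match your deployment):
# show the agent-related environment variables the operator injected
$ kubectl get statefulset shard-1 -n mongodb -o jsonpath='{.spec.template.spec.containers[0].env}'
# decode the agent API key stored in the group secret
$ kubectl get secret 603dfcb99870377752adff58-group-secret -n mongodb -o jsonpath='{.data.agentApiKey}' | base64 -d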
Steps to Migrate
Make Ops Manager Capable of Managing External Deployments (in the new cluster)
The tutorial configures the Ops Manager URL as the internal endpoint http://ops-manager-svc.mongodb.svc.cluster.local:8080, which is only accessible within the same Kubernetes cluster. As a result, we must first make the new Ops Manager accessible from outside its cluster.
Refer to: Managing External MongoDB Deployments
Add the mms.centralUrl setting to spec.configuration in the new Ops Manager resource specification:
spec:
  configuration:
    mms.centralUrl: https://newopsmanager.com:8080/
Example Ops Manager YAML with the resource configuration specified:
apiVersion: mongodb.com/v1
kind: MongoDBOpsManager
metadata:
  name: ops-manager
  namespace: mongodb
spec:
  # the version of Ops Manager distro to use
  version: 4.4.6
  configuration:
    mms.centralUrl: http://newopsmanager.com:8080/
  # the name of the secret containing admin user credentials.
  adminCredentials: ops-manager-admin-secret
  externalConnectivity:
    type: LoadBalancer
  # the Replica Set backing Ops Manager.
  # appDB has the SCRAM-SHA authentication mode always enabled
  applicationDatabase:
    members: 3
    podSpec:
      cpu: "1"
      cpuRequests: "1"
      memory: "10Gi"
      memoryRequests: "10Gi"
      persistence:
        single:
          storage: 16Gi
  statefulSet:
    spec:
      template:
        spec:
          containers:
            - name: mongodb-ops-manager
              resources:
                requests:
                  cpu: "1"
                  memory: "15Gi"
                limits:
                  cpu: "1"
                  memory: "15Gi"
  backup:
    enabled: true
    headDB:
      storage: 16Gi
    statefulSet:
      spec:
        template:
          spec:
            containers:
              - name: mongodb-backup-daemon
                resources:
                  requests:
                    cpu: "1"
                    memory: "10Gi"
                  limits:
                    cpu: "1"
                    memory: "10Gi"
Edit Deployment Configuration (in old cluster)
Now let's modify the MongoDB deployments in the old Kubernetes cluster.
Warning: Do this operation when traffic is low, because the deployment will be temporarily inaccessible for several minutes. Each pod of the cluster will be restarted to apply the new settings.
Create a new configmap new-ops-manager-connection in the old cluster:
$ kubectl create configmap new-ops-manager-connection --from-literal="baseUrl=http://newopsmanager.com:8080" -n mongodb
configmap/new-ops-manager-connection created
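The operator reads the baseUrl key from this configmap. If you want to double-check it before switching the deployment over (and, depending on your operator version, optionally add keys such as projectName or orgId to control how the new project is created), inspect it with:
$ kubectl get configmap new-ops-manager-connection -n mongodb -o yaml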
Create a new Ops Manager admin key secret new-ops-manager-admin-key in the old cluster; the secret values should be the same as in the new cluster:
$ kubectl create secret generic new-ops-manager-admin-key --from-literal="publicKey=xxx" --from-literal="privateKey=xxx" -n mongodb
secret/new-ops-manager-admin-key created
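The publicKey and privateKey values are the programmatic API key of an admin user in the new Ops Manager, the same pair the operator in the new cluster uses. A quick sanity check that the secret was created with the expected keys:
# list the keys stored in the credentials secret (values stay base64 encoded)
$ kubectl get secret new-ops-manager-admin-key -n mongodb -o jsonpath='{.data}'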
Finally, modify the opsManager and credentials settings in the deployment spec from:
opsManager:
  configMapRef:
    name: ops-manager-connection
credentials: ops-manager-admin-key
to:
opsManager:
  configMapRef:
    name: new-ops-manager-connection
credentials: new-ops-manager-admin-key
Apply the YAML and wait for the cluster to reconcile back to the Running state. The new Ops Manager will automatically create a new project to manage the deployment.
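For example, assuming the sharded cluster resource shown below is saved as sharddb.yaml (both names are placeholders for your own resource):
$ kubectl apply -f sharddb.yaml -n mongodb
# mdb is the short name for the MongoDB resource; watch until the phase returns to Running
$ kubectl get mdb sharddb -n mongodb -w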
Tips:
- The project ID will change after the migration
- TLS will not be affected during the migration
- If the deployment had auth enabled, after the migration Ops Manager will show the "AUTH" status as "Disabled". Don't worry, the existing users are kept in the deployment (online traffic will not be affected). We only need to make the new Ops Manager consistent with the deployment's actual state: import the users through "Security -> MongoDB Users", enable auth from the UI, and it's done. A short interruption might happen during the auth status transition.
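Once the pods have restarted, you can confirm the agents now report to the new Ops Manager the same way we inspected the process earlier; for example (the pod name shard-1-0 is just an illustration):
# the -mmsBaseUrl flag should now point at http://newopsmanager.com:8080
$ kubectl exec -it shard-1-0 -n mongodb -- sh -c 'ps aux | grep mmsBaseUrl'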
Example sharded cluster deployment YAML with resource configuration via podTemplate:
apiVersion: mongodb.com/v1
kind: MongoDB
metadata:
  name: sharddb
  namespace: mongodb
spec:
  shardCount: 2
  mongodsPerShardCount: 3
  mongosCount: 2
  configServerCount: 3
  version: 4.4.3-ent
  setFeatureCompatibilityVersion: "4.4"
  type: ShardedCluster
  opsManager:
    configMapRef:
      name: new-ops-manager-connection
  credentials: new-ops-manager-admin-key
  exposedExternally: false
  shardPodSpec:
    podTemplate:
      spec:
        initContainers:
          - name: mongodb-enterprise-init-database
            resources:
              requests:
                cpu: "2"
                memory: "15Gi"
              limits:
                cpu: "4"
                memory: "15Gi"
        containers:
          - name: mongodb-enterprise-database
            resources:
              requests:
                cpu: "2"
                memory: "15Gi"
              limits:
                cpu: "4"
                memory: "15Gi"
    persistence:
      single:
        storage: 30Gi
        storageClass: managed-premium
  configSrvPodSpec:
    podTemplate:
      spec:
        initContainers:
          - name: mongodb-enterprise-init-database
            resources:
              requests:
                cpu: "2"
                memory: "8Gi"
              limits:
                cpu: "4"
                memory: "8Gi"
        containers:
          - name: mongodb-enterprise-database
            resources:
              requests:
                cpu: "2"
                memory: "8Gi"
              limits:
                cpu: "4"
                memory: "8Gi"
    persistence:
      single:
        storage: 5Gi
        storageClass: managed-premium
  mongosPodSpec:
    podTemplate:
      spec:
        initContainers:
          - name: mongodb-enterprise-init-database
            resources:
              requests:
                cpu: "2"
                memory: "10Gi"
              limits:
                cpu: "4"
                memory: "10Gi"
        containers:
          - name: mongodb-enterprise-database
            resources:
              requests:
                cpu: "2"
                memory: "10Gi"
              limits:
                cpu: "4"
                memory: "10Gi"
  security:
    tls:
      enabled: true
      ca: custom-ca
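After the cluster reconciles, the deployment should appear under a newly created project in the new Ops Manager. If you prefer to verify this from the command line instead of the UI, the Ops Manager public API can list the projects visible to the admin API key (replace publicKey and privateKey with the pair stored in new-ops-manager-admin-key):
$ curl --user "publicKey:privateKey" --digest "http://newopsmanager.com:8080/api/public/v1.0/groups?pretty=true"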