MongoDB Cluster In Kubernetes (3): Expose UserDB to Public
This is part 3, in which we expose the user database pods to the public so that a Mongo client can access them.
MongoDB Ops Manager Series:
- Install MongoDB Ops Manager
- Create a UserDB ReplicaSet
- Expose UserDB to Public
- Openssl Generates Self-signed Certificates
- Enable UserDB TLS and Auth
So far, the user database can be accessed only inside the Kubernetes cluster. The official blog's approach is to expose the pods via NodePort: # Connect to a MongoDB Database Resource from Outside Kubernetes
I don't know why the official blog recommends NodePort; a LoadBalancer is a better way to expose MongoDB. We will use LoadBalancer services to expose the userdb pods.
Create LoadBalancer Service
Configure userdb0service.yaml:
apiVersion: v1
kind: Service
metadata:
  name: userdb-0-svc-ext
  namespace: mongodb
  labels:
    app: userdb-svc
    controller: mongodb-enterprise-operator
    pod-anti-affinity: userdb
    statefulset.kubernetes.io/pod-name: userdb-0
spec:
  type: LoadBalancer
  ports:
    - protocol: TCP
      port: 27017
      targetPort: 27017
  selector:
    app: userdb-svc
    controller: mongodb-enterprise-operator
    pod-anti-affinity: userdb
    statefulset.kubernetes.io/pod-name: userdb-0
Apply it:
$ kubectl apply -f userdb0service.yaml
service/userdb-0-svc-ext created
After a while, the service userdb-0-svc-ext will be assigned a public IP. We can then bind a domain name to the IP (optional). Suppose the domain name is userdb0.com.
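You can check whether the public IP has been assigned by listing the service; the EXTERNAL-IP column shows <pending> until the cloud provider allocates one. The output below is only illustrative, and the IPs and node port are placeholders:
$ kubectl get svc -n mongodb userdb-0-svc-ext
NAME               TYPE           CLUSTER-IP    EXTERNAL-IP   PORT(S)           AGE
userdb-0-svc-ext   LoadBalancer   10.0.27.190   <public-ip>   27017:30017/TCP   2m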
Modify the above yaml and create two more services, userdb-1-svc-ext and userdb-2-svc-ext, to expose the remaining two pods; only the service name and the pod name change, as sketched below.
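For example, userdb-1-svc-ext differs from the first service only in its name and in the statefulset.kubernetes.io/pod-name label and selector (a minimal sketch, assuming the same labels as above; userdb-2-svc-ext is analogous with userdb-2):
apiVersion: v1
kind: Service
metadata:
  name: userdb-1-svc-ext
  namespace: mongodb
  labels:
    app: userdb-svc
    controller: mongodb-enterprise-operator
    pod-anti-affinity: userdb
    statefulset.kubernetes.io/pod-name: userdb-1
spec:
  type: LoadBalancer
  ports:
    - protocol: TCP
      port: 27017
      targetPort: 27017
  selector:
    app: userdb-svc
    controller: mongodb-enterprise-operator
    pod-anti-affinity: userdb
    statefulset.kubernetes.io/pod-name: userdb-1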
MongoClient Connection Test: getaddrinfo ENOTFOUND Error
Now we can use the mongo shell to access userdb:
$ mongo "mongodb://<ip0>:27017,<ip1>:27017,<ip2>:27017/"
MongoDB shell version v4.4.2
connecting to: mongodb://<ip0>:27017,<ip1>:27017,<ip2>:27017/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("6b780874-6ecb-4510-b7e9-61bff04a4711") }
MongoDB server version: 4.2.2
WARNING: shell and server versions do not match
---
The server generated these startup warnings when booting:
2020-12-10T08:32:10.037+0000 I STORAGE [initandlisten]
2020-12-10T08:32:10.037+0000 I STORAGE [initandlisten] ** WARNING: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine
2020-12-10T08:32:10.037+0000 I STORAGE [initandlisten] ** See http://dochub.mongodb.org/core/prodnotes-filesystem
2020-12-10T08:32:10.848+0000 I CONTROL [initandlisten]
2020-12-10T08:32:10.848+0000 I CONTROL [initandlisten] ** WARNING: Access control is not enabled for the database.
2020-12-10T08:32:10.848+0000 I CONTROL [initandlisten] ** Read and write access to data and configuration is unrestricted.
2020-12-10T08:32:10.848+0000 I CONTROL [initandlisten]
---
MongoDB Enterprise userdb:SECONDARY>
Something looks wrong. Why does the prompt show that we are connected to a SECONDARY node? The default mongo readPreference is primary.
If you connect to the DB through MongoDB Compass, the following error appears:
getaddrinfo ENOTFOUND userdb-0.userdb-svc.mongodb.svc.cluster.local
After investigation, this article # Connecting from external sources to MongoDB replica set in Kubernetes fails with getaddrinfo ENOTFOUND error but standalone works answered my question:
When connecting to a replica set, the host:port pairs in the connection string are a seedlist.
The driver/client will attempt to connect to each host in the seedlist in turn until it gets a connection.
It runs the isMaster command to determine which node is primary, and to get a list of all replica set members.
Then it drops the original seedlist connection, and attempts to connect to each replica set member using the host and port information retrieved.
The host information returned by the isMaster usually matches the entry in rs.conf(), which are the hostnames used to initiate the replica set.
In your Kubernetes cluster, the nodes have internal hostnames that are used to initiate the replica set, but that your external clients can't resolve.
In order to get this to work, you will need to have the mongod nodes isMaster command return a different set of hostnames depending on where the client request is coming from. This is similar to split-horizon DNS.
Look over the Deploy a Replica Set documentation for mongodb/kubernetes, and the replicaSetHorizons setting.
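You can see exactly what the driver sees by running isMaster against one of the exposed nodes: the hosts it returns are the internal cluster hostnames, not the public IPs. The output below is abridged and illustrative; exact fields vary by server version:
MongoDB Enterprise userdb:SECONDARY> db.isMaster().hosts
[
    "userdb-0.userdb-svc.mongodb.svc.cluster.local:27017",
    "userdb-1.userdb-svc.mongodb.svc.cluster.local:27017",
    "userdb-2.userdb-svc.mongodb.svc.cluster.local:27017"
]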
Verify by running rs.conf() in the mongo shell:
MongoDB Enterprise userdb:SECONDARY> rs.conf()
{
    "_id" : "userdb",
    "version" : 1,
    "protocolVersion" : NumberLong(1),
    "writeConcernMajorityJournalDefault" : true,
    "members" : [
        {
            "_id" : 0,
            "host" : "userdb-0.userdb-svc.mongodb.svc.cluster.local:27017",
            ...
        },
        {
            "_id" : 1,
            "host" : "userdb-1.userdb-svc.mongodb.svc.cluster.local:27017",
            ...
        },
        {
            "_id" : 2,
            "host" : "userdb-2.userdb-svc.mongodb.svc.cluster.local:27017",
            ...
        }
    ],
    "settings" : {
        ...
    }
}
Besides, the mongo shell and MongoDB Compass behave differently when connecting to multiple hosts:
- mongo shell: reports a successful connection, but this can be misleading: only one node is actually connected
- MongoDB Compass: the connection fails
The difference is caused by their different connection policies.
To solve the problem, we need to use spec.connectivity.replicaSetHorizons to specify the public addresses. However, to use this setting, spec.security.tls must be enabled first:
security:
  tls:
    enabled: true
connectivity:
  replicaSetHorizons:
    - "userdb": "userdb0.com:27017"
    - "userdb": "userdb1.com:27017"
    - "userdb": "userdb2.com:27017"
The bad news is that enabling TLS is more complicated than I thought. Here we use self-signed certificates to encrypt the transport-layer communication. Four certificates are required: 1 CA certificate and 3 server certificates.
Let's create the certificates step by step: Openssl Generates Self-signed Certificates