Don't Use a Load Balancer in Front of Mongos

When I execute MongoDB transactions in parallel against a sharded cluster, I get many MongoCommandException errors with code 251 (codename NoSuchTransaction):

Command find failed: cannot continue txnId 4 for session 38604515-2584-45a5-a17a-5eb5d34ea6c4 - = with txnId 5.
Command find failed: cannot continue txnId 4 for session 38604515-2584-45a5-a17a-5eb5d34ea6c4 - = with txnId 6.
Command insert failed: cannot continue txnId 31 for session 3ed7ea61-eae1-440f-8d95-b6e066b35b69 - = with txnId 34.

Problem Analysis

I performed some tests to pinpoint the issue:

  • Executing the transactions on a single thread: no error
  • Using a dedicated MongoClient for each thread (each transaction): no error

At first this looked like a MongoDB driver bug related to sharing one MongoClient across threads, so I filed an issue: Multi-thread Transaction Failure for Sharded Cluster.

However, the real culprit is the load balancer in front of mongos. In a sharded cluster, all operations of a transaction must be executed through the same mongos instance. If you put a load balancer in front of the mongos instances, the operations inside one transaction may be routed to different mongos instances, and the errors above occur.
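
To make the failure mode concrete, here is a toy model of it. It is only a sketch under simplifying assumptions, not how mongos is actually implemented: ToyMongos, ToyRoundRobinBalancer and LoadBalancerDemo are hypothetical names, each toy router only remembers transactions it started itself, and the toy balancer switches router on every command. A real TCP load balancer balances per connection, so presumably the errors only show up once parallel work makes the driver open several pooled connections that land on different mongos instances.

```csharp
using System;
using System.Collections.Generic;

// Toy stand-in for a mongos router: it only knows transactions it started itself.
public sealed class ToyMongos
{
    private readonly HashSet<(string Session, long Txn)> _open = new HashSet<(string Session, long Txn)>();

    // Returns true if the command is accepted, false for a simulated NoSuchTransaction.
    public bool Execute(string session, long txn, bool startsTransaction)
    {
        if (startsTransaction)
        {
            _open.Add((session, txn));
            return true;
        }
        return _open.Contains((session, txn));
    }
}

// Toy load balancer that forwards each command to the next router in turn.
public sealed class ToyRoundRobinBalancer
{
    private readonly IReadOnlyList<ToyMongos> _routers;
    private int _next;

    public ToyRoundRobinBalancer(IReadOnlyList<ToyMongos> routers)
    {
        _routers = routers;
    }

    public ToyMongos Pick()
    {
        var router = _routers[_next];
        _next = (_next + 1) % _routers.Count;
        return router;
    }
}

public static class LoadBalancerDemo
{
    // Sends a two-command transaction through the balancer and reports
    // whether each command was accepted.
    public static bool[] RunTwoCommandTransaction(ToyRoundRobinBalancer balancer, string session, long txn)
    {
        bool first = balancer.Pick().Execute(session, txn, startsTransaction: true);
        bool second = balancer.Pick().Execute(session, txn, startsTransaction: false);
        return new[] { first, second };
    }
}
```

With two toy routers behind the balancer, RunTwoCommandTransaction accepts the first command and rejects the second, because the second command reaches a router that never saw the transaction start. With a single router, both commands are accepted.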

Refer to: Transactions issue on sharded cluster

The "Mongos Pinning" section of the driver transactions specification states:

Drivers MUST send all commands for a single transaction to the same mongos (excluding retries of commitTransaction and abortTransaction).

After the driver selects a mongos for the first command within a transaction, the driver MUST pin the ClientSession to the selected mongos. Drivers MUST send all subsequent commands that are part of the same transaction (excluding certain retries of commitTransaction and abortTransaction) to the same mongos.
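
The pinning rule quoted above can be sketched roughly as follows. This is a hypothetical illustration, not the driver's real implementation: ToyPinnedSession and its members are invented names, and server selection is reduced to a random pick. The point is that pinning only helps when the driver sees the individual mongos addresses; behind a load balancer it sees a single address and has nothing to pin to.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of mongos pinning inside a client session.
public sealed class ToyPinnedSession
{
    private readonly IReadOnlyList<string> _mongosAddresses;
    private readonly Random _random;
    private int _pinnedIndex = -1; // -1 means "not pinned yet"

    public ToyPinnedSession(IReadOnlyList<string> mongosAddresses, int seed)
    {
        if (mongosAddresses.Count == 0)
            throw new ArgumentException("At least one mongos address is required.", nameof(mongosAddresses));
        _mongosAddresses = mongosAddresses;
        _random = new Random(seed);
    }

    // A new transaction clears the previous pin.
    public void StartTransaction()
    {
        _pinnedIndex = -1;
    }

    // The first command of a transaction selects a mongos and pins it;
    // every later command in the same transaction reuses the pinned one.
    public string SelectServer()
    {
        if (_pinnedIndex < 0)
        {
            _pinnedIndex = _random.Next(_mongosAddresses.Count);
        }
        return _mongosAddresses[_pinnedIndex];
    }
}
```

After StartTransaction, repeated calls to SelectServer return the same address until the next transaction starts.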

Expose Each Mongos as a Dedicated Service

So we need to expose every mongos to the public separately. I finally understand why the official documentation uses NodePort instead of a single load-balanced Kubernetes Service to expose mongos externally.

Here is a sample mongos-service.yaml that creates a dedicated service for each mongos instance (two mongos instances in this example). The trick is to explicitly specify the pod name in spec.selector:

apiVersion: v1
kind: Service
metadata:
  name: mongos-svc-0
  namespace: mongodb
  labels:
    app: sharddb-svc
    controller: mongodb-enterprise-operator
spec:
  type: LoadBalancer
  ports:
    - protocol: TCP
      port: 27017
      targetPort: 27017
  selector:
    app: sharddb-svc
    controller: mongodb-enterprise-operator
    statefulset.kubernetes.io/pod-name: sharddb-mongos-0

---
apiVersion: v1
kind: Service
metadata:
  name: mongos-svc-1
  namespace: mongodb
  labels:
    app: sharddb-svc
    controller: mongodb-enterprise-operator
spec:
  type: LoadBalancer
  ports:
    - protocol: TCP
      port: 27017
      targetPort: 27017
  selector:
    app: sharddb-svc
    controller: mongodb-enterprise-operator
    statefulset.kubernetes.io/pod-name: sharddb-mongos-1

Refer to: Deploy MongoDB Sharded Cluster by Ops Manager
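
With one service per mongos, the client can list every mongos in its connection string, so the driver can pin each transaction to one of them itself. Below is a minimal sketch of building such a connection string. The helper name MongosConnectionString.Build and any hostnames you pass in are assumptions for illustration; the actual external addresses depend on what your LoadBalancer services are assigned.

```csharp
using System;
using System.Collections.Generic;

public static class MongosConnectionString
{
    // Builds a standard multi-host connection string of the form
    // mongodb://host1:port,host2:port/
    public static string Build(IEnumerable<string> hosts, int port = 27017)
    {
        var seeds = new List<string>();
        foreach (var host in hosts)
        {
            seeds.Add(host + ":" + port);
        }
        if (seeds.Count == 0)
        {
            throw new ArgumentException("At least one mongos host is required.", nameof(hosts));
        }
        return "mongodb://" + string.Join(",", seeds) + "/";
    }
}
```

The resulting string can be passed to MongoClientSettings.FromConnectionString, as in the reproduction code in the next section, in place of the single load balancer address.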

Code to Reproduce the Error

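using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

public class TransactionTest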
{
    private const string DatabaseName = "Test";
    private const string CollectionName = "Test";
    // Fill in the connection string of the cluster under test (here, the load balancer address).
    public const string ConnectionString = "";
    public MongoClient GetMongoClient(int timeout = 5)
    {
        var clientSettings = MongoClientSettings.FromConnectionString(ConnectionString);
        clientSettings.ConnectTimeout = TimeSpan.FromSeconds(5);
        clientSettings.ServerSelectionTimeout = TimeSpan.FromSeconds(timeout);
        clientSettings.AllowInsecureTls = true;
        var mongoClient = new MongoClient(clientSettings);
        return mongoClient;
    }
 
    public async Task TestTransactionAsync()
    {
        var client = GetMongoClient();
        var tasks = new List<Task>();
        for (int i = 0; i < 5; ++i)
        {
            // Workaround: replace the shared client with a dedicated one per task; the errors disappear.
            //var client = GetMongoClient(i + 5);
            tasks.Add(DoAsync(client));
        }
        await Task.WhenAll(tasks);
    }
 
    private async Task DoAsync(IMongoClient mongoClient)
    {
        Console.WriteLine("Client hashcode: " + mongoClient.GetHashCode());
        var collection = mongoClient.GetDatabase(DatabaseName).GetCollection<BsonDocument>(CollectionName);
 
        while (true)
        {
            var uuid1 = Guid.NewGuid().ToString("N").Substring(24);
            var uuid2 = Guid.NewGuid().ToString("N").Substring(24);
            try
            {
                using (var session = await mongoClient.StartSessionAsync())
                {
                    session.StartTransaction();
                    await collection.InsertOneAsync(session, new BsonDocument("Uuid", uuid1));
                    await collection.InsertOneAsync(session, new BsonDocument("Uuid", uuid2));
 
                    await session.CommitTransactionAsync();
                }
                Console.WriteLine($"[{uuid1}] [{uuid2}]");
            }
            catch (Exception e)
            {
                Console.WriteLine("$$$ " + e.Message);
            }
        }
    }
}