MongoDB Transaction BulkWrite Endless Retry

Previously we talked about How to Retry MongoDB Transaction. However, if you use BulkWrite() and one of its operations fails with an error that is treated as retryable (e.g. a duplicate key error), the new transactions API retries the bulk write endlessly, which can drive server CPU usage to 100%. (Observed with MongoDB Server v4.4.6-ent and MongoDB Driver v2.12.2.)

To avoid this issue, we have three suggestions:

  • Add a cancellation token to limit the max retry time
  • Abort the transaction after a max retry count
  • Set BulkWriteOptions { IsOrdered = true }

The first two suggestions also apply to transactions that do not use BulkWrite().

Limit Max Transaction Execution Time

First, let's refine RetryReplaceAsync() by adding a CancellationToken, which limits the max execution time of the whole transaction, retries included:

private async Task RetryReplaceAsync(IMongoClient mongoClient, string uuid, string value)
{
    var collection = mongoClient.GetDatabase(DatabaseName).GetCollection<BsonDocument>(CollectionName);

    using (var session = await mongoClient.StartSessionAsync())
    // The token source cancels itself after 1 second; this is the time budget for the transaction.
    using (var cts = new CancellationTokenSource(TimeSpan.FromSeconds(1)))
    {
        var filter = Builders<BsonDocument>.Filter.Eq("Uuid", uuid);

        await session.WithTransactionAsync(
            async (s, ct) =>
            {
                // Pass the callback's token (ct) to every operation so cancellation reaches the server call.
                await collection.ReplaceOneAsync(s, filter, new BsonDocument { { "Uuid", uuid }, { "op", value } }, new ReplaceOptions { IsUpsert = true }, ct);
                return string.Empty;
            },
            cancellationToken: cts.Token);
    }
}
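
To see why a time budget is enough to stop the loop, here is a minimal, driver-free sketch. It assumes a simplified model of a callback-style retry loop: TimeBoundedRetry, TransientException, and RunAsync are hypothetical names made up for illustration, not part of the MongoDB driver, whose real retry logic is more involved.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TimeBoundedRetry
{
    // Hypothetical stand-in for a failure that the retry loop treats as retryable.
    public sealed class TransientException : Exception { }

    // Simplified model (an assumption, not driver code): rerun the callback after
    // every TransientException, and stop only on success or cancellation.
    // Returns the number of attempts it took to succeed.
    public static async Task<int> RunAsync(
        Func<CancellationToken, Task> callback,
        CancellationToken token)
    {
        var attempts = 0;
        while (true)
        {
            // Without this check (and a token that eventually fires),
            // a callback that always fails would be retried forever.
            token.ThrowIfCancellationRequested();
            attempts++;
            try
            {
                await callback(token);
                return attempts;
            }
            catch (TransientException)
            {
                // Retryable failure: loop around and try again.
            }
        }
    }
}
```

If the callback always throws TransientException and the token comes from new CancellationTokenSource(TimeSpan.FromSeconds(1)), RunAsync ends with an OperationCanceledException once the budget is spent instead of looping forever.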

Limit Max Transaction Retry Count

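Next, count the attempts inside the callback and throw once the limit is reached. The exception is not a MongoDB error and carries no retryable error label, so it ends the retry loop and is rethrown to the caller: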

private async Task RetryReplaceAsync(IMongoClient mongoClient, string uuid, string value)
{
    var collection = mongoClient.GetDatabase(DatabaseName).GetCollection<BsonDocument>(CollectionName);

    // Declared outside the callback so the value survives across retries.
    int count = 0;
    using (var session = await mongoClient.StartSessionAsync())
    using (var cts = new CancellationTokenSource(TimeSpan.FromSeconds(1)))
    {
        var filter = Builders<BsonDocument>.Filter.Eq("Uuid", uuid);

        await session.WithTransactionAsync(
            async (s, ct) =>
            {
                // The third invocation throws, so the write itself is attempted at most twice.
                if (++count >= 3)
                {
                    throw new ApplicationException("Reached max retry times");
                }
                await collection.ReplaceOneAsync(s, filter, new BsonDocument { { "Uuid", uuid }, { "op", value } }, new ReplaceOptions { IsUpsert = true }, ct);
                return string.Empty;
            },
            cancellationToken: cts.Token);
    }
}
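
If several transactions need the same guard, the counter can be moved into a small wrapper. The following is only a sketch: AttemptLimiter and Limit are hypothetical names, not driver APIs, and the session type is left generic so the block compiles without the driver. Unlike the inline check above, this version lets the callback run exactly maxAttempts times before it throws.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class AttemptLimiter
{
    // Wraps a transaction callback so that it runs at most maxAttempts times.
    // TSession is generic for illustration; with the driver it would be IClientSessionHandle.
    public static Func<TSession, CancellationToken, Task<TResult>> Limit<TSession, TResult>(
        Func<TSession, CancellationToken, Task<TResult>> callback,
        int maxAttempts)
    {
        if (callback == null) throw new ArgumentNullException(nameof(callback));
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        // Captured by the lambda below, so the count persists across retries.
        var attempts = 0;

        return async (session, ct) =>
        {
            if (++attempts > maxAttempts)
            {
                // Not a MongoDB exception, so a retry loop that only retries
                // labelled driver errors should give up here.
                throw new InvalidOperationException($"Gave up after {maxAttempts} attempts.");
            }
            return await callback(session, ct);
        };
    }
}
```

With the driver, the wrapped delegate would be passed in place of the original lambda, for example session.WithTransactionAsync(AttemptLimiter.Limit<IClientSessionHandle, string>(callback, 3), cancellationToken: cts.Token). Each wrapper instance holds its own counter, so create a new one per transaction.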

Execute BulkWrite Operations In Order

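A bulk write can be executed ordered or unordered. Unordered execution looks attractive for performance, because the server does not have to stop at the first failing operation. However, unordered execution can mask the exception raised by a single operation (such as the duplicate key error above), so the transaction is retried endlessly instead of failing.

As a result, we suggest executing the operations in order. IsOrdered defaults to true in BulkWriteOptions, so the point is not to switch it to false inside a transaction; setting it explicitly makes the intent visible. The code looks like this: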

await collection.BulkWriteAsync(session, listWrites, new BulkWriteOptions { IsOrdered = true }, cancellationToken: ct);
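
For context, here is a fuller sketch of how that call might sit inside a transaction. It assumes the MongoDB.Driver package and the same "Uuid"/"op" document shape used earlier; OrderedBulkWriteSketch, BuildWrites, and WriteAsync are hypothetical names for illustration, and the upsert-per-uuid write list is just one possible content for listWrites.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

public static class OrderedBulkWriteSketch
{
    // Builds one upsert per uuid (an assumed workload, chosen to match the earlier examples).
    public static List<WriteModel<BsonDocument>> BuildWrites(IEnumerable<string> uuids, string value)
    {
        var writes = new List<WriteModel<BsonDocument>>();
        foreach (var uuid in uuids)
        {
            var filter = Builders<BsonDocument>.Filter.Eq("Uuid", uuid);
            var document = new BsonDocument { { "Uuid", uuid }, { "op", value } };
            writes.Add(new ReplaceOneModel<BsonDocument>(filter, document) { IsUpsert = true });
        }
        return writes;
    }

    // Runs the writes in order inside a transaction; the token bounds the total time.
    public static async Task<int> WriteAsync(
        IMongoClient client,
        IMongoCollection<BsonDocument> collection,
        List<WriteModel<BsonDocument>> listWrites,
        CancellationToken token)
    {
        using (var session = await client.StartSessionAsync(cancellationToken: token))
        {
            return await session.WithTransactionAsync(
                async (s, ct) =>
                {
                    // Ordered execution stops at the first failing operation.
                    var result = await collection.BulkWriteAsync(
                        s, listWrites, new BulkWriteOptions { IsOrdered = true }, ct);
                    return result.RequestCount;
                },
                cancellationToken: token);
        }
    }
}
```

The token passed to WriteAsync would typically come from a CancellationTokenSource with a timeout, as in the first section, so the ordered bulk write is still bounded in time.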

References

https://developer.mongodb.com/community/forums/t/handling-duplicated-key-error-in-bulk-insert-retry-scenarios/2869

https://stackoverflow.com/questions/61244296/how-to-handle-duplicate-error-in-mongo-bulkinsert-retry