Recently we found the traffic is not balanced across the MongoDB cluster shards. After investigation, the root cause is that data on each shard is not evenly distributed (Chunk balancing != data balancing != traffic balancing). The data distribution looks like this:

Shard Data Size
mongo-0 10.55 GB
mongo-1 25.76 GB
mongo-2 10.04 GB

Why the data size of mongo-1 is significantly large than others while the chunk number among 3 shards is almost the same? Then we need to analysis the chunk size distribution across these shards.

Get the chunk size is not that straightforward. No API is provided to directly give the chunk size. Although the basic chunk information is stored at config.chunks collection, there is no data size field. Consider a collection cxx whose shard key is Uuid, the chunk information looks like this (for one shard):

{
"_id": {
"$oid": "603348a8b74404eea898cc25" }, "lastmod": { "$timestamp": {
"t": 0,
"i": 773
}
},
"lastmodEpoch": {
"$oid": "5ff83e4ba85a6bd465831542" }, "ns": "cxx", "min": { "Uuid": "044B96CA34334FBE860D47BD63339B8E" }, "max": { "Uuid": "0494E87F8ED34B0E9ECF6BE5363BBD3C" }, "shard": "mongo-2", "history": [{ "validAfter": { "$timestamp": {
"t": 6261,
"i": 1632930119
}
},
"shard": "mongo-2"
}
]
}

The good news is that the chunk information contains the min/max key of the chunk. So we can get the chunk size by the dataSize API:

{
dataSize: <string>,
keyPattern: <document>,
min: <document>,
max: <document>,
estimate: <boolean>
}

The intuition is to get the data size of all the documents in the chunk.

For the above example, use the following command in mongo shell to get the chunk size, remember using maxTimeMS to limit max command execution timeout:

db.runCommand({ dataSize: "cxx", keyPattern: { "Uuid": 1 }, min: { "Uuid": "044B96CA34334FBE860D47BD63339B8E" }, max: { "Uuid": "0494E87F8ED34B0E9ECF6BE5363BBD3C" }, maxTimeMS:1000 })

To analyze the chunk size distribution, we can iterate each chunk and get its size by the above command. Put it together, the final script (modify ns and the Uuid field according to your scenario):

var ns = "";
db.getSiblingDB("config").chunks.find({ns : ns}).forEach(function(chunk) {
chunkSize = db.runCommand({ dataSize: ns, keyPattern: { "Uuid": 1 }, min: { "Uuid": chunk.min.Uuid }, max: { "Uuid": chunk.max.Uuid }, maxTimeMS:1000 })
print(chunk.shard + "  " + chunk.min.Uuid + ": " + tojson(chunkSize.size/1024/1024) + "MB")
})

Sample output:

mongo-0  00020DFCE1B64BE28098133E09F0DFF9: 27.868586540222168MB
mongo-0  020FC0A3F7DC456D8CA48787A22BFEF5: 0.03055095672607422MB
mongo-1  02109DCC630C4EB68D890BB1C664C732: 2.049802780151367MB
mongo-1  02394113C1FF4C33BB067299D00F2A57: 0.17893218994140625MB
mongo-0  023CB90536CB4D9A83E88BC362CDA80A: 5.774362564086914MB
...