Finisky Garden

NLP, Software Engineering, Product Design


Recently we found that traffic is not balanced across the shards of our MongoDB cluster. After investigation, the root cause turned out to be that data is not evenly distributed across the shards (chunk balancing != data balancing != traffic balancing). The data distribution looks like this:

Shard     Data Size
mongo-0   10.55 GB
mongo-1   25.76 GB
mongo-2   10.04 GB

Why is the data size of mongo-1 significantly larger than the others while the chunk counts of the 3 shards are almost the same? To answer this, we need to analyze the chunk size distribution across these shards.
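
To make the skew concrete, here is a minimal Python sketch that computes each shard's share of the total data from the sizes in the table above. The function name `data_share` is only illustrative; with perfectly balanced data, each of the 3 shards would hold about one third.

```python
# Data sizes in GB, copied from the table above.
data_size_gb = {"mongo-0": 10.55, "mongo-1": 25.76, "mongo-2": 10.04}


def data_share(sizes):
    """Return each shard's fraction of the total data size."""
    total = sum(sizes.values())
    return {shard: size / total for shard, size in sizes.items()}


shares = data_share(data_size_gb)
for shard, share in shares.items():
    # An even split across 3 shards would be roughly 33.3% each.
    print(f"{shard}: {share:.1%} of the data")
```

Under these numbers mongo-1 holds more than half of the data, even though its chunk count is close to that of the other shards.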

After adding new shards to our production MongoDB cluster (v4.4.6-ent with 5 shards, 3 replicas for each shard), we found that the balancer was not working. sh.status() displayed many chunk migration errors:

...
  balancer:
        Currently enabled:  yes
        Currently running:  no
        Failed balancer rounds in last 5 attempts:  0
        Migration Results for the last 24 hours:
                7 : Failed with error 'aborted', from mongo-1 to mongo-3
                7208 : Failed with error 'aborted', from mongo-1 to mongo-4
  databases:
        {  "_id" : "X",  "primary" : "mongo-1",  "partitioned" : true,  "version" : {  "uuid" : UUID("xxx"),  "lastMod" : 1 } }
                X.A
                        shard key: { "Uuid" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                mongo-0       231
                                mongo-1       327
                                mongo-2       230
                                mongo-3       208
...
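
As a rough check on the chunk counts in this output, the Python sketch below compares the spread between the fullest and emptiest listed shard with the balancer's migration threshold. The threshold values (2, 4, 8) are taken from the MongoDB 4.4 balancer documentation as I understand it, so treat them as an assumption; the listing is truncated, so only the four shards shown are used.

```python
# Chunk counts for X.A as shown in the sh.status() output (listing is truncated).
chunks = {"mongo-0": 231, "mongo-1": 327, "mongo-2": 230, "mongo-3": 208}


def migration_threshold(total_chunks):
    """Assumed MongoDB 4.4 thresholds: <20 chunks -> 2, 20-79 -> 4, 80+ -> 8."""
    if total_chunks < 20:
        return 2
    if total_chunks < 80:
        return 4
    return 8


def chunk_spread(chunk_counts):
    """Difference between the shard with the most and the fewest chunks."""
    return max(chunk_counts.values()) - min(chunk_counts.values())


spread = chunk_spread(chunks)
threshold = migration_threshold(sum(chunks.values()))
needs_balancing = spread >= threshold
print(f"spread={spread}, threshold={threshold}, needs balancing: {needs_balancing}")
```

A spread far above the threshold while the balancer is enabled suggests that migrations are being attempted but are failing, which matches the 'aborted' entries in the output.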

Obviously, the chunks are unbalanced across shards (327 vs 208). Since the balancer is enabled, we tried to debug the issue through mongodb.log on the config server. There are many migration failure logs (sensitive information masked):

When a pod is in an error state (CrashLoopBackOff), Kubernetes keeps restarting it. If you try to exec into the pod to check the log or debug, the following error message appears:

unable to upgrade connection: container not found ("")

This happens because the old pod has been killed, so you cannot exec into it anymore. So how can we prevent the pod from restarting endlessly?

Just add a command to the deployment YAML to override the default command of the container image. Keep the pod running forever with sleep infinity or tail -f /dev/null:
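
A minimal sketch of such an override, written in Python so it can be generated and checked: it builds a strategic merge patch that replaces the container command with sleep infinity. The container name `my-app` and the helper name `build_keepalive_patch` are hypothetical; the same `command` field can equally be written by hand into the deployment YAML.

```python
import json


def build_keepalive_patch(container_name, command=("sleep", "infinity")):
    """Build a patch that overrides the command of one container in a Deployment."""
    return {
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        # The name must match the existing container so the
                        # patch is merged into it instead of adding a new one.
                        {"name": container_name, "command": list(command)}
                    ]
                }
            }
        }
    }


# "my-app" is a placeholder container name.
patch = build_keepalive_patch("my-app")
print(json.dumps(patch))
```

The printed JSON could then be passed to something like `kubectl patch deployment <name> --patch '<json>'`. Once the container stays up, you can exec into it and inspect the failing command by hand.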

Multiple Backup Daemons are typically run when the storage requirements or the load generated by the deployment is too much for a single daemon.

Directly scaling the statefulset ops-manager-backup-daemon to multiple instances (e.g. 3) doesn’t work. Because the mongodb-enterprise-operator is watching the statefulset, the instance count will be scaled back down to 1 by the MongoDB operator several minutes later.

So how do we scale up the backup daemons through the MongoDB Kubernetes operator?

Apex Compile Error

The environment (cuda10.0):

$ conda install pytorch==1.1.0 torchvision==0.3.0 cudatoolkit=10.0 -c pytorch

The apex repo master HEAD:

commit 0c2c6eea6556b208d1a8711197efc94899e754e1 (HEAD -> master, origin/master, origin/HEAD)
Author: Nan Zheng <80790206+nanz-nv@users.noreply.github.com>
Date:   Sat Jul 17 08:53:59 2021 +0800
...

Install apex:

git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

The compile error looks like this:

    csrc/mlp.cpp:127:54: error: expected primary-expression before ‘>’ token
           w_ptr.push_back(inputs[i + 1].data_ptr<scalar_t>());
                                                          ^

Solution

Similar discussions: https://github.com/NVIDIA/apex/issues/802 https://github.com/NVIDIA/apex/issues/1139

According to the official manual:

Change streams allow applications to access real-time data changes without the complexity and risk of tailing the oplog. Applications can use change streams to subscribe to all data changes on a single collection, a database, or an entire deployment, and immediately react to them. Because change streams use the aggregation framework, applications can also filter for specific changes or transform the notifications at will.

MongoDB change stream is a nice feature. It allows applications to access real-time data changes without the complexity and risk of tailing the oplog.

Recently, when we used a change stream to replicate data from one sharded cluster to another, it immediately made the cluster unstable (several nodes broke down, which triggered a primary change). The latency of read/write operations then increased significantly.

Observations

Observations on our production environment:

Recently I found that Google auto ads significantly slow down page loading. There are also many discussions about this. For a static website, fast loading is crucial. In this post, we will improve the PageSpeed Insights (PSI) score by delaying the loading of auto ads.

First, let’s check the current PSI score on mobile: PSI Mobile

It seems that the Reduce unused JavaScript section has many items to be improved. Check the official Google auto ads script:

A blogroll is natively supported in the NexT theme. All links are shown in the sidebar. However, as the number of links grows, the sidebar grows as well, which makes the page lengthy and distracting. Therefore, we consider creating a dedicated blogroll page.

After searching, I found that most of the existing approaches need to modify the NexT source code (the theme's swig template files). The implementation is a little bit complicated and breaks the theme’s integrity. When you update the theme later, you will need to manually merge or rebase your changes onto master.

Recently I wanted to simplify the permanent link of each post. From:

/2021/03/21/migrateopsmanager.en/

To:

/migrateopsmanager.en/

A shorter URL is more concise and readable, as the date string means nothing to users. But what about URL backward compatibility?

Change Permanent Link Format Issue

According to the official document, changing the permalink format is pretty easy: just change :year/:month/:day/:title/ to :title/. However, the real problem is that all existing incoming links become invalid after this modification. The ranking of our site would then be affected, which is unacceptable.
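
In other words, backward compatibility here means that every old dated URL must still resolve to the new short one. The Python sketch below only illustrates that mapping and is not necessarily how the site implements its redirects; the pattern assumes the old :year/:month/:day/:title/ layout, and `new_permalink` is an illustrative name.

```python
import re

# Old permalink layout: /:year/:month/:day/:title/
OLD_PERMALINK = re.compile(r"^/\d{4}/\d{2}/\d{2}/(?P<title>[^/]+)/$")


def new_permalink(old_path):
    """Map an old dated path to the new /:title/ path, or None if it does not match."""
    match = OLD_PERMALINK.match(old_path)
    if match is None:
        return None
    return f"/{match.group('title')}/"


print(new_permalink("/2021/03/21/migrateopsmanager.en/"))  # prints /migrateopsmanager.en/
```

Paths that are already in the new format return None, so a redirect layer built on this mapping would leave them untouched.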

We have a costly SQL Server database with bad performance. Specifically, some stored procedures (joining several tables on the primary key, each table with ~10M rows) took several minutes to execute. The execution plan showed that index seeks cost 90% of the total time. Finally, we found that the root cause was a very high degree of index fragmentation. Since the DBA had changed many times, we needed to analyze the database schemas, table disk usage and stored procedure dependency tables. Based on these results, we cleaned up tables and stored procedures and rebuilt the indexes to improve the DB performance. Here are the queries to accomplish these tasks.

Recently we wanted to deploy MongoDB Ops Manager and the MongoDB deployments in different data centers to improve disaster recovery. If they are deployed in the same data center and it unfortunately fails, you cannot restore the backup data to a new cluster, as both Ops Manager and the deployments are unavailable.

Of course, we don’t want to re-deploy the existing MongoDB deployments in Kubernetes. But how do we make the deployments send data to the new Ops Manager URL?

Using MongoDB in .NET is easy. However, there are two ways to manipulate documents in C# code: raw BSON documents or strongly-typed documents. In this article, we will compare the two by examples. Basically, a strongly-typed collection is preferred unless you have a strong reason to use weakly-typed documents (different types in the same collection?).

BsonDocument CRUD

The MongoDB C# Driver official documentation provides examples in this style. I guess the reason is that MongoDB is schemaless and the driver would like to demonstrate how to access documents without a schema. Actually, NoSQL doesn’t mean “no SQL” but stands for “not only SQL”. Creating a schema for a collection is still recommended because it makes it easier to access documents and use indexes.

MongoDB transactions are a nice feature. MongoDB uses optimistic concurrency control, so write conflicts are unavoidable. The situation becomes worse in a multi-document transaction that modifies many documents at once. If a write conflict happens, a MongoCommandException will be thrown:

Exception: Command update failed: Encountered error from mongodb.svc.cluster.local:27017 during a transaction :: caused by :: WriteConflict error: this operation conflicted with another operation. Please retry your operation or multi-document transaction..
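
The error message itself suggests retrying the operation. As a language-neutral illustration, here is a minimal Python sketch of a retry loop with exponential backoff and jitter. `WriteConflictError` is a hypothetical stand-in for the driver's exception, and `run_with_retry` with its parameters is an assumed design, not a driver API.

```python
import random
import time


class WriteConflictError(Exception):
    """Hypothetical stand-in for the driver's write-conflict exception."""


def run_with_retry(txn_body, max_attempts=5, base_delay=0.05):
    """Run txn_body, retrying on write conflicts with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn_body()
        except WriteConflictError:
            if attempt == max_attempts:
                raise  # Give up and surface the conflict to the caller.
            # Random jitter keeps competing writers from retrying in lockstep.
            time.sleep(base_delay * (2 ** (attempt - 1)) * random.random())


# Demo: a fake transaction that conflicts twice and then succeeds.
calls = {"n": 0}


def flaky_transaction():
    calls["n"] += 1
    if calls["n"] < 3:
        raise WriteConflictError("simulated write conflict")
    return "committed"


result = run_with_retry(flaky_transaction, base_delay=0.001)
print(result, "after", calls["n"], "attempts")
```

The whole transaction body is retried, not just the failing statement, because a conflict aborts the transaction as a unit.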

Today I tried to import an existing MongoDB deployment (outside the Kubernetes cluster) into MongoDB Ops Manager, which is running in Kubernetes. After installing the MongoDB Agent on the deployment, only the automation functionality works, while monitoring and backup do not. The root cause is that the agent still tries to post data to Ops Manager’s internal endpoint.

The latest MongoDB Agent consists of a single binary that contains all three functions: Automation, Monitoring, and Backup. So theoretically, installing and configuring the all-in-one automation agent on the deployment VM is enough.

I have a sharded cluster (2 shards, each with 3 mongods; 3 config servers; 2 mongoses) which is deployed by MongoDB Ops Manager.

Last week, the status of one of the shard hosts was shown as a grey diamond (Hover: “Last Ping: Never”). Besides, on the Ops Manager’s server page, one server had two processes (e.g. sharddb-0 and sharddb-config). However, the cluster still worked well, and we could list the host sharddb-0-0 (shard 0, replica 0) in the mongo shell with sh.status() and rs.status(). What’s wrong with the cluster?

When I execute MongoDB transactions in parallel, I encounter lots of MongoCommandException: code 251, codename NoSuchTransaction:

Command find failed: cannot continue txnId 4 for session 38604515-2584-45a5-a17a-5eb5d34ea6c4 - = with txnId 5.
Command find failed: cannot continue txnId 4 for session 38604515-2584-45a5-a17a-5eb5d34ea6c4 - = with txnId 6.
Command insert failed: cannot continue txnId 31 for session 3ed7ea61-eae1-440f-8d95-b6e066b35b69 - = with txnId 34.

Problem Analysis

I performed some tests to pinpoint the issue:

Lightsquid is a handy log analyzer for the squid proxy. However, the project has not been maintained since 2009.

Today, I found lightsquid doesn’t work in 2021. Why?

Lightsquid Error

When I run lightparser, the output looks like this:

$ ./lightparser.pl access.log.66666
>>> use file :: /var/log/squid/access.log.66666
run TIME: 1 sec
LightSquid parser statistic report

                 3312 lines processed (average 3312.00 lines per second)
                    0 lines parsed
                    0 lines recovered
                    0 lines notrecovered
                    0 lines skiped by bad year
                 3312 lines skiped by date filter
                    0 lines skiped by Denied filter
                    0 lines skiped by skipURL filter

WARNING !!!!, parsed 0 lines from total : 3312
please check confiuration !!!!
may be wrong log format selected ?

It seems that all lines are skipped by the date filter. I double-checked the log format and made sure it is correct and the same as on previous days.

A sharded cluster is the most complicated MongoDB architecture, and deploying one in Kubernetes is relatively hard. In this post, we will go through the deployment process with MongoDB Ops Manager.

Before starting, please go through Create a UserDB ReplicaSet first.

A MongoDB sharded cluster consists of the following components:

  • shard : Each shard contains a subset of the sharded data. Each shard can be deployed as a replica set.
  • mongos : The mongos acts as a query router, providing an interface between client applications and the sharded cluster.
  • config servers : Config servers store metadata and configuration settings for the cluster.

In this post, we are going to create a sharded cluster with 2 shards (each a 3-instance replica set), 2 mongos and 3 config servers.
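
As a quick sanity check on what this topology implies, the Python sketch below counts the processes the cluster will run. The class and field names are purely illustrative and are not Ops Manager or operator settings.

```python
from dataclasses import dataclass


@dataclass
class ShardedClusterPlan:
    """Illustrative description of the planned topology (names are hypothetical)."""
    shards: int
    members_per_shard: int
    mongos: int
    config_servers: int

    def mongod_count(self):
        # Shard replica set members plus config server members.
        return self.shards * self.members_per_shard + self.config_servers

    def process_count(self):
        # All mongod processes plus the mongos routers.
        return self.mongod_count() + self.mongos


plan = ShardedClusterPlan(shards=2, members_per_shard=3, mongos=2, config_servers=3)
print(f"mongod: {plan.mongod_count()}, total processes: {plan.process_count()}")
```

This is useful mainly for estimating how many pods and persistent volumes the Kubernetes cluster needs to accommodate.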

This is part 5. We will use the generated certificates to enable TLS and auth for the user database.

MongoDB Ops Manager Series:

  1. Install MongoDB Ops Manager
  2. Create a UserDB ReplicaSet
  3. Expose UserDB to Public
  4. Openssl Generates Self-signed Certificates
  5. Enable UserDB TLS and Auth

Understanding Different Secure Connections

Before we start, look at the following three TLS connections: