
When a pod is in an error state (CrashLoopBackOff), Kubernetes keeps restarting it. If you try to exec into the pod to check the logs or debug, the following error message appears:

unable to upgrade connection: container not found ("")

This happens because the old container has already been killed, so you can no longer exec into it. So how can we prevent the pod from restarting endlessly?
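One common way to keep such a pod alive long enough to inspect it is to temporarily override the container's command with a long sleep. The sketch below only builds the patch body; it is an illustration and not necessarily the approach the full post takes. The function name `build_sleep_patch` and the container name `my-app` are hypothetical, and the sketch assumes the workload is a Deployment, the image ships a `sleep` binary, and the pod spec does not set `args`.

```python
import json


def build_sleep_patch(container_name, seconds=3600):
    """Build a patch that replaces a container's command with `sleep`.

    Containers in a pod template are matched by name, so only the named
    container is changed. Assumes the pod spec defines no `args`.
    """
    return {
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        {
                            "name": container_name,
                            # Keep the container idle instead of crashing.
                            "command": ["sleep", str(seconds)],
                        }
                    ]
                }
            }
        }
    }


if __name__ == "__main__":
    # "my-app" is a placeholder container name.
    print(json.dumps(build_sleep_patch("my-app")))
```

The printed JSON could be passed to `kubectl patch deployment <name> -p '<json>'`. Once the container is sleeping, `kubectl exec` works again; revert the patch when debugging is done.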

Read more »

Multiple Backup Daemons are typically run when the storage requirements or the load generated by the deployment is too much for a single daemon.

Directly scaling the StatefulSet ops-manager-backup-daemon to multiple instances (e.g. 3) doesn't work. Because the mongodb-enterprise-operator is watching the StatefulSet, the MongoDB operator scales the instance count back down to 1 several minutes later.

So how do we scale up the Backup Daemons through the MongoDB Kubernetes operator?
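Because the operator reconciles the StatefulSet from its custom resource, the change has to be made on that resource rather than on the StatefulSet. As a hedged sketch: the operator's MongoDBOpsManager resource has a `spec.backup.members` setting for the number of Backup Daemon instances, and the snippet below builds a merge patch for it. The function name `build_backup_members_patch` is hypothetical, and whether this is the exact method described in the full post is an assumption.

```python
import json


def build_backup_members_patch(members):
    """Build a merge patch that sets the Backup Daemon count.

    Assumes the target is a MongoDBOpsManager resource with backup enabled.
    """
    if not isinstance(members, int) or members < 1:
        raise ValueError("members must be a positive integer")
    return {"spec": {"backup": {"members": members}}}


if __name__ == "__main__":
    # Three daemons, matching the example count in the text.
    print(json.dumps(build_backup_members_patch(3)))
```

Custom resources do not support strategic merge patches, so the JSON would be applied with `kubectl patch ... --type merge`, or the same field can be set in the resource's YAML manifest.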

Read more »

What is a change stream? From the official documentation:

Change streams allow applications to access real-time data changes without the complexity and risk of tailing the oplog. Applications can use change streams to subscribe to all data changes on a single collection, a database, or an entire deployment, and immediately react to them. Because change streams use the aggregation framework, applications can also filter for specific changes or transform the notifications at will.

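Here we use change streams to do real-time primary-secondary replication. I could not find an existing solution online, presumably because the usual approach is to rely on a replica set rather than replicate manually. But there is a real business need for this, for example backing up data across regions between heterogeneous clusters.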

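The only existing tool I found was MongoShake. But MongoShake is not a commercial product after all, and when I pulled the code and ran it, it did not work properly in our environment:

  • TLS verification had some problems, which I fixed by modifying the source code
  • In the all sync mode, only the full replication via the oplog completed; incremental replication via change streams kept throwing errors and could not run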

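Since fixing that wheel might take more effort than building a new one, I looked into how to do the replication myself. The simplest principle is to read the oplog from the source database in real time and replay it on the target database. It sounds simple, but implementing it may not be so easy, especially when the source is a sharded cluster: you cannot pull the oplog through mongos and instead have to pull data from each shard manually, which is fairly hard to implement.

The good news is that MongoDB has had change streams since v3.6, and since we use MongoDB Ops Manager to manage our sharded clusters, restoring from a snapshot is easy. All the replication then has to do is replay the real-time changes made after the snapshot's point in time.

It looks like we can build this wheel ourselves.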

Read more »

From the official manual:

Change streams allow applications to access real-time data changes without the complexity and risk of tailing the oplog. Applications can use change streams to subscribe to all data changes on a single collection, a database, or an entire deployment, and immediately react to them. Because change streams use the aggregation framework, applications can also filter for specific changes or transform the notifications at will.

Here we leverage change streams to replicate data from one MongoDB deployment to another in real time.

There are existing tools, such as MongoShake, that do the same thing. However, MongoShake is a little complicated to use. We encountered two issues:

  • We had to modify the source code to use TLS authentication
  • It cannot perform incremental sync when sync_mode is set to all

Since our goal is real-time replication, we chose a more straightforward and controllable approach: restore the cluster with MongoDB Ops Manager, then use change streams to apply the real-time changes.
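The "apply real-time changes" step can be sketched as a mapping from a change event to the write that should be replayed on the target. This is a minimal illustration, not the implementation from the full post: the function name `event_to_write` and its return shape are hypothetical, while the event fields used (`operationType`, `ns`, `documentKey`, `fullDocument`, `updateDescription`) are the standard change event fields. Inserts are mapped to an upsert-style replace so that replaying the same event twice is harmless.

```python
def event_to_write(event):
    """Map a change event to (namespace, operation, filter, payload).

    The operation names mirror driver methods (replace_one, update_one,
    delete_one); "skip" marks events this sketch does not replay.
    """
    op = event["operationType"]
    ns_doc = event.get("ns") or {}
    ns = (ns_doc.get("db"), ns_doc.get("coll"))
    key = event.get("documentKey")

    if op in ("insert", "replace"):
        # Intended to be executed as an upsert, which keeps replay idempotent.
        return ns, "replace_one", key, event["fullDocument"]

    if op == "update":
        desc = event["updateDescription"]
        update = {}
        if desc.get("updatedFields"):
            update["$set"] = desc["updatedFields"]
        if desc.get("removedFields"):
            update["$unset"] = {name: "" for name in desc["removedFields"]}
        # Other parts of updateDescription are ignored in this sketch.
        return ns, "update_one", key, update

    if op == "delete":
        return ns, "delete_one", key, None

    # drop, rename, invalidate, etc. need separate handling.
    return ns, "skip", None, None


if __name__ == "__main__":
    sample = {
        "operationType": "delete",
        "ns": {"db": "app", "coll": "users"},
        "documentKey": {"_id": 1},
    }
    print(event_to_write(sample))
```

In a real replicator the events would come from a driver's watch cursor opened at the snapshot's timestamp (PyMongo's `watch` accepts `start_at_operation_time` for this), and the event's `_id` resume token would be persisted so the stream can be resumed after a restart.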

Read more »

MongoDB change streams are a nice feature. They allow applications to access real-time data changes without the complexity and risk of tailing the oplog.

Recently, when we used change streams to replicate data from one sharded cluster to another, the cluster immediately became unstable: several nodes went down and a primary change was triggered. The latency of read/write operations then increased significantly.

Read more »

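After upgrading Hexo to v8.5.0, I found that MathJax could no longer render formulas correctly. Looking at the documentation, I found that the recommended Hexo renderer is hexo-renderer-pandoc, whereas I was using hexo-renderer-kramed, a package that is no longer updated or recommended.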

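So I switched to hexo-renderer-pandoc. Formulas now render correctly, but new problems appeared: embedded HTML is not recognized correctly, and blockquotes and lists are displayed without line breaks.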

Read more »

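Self-Mockery

Born a man of the back hills, by chance a guest in the front hall. Drunk, I flourish half a scroll from the scripture library; sitting in a well, I talk of how wide the sky is.

With grand ambition I toy with fame and rank, and measure fortune and misfortune by the sea-load. But when talk turns to my empty purse, I angrily blame heaven and earth.

《天道》 (Tian Dao), Ding Yuanying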

Recently I found that Google Auto ads significantly slow down page loading. There are also many discussions about this. For a static website, fast loading is crucial. In this post, we will improve the PageSpeed Insights score by delaying the loading of Auto ads.

Read more »