After Deploy Hexo From Private Repository to GitHub Pages, we ran into several issues: first GitHub Checkout Action Preserve File Modification Time, and now the date in some posts' permalinks may shift by one day. For instance, if the original markdown date is 2020-07-13 00:50:05, the generated permalink date becomes 2020/07/12. Because the permalinks change, search engines treat the old URLs as not found, which hurts SEO performance.
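A shift like this is what one would expect if the build machine runs in a different timezone from the one the post date was written in. The excerpt does not state the cause, so treat the following as a sketch under that assumption: it takes the date from the example above, assumes it was written in UTC+8, and shows which calendar day it falls on in UTC. The `permalink_date` helper and the UTC+8 offset are illustrative and are not taken from Hexo.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the post date was written in UTC+8 and the build runs in UTC.
AUTHOR_TZ = timezone(timedelta(hours=8))


def permalink_date(local: datetime, build_tz: timezone) -> str:
    """Format the year/month/day a timestamp falls on in the build timezone."""
    return local.astimezone(build_tz).strftime("%Y/%m/%d")


post_date = datetime(2020, 7, 13, 0, 50, 5, tzinfo=AUTHOR_TZ)
print(permalink_date(post_date, AUTHOR_TZ))     # 2020/07/13 (author's timezone)
print(permalink_date(post_date, timezone.utc))  # 2020/07/12 (build machine in UTC)
```

If this is the cause, keeping the timezone consistent between the local machine and the build environment should keep the permalink date stable.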

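When deep models are used for retrieval, the central tension is the trade-off between retrieval quality and speed. This post introduces and compares the main ideas of several classic text retrieval models: DPR, Poly-Encoders, DC-BERT, and ColBERT.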

[EMNLP2020] Dense Passage Retrieval for Open-Domain Question Answering

[ICLR2020] Poly-encoders: Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring

[SIGIR2020] DC-BERT: Decoupling Question and Document for Efficient Contextual Encoding

[SIGIR2020] ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT
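To make the quality versus speed trade-off concrete, here is a minimal sketch of two of the scoring schemes, with toy vectors standing in for BERT outputs. DPR scores a pair with a single inner product between one question vector and one passage vector, while ColBERT keeps one vector per token and sums, over the query tokens, the maximum similarity against any document token (late interaction). The function names and the toy numbers are illustrative assumptions, not code from either paper.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def dpr_score(query_vec: Vector, passage_vec: Vector) -> float:
    """Bi-encoder scoring: one vector per side, one inner product."""
    return dot(query_vec, passage_vec)


def colbert_score(query_tokens: List[Vector], doc_tokens: List[Vector]) -> float:
    """Late interaction: for each query token take the best-matching
    document token, then sum those maxima (MaxSim)."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


# Toy 2-d embeddings (hypothetical values, not real model outputs).
query_tokens = [[1.0, 0.0], [0.0, 1.0]]
doc_tokens = [[0.5, 0.5], [1.0, 0.0], [0.0, 2.0]]

print(dpr_score([1.0, 2.0], [3.0, 0.5]))        # 1*3 + 2*0.5 = 4.0
print(colbert_score(query_tokens, doc_tokens))  # max(0.5, 1, 0) + max(0.5, 0, 2) = 3.0
```

In both schemes the passage-side vectors can be computed offline, so only the query needs to be encoded at search time.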

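In Deploy Hexo From Private Repository to GitHub Pages, we used GitHub Actions to automate the deployment of a Hexo site. One problem remained: after every deployment, the modification time of every post became the current time rather than the time it was actually last edited. As a result, every historical post appears to change on each deployment, which can mislead search engines into thinking the site is modified frequently.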

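Analysis showed that Hexo uses the file modification time as a post's last-edited time, but git by design does not preserve file modification times. After checkout, the modification time of every markdown file is the current time.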

With Deploy Hexo From Private Repository to GitHub Pages, we can leverage GitHub Actions to deploy the Hexo website automatically. However, on each deployment commit, every post's edit time changes to the current time instead of its actual modification time. This may mislead search engines into regarding the website as frequently modified.

By default, Hexo uses a post file's modification time as its edit time. By design, git doesn't preserve file modification times. After the checkout action, the modification time of each file is the current time.
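One common workaround, sketched here under the assumption that the checkout contains full history (a shallow clone only knows the most recent commit), is to reset each post's modification time to the timestamp of the last commit that touched it. The script below relies on `git log -1 --format=%ct` and `os.utime`; the `source/_posts` path and the function names are illustrative.

```python
import os
import subprocess
from pathlib import Path
from typing import Optional


def last_commit_time(path: Path, repo: Path) -> Optional[int]:
    """Return the Unix timestamp of the last commit touching `path`, or None."""
    out = subprocess.run(
        ["git", "log", "-1", "--format=%ct", "--", str(path)],
        cwd=repo, capture_output=True, text=True, check=True,
    ).stdout.strip()
    return int(out) if out else None


def set_mtime(path: Path, timestamp: int) -> None:
    """Set both access and modification time of `path` to `timestamp`."""
    os.utime(path, (timestamp, timestamp))


def restore_mtimes(repo: Path, posts_dir: str = "source/_posts") -> int:
    """Restore mtimes for all markdown posts; return how many were updated."""
    updated = 0
    for post in sorted((repo / posts_dir).rglob("*.md")):
        ts = last_commit_time(post.relative_to(repo), repo)
        if ts is not None:  # skip files that were never committed
            set_mtime(post, ts)
            updated += 1
    return updated


# Example (run from the repository root, before generating the site):
# restore_mtimes(Path.cwd())
```

Running something like this between the checkout step and the Hexo build would give Hexo commit-based modification times instead of the checkout time.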

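Previously we discussed Adapters and Prompting, both of which are lightweight training methods, so-called lightweight fine-tuning. Today let's look at another lightweight method for training large language models: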

LoRA: Low-Rank Adaptation of Large Language Models


An important paradigm of natural language processing consists of large-scale pretraining on general domain data and adaptation to particular tasks or domains. As we pre-train larger models, full fine-tuning, which retrains all model parameters, becomes less feasible. Using GPT-3 175B as an example – deploying independent instances of fine-tuned models, each with 175B parameters, is prohibitively expensive.
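As a rough illustration of why a low-rank update is cheaper than full fine-tuning, the sketch below follows the LoRA formulation, in which a frozen weight matrix W of shape d × k is adapted as W + (alpha / r) · B · A, with B of shape d × r and A of shape r × k. The excerpt above does not spell this out, so the helper names and the toy sizes are assumptions for illustration only.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_weight(w: Matrix, b: Matrix, a: Matrix, alpha: float) -> Matrix:
    """Merged weight W + (alpha / r) * B @ A, where r is the rank of the update."""
    r = len(a)
    delta = matmul(b, a)
    scale = alpha / r
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def trainable_params(d: int, k: int, r: int) -> int:
    """Parameters in B (d x r) plus A (r x k); W itself stays frozen."""
    return r * (d + k)


# Toy example: d = k = 2, rank r = 1.
w = [[1.0, 0.0], [0.0, 1.0]]
b = [[1.0], [2.0]]
a = [[0.5, 0.0]]
print(lora_weight(w, b, a, alpha=1.0))  # [[1.5, 0.0], [1.0, 1.0]]

# Full fine-tuning of one 4096 x 4096 matrix vs. a rank-8 update.
print(4096 * 4096, trainable_params(4096, 4096, r=8))  # 16777216 65536
```

Only A and B are trained, so each task needs to store just the small factors rather than a full copy of the model.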

After I upgraded pandoc to 2.18, hexo-renderer-pandoc could no longer render one of my posts correctly. Everything worked fine before the upgrade. The error looks like:

INFO Start processing
FATAL { err: Error:
[ERROR][hexo-renderer-pandoc] On /home/finisky/source/_posts/test.md
[ERROR][hexo-renderer-pandoc] pandoc exited with code 64: YAML parse exception at line 4, column 0,
while scanning a simple key:
could not find expected ':'

  at Hexo.pandocRenderer (/home/finisky/node_modules/hexo-renderer-pandoc/index.js:114:11)
  at Hexo.tryCatcher (/home/finisky/node_modules/bluebird/js/release/util.js:16:23)
  at Hexo.<anonymous> (/home/finisky/node_modules/bluebird/js/release/method.js:15:34)
  at /home/finisky/node_modules/hexo/lib/hexo/render.js:75:22
  at tryCatcher (/home/finisky/node_modules/bluebird/js/release/util.js:16:23)
  at Promise._settlePromiseFromHandler (/home/finisky/node_modules/bluebird/js/release/promise.js:547:31)
  at Promise._settlePromise (/home/finisky/node_modules/bluebird/js/release/promise.js:604:18)
  at Promise._settlePromiseCtx (/home/finisky/node_modules/bluebird/js/release/promise.js:641:10)
  at _drainQueueStep (/home/finisky/node_modules/bluebird/js/release/async.js:97:12)
  at _drainQueue (/home/finisky/node_modules/bluebird/js/release/async.js:86:9)
  at Async._drainQueues (/home/finisky/node_modules/bluebird/js/release/async.js:102:5)
  at Immediate.Async.drainQueues [as _onImmediate] (/home/finisky/node_modules/bluebird/js/release/async.js:15:14)
  at processImmediate (node:internal/timers:464:21)

}
Something's wrong. Maybe you can find the solution here: %s https://hexo.io/docs/troubleshooting.html

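Previously we covered how to deploy Hugo from a private repository to GitHub Pages. I assumed that adapting that workflow yaml for Hexo would be easy, but trying it proved me wrong. Hexo has many dependencies, so its environment setup is much more complex than Hugo's, and it also brings compatibility issues among packages and libraries. By comparison, Hugo is very clean and considerably easier to use with GitHub Actions.

It took a lot of time and no fewer than 20 attempts to finally get the Hexo action workflow working :-) . Here I record the pitfalls I hit and how I solved them.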


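  • Submodule configuration for the theme directory
  • Using a PAT to pull code from two private repositories (the main repository and the theme submodule)
  • Installing Pandoc in GitHub Actions (optional)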


Last time we talked about how to deploy Hugo from a private repository to GitHub Pages. I thought it would be trivial to modify the workflow yaml to make it work for Hexo. However, it turned out to be much more complicated than I expected. The reason is that Hexo needs more dependencies, which come with compatibility issues, while Hugo is relatively self-contained and clean.

Actually, I spent several hours and made more than 20 attempts to get the workflow yaml working. :-)

Compared to the Hugo workflow, there are several issues to resolve:

  • Set up private submodules for your themes
  • Configure PAT to pull private submodules as well as your main repo
  • Pandoc installation for mathjax (optional)

Let's start!

