Finisky Garden

NLP, Software Engineering, Product Design

Previously we covered how to automatically deploy a Hugo site from a private repository to GitHub Pages. I assumed that adapting that workflow yaml for Hexo would be trivial, but trying it myself proved me wrong. Hexo has many dependencies, so its environment setup is far more involved than Hugo's, with package and library compatibility issues on top. By comparison, Hugo is self-contained and clean, which makes it much easier to drive from GitHub Actions.

It took quite a while and no fewer than 20 attempts to finally get the Hexo action workflow running :-) , so here is a record of the pitfalls and their solutions.

Compared with the Hugo workflow, a few issues need to be solved:

  • Configuring the theme directory as a submodule
  • Using a PAT to pull both private repos (the main repo and the theme submodule)
  • Installing Pandoc inside the GitHub Action (optional)

Let's start!

Read more »

Last time we talked about how to deploy Hugo from a private repository to GitHub Pages. I thought it would be trivial to modify the workflow yaml to make it work for Hexo. However, it was much more complicated than I expected: Hexo has to set up many more dependencies, with compatibility issues among them, while Hugo is relatively self-contained and clean.

Actually, I spent several hours and more than 20 attempts to make the workflow yaml work. :-)

Compared to Hugo workflow, there are several issues to be resolved:

  • Set up private submodules for your themes
  • Configure a PAT to pull the private submodules as well as your main repo
  • Install Pandoc for MathJax rendering (optional)

Let's start!

Read more »

NLP adapters address the pain point that every downstream task requires finetuning the entire model. Like prompting, they are a lightweight training method and an application of transfer learning. Chronologically, finetuning predates adapters, and adapters predate prompting, so revisiting the adapters paper today gives a clearer picture of how lightweight finetuning has evolved.

The use of adapters in NLP originates from this ICML 2019 paper: Parameter-Efficient Transfer Learning for NLP

Adapter-based tuning requires training two orders of magnitude fewer parameters to fine-tuning, while attaining similar performance.
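To make the parameter savings concrete, here is a minimal numpy sketch of a bottleneck adapter of the kind the paper inserts into each Transformer layer. The sizes d = 768 and m = 64 are illustrative assumptions, not taken from a specific experiment:

```python
import numpy as np

# Illustrative sizes: hidden dimension d, adapter bottleneck m (m << d).
d, m = 768, 64

rng = np.random.default_rng(0)

# Adapter weights: down-projection and up-projection (biases omitted).
W_down = rng.normal(scale=0.01, size=(d, m))
W_up = rng.normal(scale=0.01, size=(m, d))

def adapter(x):
    """Bottleneck adapter: down-project, nonlinearity, up-project, residual."""
    h = np.maximum(x @ W_down, 0.0)   # ReLU
    return x + h @ W_up               # skip connection keeps init near-identity

x = rng.normal(size=(4, d))           # a batch of 4 token representations
y = adapter(x)                        # shape (4, 768): hidden size preserved

# Trainable parameters: the adapter vs. one d x d feed-forward weight matrix.
adapter_params = W_down.size + W_up.size   # 2 * d * m = 98,304
full_params = d * d                        # 589,824
```

Only the adapter weights (plus layer norms, in the paper) are trained per task while the pretrained weights stay frozen, which is where the parameter savings come from.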

Read more »

Hugo is an excellent static site generator. A common scenario is keeping the site source in a private repository while still wanting automated build and deployment. Assuming the private repository is hosted on GitHub and the target is GitHub Pages, this is easy to achieve with GitHub Actions.

Before we start, assume we already have two repos (they can belong to different GitHub accounts):

  • Private repo: stores the site source code
  • Target repo: hosts the GitHub Pages site (xxx.github.io)
Read more »

Hugo is a nice static site generator. A common scenario is to store your website source code in a private repository and serve it on GitHub Pages. Can we leverage GitHub Actions to automatically build and deploy the site from the private repository to GitHub Pages? The answer is absolutely yes!

Before we start, you need to have two repos (they can belong to different GitHub accounts):

  • Source Repo: stores the website source code
  • Target Repo: host the GitHub Pages (xxx.github.io)
Read more »

My first understanding of a language model originated from n-gram models. When I learned about RNNLM, a question came up: why can a neural network represent a language model?

After some research, I found the answer. Essentially, a language model is a probability distribution:

A statistical language model is a probability distribution over sequences of words.
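To make the definition concrete, here is a toy bigram model in Python; the corpus and the maximum-likelihood estimate are illustrative, not from the post:

```python
from collections import Counter

# Tiny corpus; the "model" is just conditional frequencies P(w | prev).
corpus = "the cat sat on the mat the cat ate".split()

bigrams = Counter(zip(corpus, corpus[1:]))
contexts = Counter(corpus[:-1])

def p_next(w, prev):
    """Maximum-likelihood estimate of P(w | prev)."""
    return bigrams[(prev, w)] / contexts[prev]

def p_sequence(words):
    """Chain rule with a first-order Markov assumption,
    conditioning on the first word: P(w1..wn) = prod P(wi | wi-1)."""
    p = 1.0
    for prev, w in zip(words, words[1:]):
        p *= p_next(w, prev)
    return p

p_cat = p_next("cat", "the")               # 2/3: "the" -> "cat" twice, "mat" once
p_seq = p_sequence(["the", "cat", "sat"])  # (2/3) * (1/2) = 1/3
```

An n-gram model stores this distribution as a count table; a neural language model represents the same distribution with a parameterized function, which is why both count as language models.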

Read more »

My understanding of language models started with n-gram models, but then RNNLM and a host of other neural language models appeared, which raised the question: why can a neural network represent a language model?

First, a language model is essentially a probability distribution:

A statistical language model is a probability distribution over sequences of words.

Read more »

# Scaling Laws for Neural Language Models

An empirical paper showing that the cross-entropy loss of neural language models follows power laws; quite an interesting read. Since the Transformer, many papers have explored alternative architectures and reached new SOTA on some tasks, yet few have asked which factors actually dominate model performance.

Throughout we will observe precise power-law scalings for performance as a function of training time, context length, dataset size, model size, and compute budget.
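As a sketch of what such a power law looks like, the snippet below recovers the exponent of a synthetic loss curve L(N) = a·N^(−b) with a log-log linear fit. The constants are assumptions for illustration (b = 0.076 echoes the model-size exponent reported in the paper, but the data here is synthetic and noise-free):

```python
import numpy as np

# Idealized power law: loss vs. number of parameters N.
a, b = 10.0, 0.076                  # illustrative constants, not fitted to real runs
N = np.logspace(6, 9, 20)           # model sizes from 1e6 to 1e9 parameters
L = a * N ** (-b)                   # synthetic, noise-free test loss

# On log-log axes a power law is a straight line with slope -b.
slope, intercept = np.polyfit(np.log(N), np.log(L), 1)
fitted_b = -slope                   # recovers 0.076 up to float error
```

The paper fits curves of this shape separately in model size, dataset size, and compute, which is what makes loss predictable before training.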

Read more »