Finisky Garden

NLP, 软件工程, 产品设计

NLP adapters主要想解决不同任务需要finetune整个模型的痛点,与Prompting一样,是一种轻量级的训练方法,也是Transfer Learning的应用。按出现时间来看,finetune早于adapters,adapters早于prompting。今天来重看这篇Adapters的文章,可以更好地理解lightweight finetune的发展过程。

Adapters在NLP中的应用源于这篇 ICML2019 文章:Parameter-Efficient Transfer Learning for NLP adapters

Adapter-based tuning requires training two orders of magnitude fewer parameters to fine-tuning, while attaining similar performance.

阅读全文 »

Hugo是个极好的静态网站生成器。一个常见的情况是原始的网站源码放在私有代码库中,但希望自动化构建和部署的功能。假设私有代码库托管在github上,希望能自动化部署到GitHub Pages,这个功能可以通过github actions轻松搞定。

开始之前,假设我们已有了两个repo(可以隶属于不同的github账户):

  • 私有库: 存放网站的源码
  • 目标库: GitHub Pages (xxx.github.io)
阅读全文 »

Hugo is a nice static site generator. A common scenario is to store your website source code in a private repository and serve it on GitHub Pages. Can we leverage the github actions to automatically build and deploy the site from the private repository to GitHub Pages? The answer is absolutely yes!

Before we start, you need to have two repos (can belong to different github accounts):

  • Source Repo: the repo to store
  • Target Repo: host the GitHub Pages (xxx.github.io)
阅读全文 »

My first understanding of a language model is originated from n-gram. When I know RNNLM, I have a question: why a neural network can represent a language model?

After some research, I found the answer. Essentially, language model is a probability distribution:

A statistical language model is a probability distribution over sequences of words.

阅读全文 »

最初对语言模型的理解源于n-gram语言模型,但后来出现了RNNLM等一众神经网络语言模型,就有了这个疑问:神经网络为什么可以表示语言模型?

首先,语言模型本质上是概率分布

A statistical language model is a probability distribution over sequences of words.

阅读全文 »

# Scaling Laws for Neural Language Models

一篇实验Paper,调研了神经网络语言模型交叉熵损失变化满足power-law定律,挺有意思的文章。Transformer之后有许多探索不同模型结构的文章,并在一些任务上取得了新的SOTA,却鲜有人考虑影响模型性能的主要因素是什么。

Throughout we will observe precise power-law scalings for performance as a function of training time, context length, dataset size, model size, and compute budget.

阅读全文 »

This blog uses github as image hosting service after migration. However, sometimes the image loading speed is slow. After some investigation, we found that we can easily improve the image loading speed by jsdelivr CDN.

阅读全文 »

在迁移博客之后,就切换了图床,使用github作免费的图床。但最近发现它不太稳定,常常打不开。研究发现可以用jsdelivr作github的CDN加速,只需替换下图片地址即可。这才是github图床正确的打开方式 :-)

阅读全文 »
0%