
# Scaling Laws for Neural Language Models

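An empirical paper showing that the cross-entropy loss of neural language models follows power laws; an interesting read. Since the Transformer, many papers have explored different model architectures and set new SOTA results on some tasks, yet few have asked what the main factors behind model performance actually are.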

Throughout we will observe precise power-law scalings for performance as a function of training time, context length, dataset size, model size, and compute budget.
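As a rough illustration of what "power-law scaling" means here, the loss can be written as a function of a single scaled quantity. The generic symbols below are my own shorthand, not the paper's exact notation: $X$ stands for whichever quantity is varied (model size, dataset size, or compute), and $X_c$ and $\alpha_X$ are constants fitted to the measurements.

```latex
% Generic power law: X is the scaled quantity, X_c and \alpha_X are fitted constants
L(X) = \left( \frac{X_c}{X} \right)^{\alpha_X}

% Taking logarithms gives a straight line in log-log coordinates
\log L(X) = \alpha_X \log X_c - \alpha_X \log X
```

The second line is why such trends show up as straight lines with slope $-\alpha_X$ on a log-log plot.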

Read more »

Since the migration, this blog has used GitHub as its image hosting service. However, images sometimes load slowly. After some investigation, we found that serving them through the jsDelivr CDN easily improves the loading speed.
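A minimal sketch of the URL rewrite, assuming the images are referenced through `raw.githubusercontent.com` links and that the branch name contains no slash. The helper name `to_jsdelivr` and the user/repository names in the usage note are hypothetical; the target shape follows jsDelivr's `cdn.jsdelivr.net/gh/user/repo@version/file` convention.

```python
import re

# Assumed source shape: https://raw.githubusercontent.com/<user>/<repo>/<ref>/<path>
# A branch name containing "/" would not be parsed correctly by this pattern.
RAW_PATTERN = re.compile(
    r"^https://raw\.githubusercontent\.com/"
    r"(?P<user>[^/]+)/(?P<repo>[^/]+)/(?P<ref>[^/]+)/(?P<path>.+)$"
)


def to_jsdelivr(url: str) -> str:
    """Rewrite a raw GitHub file URL to its jsDelivr equivalent.

    URLs that do not match the assumed shape are returned unchanged.
    """
    match = RAW_PATTERN.match(url)
    if match is None:
        return url
    return "https://cdn.jsdelivr.net/gh/{user}/{repo}@{ref}/{path}".format(
        **match.groupdict()
    )
```

For example, `to_jsdelivr("https://raw.githubusercontent.com/alice/blog-images/main/2021/cat.png")` gives `https://cdn.jsdelivr.net/gh/alice/blog-images@main/2021/cat.png`.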

Read more »

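After migrating the blog, I also switched image hosts and started using GitHub as a free one. Recently, though, it has been unstable and images often fail to load. It turns out that jsDelivr can act as a CDN in front of GitHub; you only need to replace the image URLs. This is the right way to use GitHub as an image host :-)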

Read more »

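Previously we talked about # Implementing MongoDB Transaction Retry. If a transaction uses BulkWrite(), the new transactions API may retry endlessly and drive server CPU usage to 100% (MongoDB Server v4.4.6-ent, MongoDB Driver v2.12.2).

To avoid this problem, here are three suggestions for the client implementation:

  • Pass a cancellation token to the transactions API to limit the maximum execution time of a transaction
  • Force the transaction to exit once the maximum retry count is exceeded, so it cannot retry endlessly
  • Set BulkWrite() to execute in order

The first two suggestions are recommended for all transaction implementations; they avoid putting unnecessary load on the server in extreme cases.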

Read more »

Previously we talked about # How to Retry MongoDB Transaction. However, if you use BulkWrite() and one of the operations is retryable (e.g. on a duplicate key error), the new transactions API will retry the bulk write endlessly, which can drive server CPU usage to 100%. (MongoDB Server v4.4.6-ent, MongoDB Driver v2.12.2)

To avoid this issue, we have three suggestions:

  • Add a cancellation token to limit the maximum retry time
  • Abort the transaction after a maximum retry count
  • Set BulkWriteOptions { IsOrdered = true }

The first two suggestions also apply to transactions that don't use BulkWrite().
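The first two suggestions can be sketched in a driver-agnostic way. This is a Python illustration of the idea, not the MongoDB driver's API: `RetryableError`, `RetryLimitExceeded`, and `run_with_limits` are hypothetical names, and `txn_body` stands in for whatever callback runs the transaction.

```python
import time


class RetryableError(Exception):
    """Hypothetical stand-in for a driver error that is marked as retryable."""


class RetryLimitExceeded(Exception):
    """Raised when the retry budget (attempt count or time) is used up."""


def run_with_limits(txn_body, max_retries=3, max_seconds=5.0, clock=time.monotonic):
    """Run txn_body, retrying on RetryableError within two budgets.

    The call gives up after max_retries retries (max_retries + 1 attempts)
    or once max_seconds have elapsed, whichever comes first.
    """
    deadline = clock() + max_seconds
    attempts = 0
    while True:
        attempts += 1
        try:
            return txn_body()
        except RetryableError as exc:
            # Budget 1: cap the number of retries.
            if attempts > max_retries:
                raise RetryLimitExceeded(
                    f"gave up after {attempts} attempts"
                ) from exc
            # Budget 2: cap the total time spent retrying.
            if clock() >= deadline:
                raise RetryLimitExceeded("gave up: time budget exhausted") from exc
```

With either budget in place, a body that keeps failing with a retryable error surfaces an exception to the caller instead of looping forever against the server.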

Read more »

Prompting is one of the hottest NLP techniques. This post is a brief introduction to prompting built around three questions: what is prompting, why use it, and how to do it. Because it is brief, it does not go into much detail and instead tries to summarize the main ideas. For more details, please refer to the original papers.

What's Prompting

I have not found a rigorous definition of prompting, so here are a few excerpts from papers.

Users prepend a natural language task instruction and a few examples to the task input; then generate the output from the LM. This approach is known as in-context learning or prompting.

By: # Prefix-Tuning: Optimizing Continuous Prompts for Generation

This description introduces two concepts: in-context learning and prompting.

Another explanation, from a probabilistic perspective:

Prompting is the approach of adding extra information for the model to condition on during its generation of Y.

By: # The Power of Scale for Parameter-Efficient Prompt Tuning
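To make the first description concrete, here is a minimal sketch of assembling such a prompt: a task instruction, a few worked examples, then the actual input. The function name `build_prompt` and the `Input:`/`Output:` layout are assumptions for illustration; the quoted papers do not prescribe a particular text format.

```python
def build_prompt(instruction, examples, task_input):
    """Prepend an instruction and a few examples to the task input.

    examples is a list of (input, output) pairs; the returned string ends
    with an open "Output:" slot for the language model to complete.
    """
    lines = [instruction, ""]
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    lines.append(f"Input: {task_input}")
    lines.append("Output:")
    return "\n".join(lines)


if __name__ == "__main__":
    print(
        build_prompt(
            "Translate English to French.",
            [("cheese", "fromage")],
            "bread",
        )
    )
```

In the probabilistic view, everything before the final `Output:` is the extra information the model conditions on when generating Y.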

Read more »

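Prompting is one of the hottest NLP techniques today. This post introduces it through three questions: what, why, and how. It aims to be concise rather than a complete survey; for more details, please refer to the original papers.

What Is Prompting

First, what is a prompt? I could not find an authoritative definition, so I quote descriptions from a few papers to explain it.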

Users prepend a natural language task instruction and a few examples to the task input; then generate the output from the LM. This approach is known as in-context learning or prompting.

By: # Prefix-Tuning: Optimizing Continuous Prompts for Generation

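Put simply, the user provides a task description and a few examples as input, then uses a language model to generate the output. This approach is called in-context learning or prompting. Prompting also has another, more probabilistic explanation: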

Prompting is the approach of adding extra information for the model to condition on during its generation of Y.

By: # The Power of Scale for Parameter-Efficient Prompt Tuning

Read more »

Recently I switched my static site generator from Hexo to Hugo. The main reason is that Hexo is too slow to generate websites with thousands of pages.

Then I found this: # Who Should Use Hugo?

Hugo is for people building a blog, a company site, a portfolio site, documentation, a single landing page, or a website with thousands of pages.

No pain, no gain. To use Hugo smoothly, the first problem to solve is how to add custom CSS or JS to the site.

The bad news is that Hugo is not very friendly to newcomers. I spent hours reading the documentation to understand how it works. By comparison, Hexo's plugin system and injector are friendlier to newcomers (or maybe I have just forgotten how long they took to learn, haha :-D).

In this post, we will go through how to add custom CSS or JS to Hugo sites in a Hugoic way.

If you don't need to understand how it works, you can jump straight to the Modify Templates section. :-)

Read more »