Finisky Garden

ChatGPT's Unsolved Problems

Posted on 2022-12-10, updated on 2023-07-23, in Machine Learning

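The whole community has recently been amazed by ChatGPT's dialogue and coding abilities. I wrote an earlier post that briefly analyzed how it works. The principle looks intuitive, but the gap between it and the dialogue systems built in China cannot be closed overnight. I strongly agree with the Zhihu answer below: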

Why China Doesn't Have ChatGPT

ChatGPT: Optimizing Language Models for Dialogue

Since everyone is busy discovering what ChatGPT does well, let's look for its shortcomings instead.


A Quick Read of InstructGPT/ChatGPT

Posted on 2022-12-10, in Machine Learning

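ChatGPT has recently gone viral, and friends keep sending me trending articles and asking what I think. ChatGPT uses the same model as InstructGPT; only the data collection differs. InstructGPT itself was proposed almost a year ago and has only now caught everyone's attention. In fact, quite a few works this year have followed InstructGPT to improve model quality. Diamante borrows the human feedback idea but replaces the RL scheme with an additional loss function term; WeLM borrows the idea of training a large language model on human-written prompt templates.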

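Without further ado, let's see how the original InstructGPT beats much larger models. The original paper is long, 68 pages, but the core idea is actually not complicated. (PS: these days, if you train a large model and don't write a paper of 50+ pages, you aren't doing justice to the money you burned!)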

Training language models to follow instructions with human feedback

Aligning Language Models to Follow Instructions

InstructGPT points out that a bigger model is not necessarily a better one:

Making language models bigger does not inherently make them better at following a user’s intent. For example, large language models can generate outputs that are untruthful, toxic, or simply not helpful to the user.

So InstructGPT uses human feedback to align language models more closely with user intent:

We show an avenue for aligning language models with user intent on a wide range of tasks by fine-tuning with human feedback.

The resulting 1.3B InstructGPT model is rated better by human evaluators than the 175B GPT-3:

In human evaluations on our prompt distribution, outputs from the 1.3B parameter InstructGPT model are preferred to outputs from the 175B GPT-3, despite having 100x fewer parameters.
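The excerpt does not spell out how human feedback enters training. As a rough illustration, the InstructGPT paper fits a reward model on pairs of outputs ranked by labelers, using a pairwise loss of the form -log(sigmoid(r_preferred - r_rejected)). The sketch below only evaluates that loss on plain floats; the function name is made up for illustration, and a real reward model would produce the scores.

```python
import math


def pairwise_preference_loss(reward_preferred: float, reward_rejected: float) -> float:
    """Return -log(sigmoid(r_preferred - r_rejected)).

    The loss is small when the output the labeler preferred gets the higher
    reward, and large when the ranking is reversed.
    """
    margin = reward_preferred - reward_rejected
    # -log(sigmoid(m)) == log(1 + exp(-m)); branch to avoid overflow in exp().
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

With equal rewards the loss is log(2); it shrinks as the preferred output pulls ahead and grows when the rejected output scores higher.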


When Using a Proxy in Chrome, Which Side Performs DNS?

Posted on 2022-11-27, updated on 2023-01-24, in Networking

When using a proxy (e.g. SwitchyOmega) in Chrome, is name resolution (DNS) performed on the client side or on the proxy side? Let's explore how it works.

The test in this post was done with Chrome 107.0.5304.107 and SwitchyOmega configured with an HTTP proxy.
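For background, the HTTP proxy protocol lets a client hand the hostname to the proxy without resolving it first: for an HTTPS site the client sends a CONNECT request whose target is host:port. The sketch below only builds that request as bytes, to show that no IP address has to appear in it. Whether Chrome actually skips local resolution is what the post tests; the function name here is made up for illustration.

```python
def build_connect_request(host: str, port: int = 443) -> bytes:
    """Build the CONNECT request a client sends to an HTTP proxy.

    The target is written as "host:port", so the hostname travels to the
    proxy unresolved and the proxy is free to do the DNS lookup itself.
    """
    target = f"{host}:{port}"
    lines = [
        f"CONNECT {target} HTTP/1.1",
        f"Host: {target}",
        "",  # blank line terminating the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")
```

Calling build_connect_request("example.com") yields a request that contains only the name example.com and the port.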


Does Chrome Use Local or Remote DNS Behind a Proxy?

Posted on 2022-11-27, in Networking

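If you use the SwitchyOmega extension as a network proxy in Chrome, does DNS go through the local resolver or through the proxy server's DNS? Let's verify.

This test uses Chrome 107.0.5304.107 and an HTTP proxy configured in SwitchyOmega.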


ENOTDIR: not a directory, open package.json Solution

Posted on 2022-11-16, updated on 2023-01-24, in Linux

When I installed elasticdump, the following error appeared:

$ npm install elasticdump
...
npm WARN @1.0.0 No description
npm WARN @1.0.0 No repository field.
npm ERR! Linux 5.4.0-1091-azure
npm ERR! argv "/usr/bin/node" "/usr/bin/npm" "install" "elasticdump"
npm ERR! node v8.10.0
npm ERR! npm  v3.5.2
npm ERR! path /home/finisky/node_modules/.staging/@types/node-1f2b596d/package.json
npm ERR! code ENOTDIR
npm ERR! errno -20
npm ERR! syscall open

npm ERR! ENOTDIR: not a directory, open '/home/finisky/node_modules/.staging/@types/node-1f2b596d/package.json'
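As general background on the error code (not a diagnosis of this npm failure): on POSIX systems, ENOTDIR is raised when a path component that should be a directory is something else, such as a regular file. Node reports the code negated (errno -20), while Python exposes the positive value. The sketch below reproduces the error in a temporary directory; the file names are placeholders, and it assumes a POSIX system such as Linux.

```python
import errno
import os
import tempfile


def open_below_regular_file() -> int:
    """Open a path whose parent is a regular file and return the errno raised.

    Assumes a POSIX system; returns 0 if no error occurred.
    """
    with tempfile.TemporaryDirectory() as tmp:
        # Placeholder name: a plain file sitting where a directory is expected.
        not_a_dir = os.path.join(tmp, "staging_entry")
        with open(not_a_dir, "w") as f:
            f.write("plain file, not a directory")
        try:
            with open(os.path.join(not_a_dir, "package.json")):
                pass
        except OSError as exc:
            return exc.errno
    return 0
```

On Linux the returned value equals errno.ENOTDIR, the same code npm prints with a minus sign.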


Git reset --hard Not Working: File System Is Not Case Sensitive

Posted on 2022-11-16, updated on 2023-01-24, in Linux

git reset --hard is not working: every time you reset, the file flips between file.txt and File.txt. Really weird...

This is no joke. Clone this repo on Windows and you can reproduce it:

D:\$ git clone https://github.com/finisky/git-case-demo.git
Cloning into 'git-case-demo'...
remote: Enumerating objects: 11, done.
remote: Counting objects: 100% (11/11), done.
remote: Compressing objects: 100% (6/6), done.
remote: Total 11 (delta 0), reused 8 (delta 0), pack-reused 0
Unpacking objects: 100% (11/11), 1.85 KiB | 126.00 KiB/s, done.
warning: the following paths have collided (e.g. case-sensitive paths
on a case-insensitive filesystem) and only one from the same
colliding group is in the working tree:

  'File.txt'
  'file.txt'

After cloning the repo, you will find that the main branch is not clean, and git reset --hard does not fix it.
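The collision Git warns about can be checked for ahead of time. The sketch below groups paths that differ only by letter case; it is a minimal approximation, assuming that str.casefold() is close enough to how a case-insensitive file system compares names. The function name is made up for illustration.

```python
from collections import defaultdict


def find_case_collisions(paths):
    """Return groups of paths that differ only by letter case.

    Each group is sorted; paths with a unique case-folded form are omitted.
    """
    groups = defaultdict(list)
    for path in paths:
        # Paths with the same case-folded form would map to one file
        # on a case-insensitive file system.
        groups[path.casefold()].append(path)
    return [sorted(group) for group in groups.values() if len(group) > 1]
```

Feeding it the output of git ls-files on a case-sensitive machine would list pairs such as File.txt and file.txt before they reach a Windows checkout.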


Git Conflicts Caused by Windows Case Insensitivity

Posted on 2022-11-15, updated on 2022-11-16, in Linux

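Here's a fun one: have you ever seen git reset --hard flip back and forth like a pancake? Every reset changes the file content again, as if you were walking in circles. If you don't believe it, clone this repo on a Windows machine: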

D:\$ git clone https://github.com/finisky/git-case-demo.git
Cloning into 'git-case-demo'...
remote: Enumerating objects: 11, done.
remote: Counting objects: 100% (11/11), done.
remote: Compressing objects: 100% (6/6), done.
remote: Total 11 (delta 0), reused 8 (delta 0), pack-reused 0
Unpacking objects: 100% (11/11), 1.85 KiB | 126.00 KiB/s, done.
warning: the following paths have collided (e.g. case-sensitive paths
on a case-insensitive filesystem) and only one from the same
colliding group is in the working tree:

  'File.txt'
  'file.txt'

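You will then find that the freshly cloned main branch is not clean (shown in orange), and git reset --hard has no effect either. Look closely and you will see that each reset switches between uppercase File.txt and lowercase file.txt. Amazing, isn't it?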


Migrating GitHub Pages with 301 Redirects

Posted on 2022-11-15, in Hexo

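An older post, "Migrating a Hexo Blog to GitHub Pages", mentioned:

GitHub Pages may be a one-way street: you can move in, but it is hard to move out again. A quick look at the documentation shows that it does not let users modify the server configuration, so it seems 301 redirects cannot be done.

But when migrating a blog, 301 redirects are indispensable: they are the key to moving a site without losing search ranking. Specifically, after moving to the new site, you need to manually update the configuration in Google Search Console: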

Do you lose credit for links when you redirect to new URLs?
No, 301 or 302 redirects do not cause a loss in PageRank

So what is the right way to migrate GitHub Pages without losing ranking?


Migrate GitHub Pages by 301 Redirects

Posted on 2022-11-14, updated on 2023-01-24, in Hexo

GitHub Pages cannot perform HTTP 301 redirects because you cannot modify the server config. However, 301 redirects are crucial for SEO. To keep your site ranking, you need to 301-redirect the old GitHub Pages site to your new site and manually notify Google Search Console:

Do you lose credit for links when you redirect to new URLs?
No, 301 or 302 redirects do not cause a loss in PageRank
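To make the term concrete: a 301 redirect is just a response with status 301 and a Location header pointing at the new URL. The sketch below shows this on a server you control, which is exactly what GitHub Pages does not offer, so it is background rather than the solution this post arrives at. NEW_SITE is a hypothetical domain, and the function names are made up for illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

NEW_SITE = "https://new-site.example"  # hypothetical new domain


def redirect_target(path: str) -> str:
    """Map a request path on the old site to the same path on the new site."""
    return NEW_SITE + path


class RedirectHandler(BaseHTTPRequestHandler):
    """Answer every GET with a permanent redirect to the new site."""

    def do_GET(self):
        self.send_response(301)  # 301 Moved Permanently
        self.send_header("Location", redirect_target(self.path))
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Run the redirect server locally until interrupted."""
    HTTPServer(("127.0.0.1", port), RedirectHandler).serve_forever()
```

Keeping the path intact means each old post URL redirects to its counterpart on the new site rather than to the home page.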

So how do you migrate GitHub Pages to a new site without losing site ranking?


What Kind of Subordinates Do Bosses Like?

Posted on 2022-11-13, in Career

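Today let's talk about what kind of subordinates bosses like. The answer varies from boss to boss, and over the years I have worked with bosses of all styles. Whatever the style, subordinates who are reliable and capable are the ones a boss invests in. Let's start from the boss's point of view and briefly analyze how, as a subordinate, you can improve in these areas.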
