# Introduction

Unfortunately, a major aspect missing from the current state of the art is that human conversations take place over long time frames, a setting in which currently used systems struggle.

Standard Transformers have a fixed context length, and their all-vs-all self-attention mechanism becomes inefficient when that context grows too large.
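To see why, note that self-attention builds an n×n score matrix over the sequence, so cost grows quadratically with context length. A minimal NumPy sketch (for brevity it uses the embeddings themselves as queries, keys, and values, omitting the learned projections):

```python
import numpy as np

def self_attention(x):
    """Naive all-vs-all self-attention over a sequence of n tokens.

    x: (n, d) array of token embeddings. The score matrix is (n, n),
    so time and memory grow quadratically with sequence length n.
    """
    n, d = x.shape
    scores = x @ x.T / np.sqrt(d)  # (n, n) -- the quadratic term
    # Row-wise softmax with the usual max-subtraction for stability.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x             # (n, d)
```

Doubling the context length quadruples the size of the score matrix, which is why a long multi-session history cannot simply be fed into a fixed-context Transformer.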

# Multi-Session Chat Dataset

Each chat session consists of 6-7 turns per speaker. Then, after a certain amount of (simulated) time has transpired, typically hours or days, the speakers resume chatting: either continuing to talk about the previous subject, bringing up some other subject from their past shared history, or sparking up conversation on a new topic.

• session 1: uses the PersonaChat dataset.
• sessions 2/3/4: the premise is that 1-7 hours or 1-7 days have passed since session 1, and the two speakers reengage in conversation. Each worker is asked to chat with another worker for 6 turns while taking into account what was said in previous sessions; that is, they must consider not only their own current persona but also the details of the pair's earlier interactions.
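The episode structure above might be represented as follows (a hypothetical layout for illustration only; the field names and example utterances are not the dataset's actual schema):

```python
# Hypothetical episode layout: personas for both speakers, plus a list of
# sessions, each with a simulated time gap and (speaker, utterance) turns.
episode = {
    "personas": {
        "speaker_1": ["I love hiking.", "I work as a nurse."],
        "speaker_2": ["I have two dogs.", "I am learning guitar."],
    },
    "sessions": [
        {"gap": "first meeting",
         "turns": [("speaker_1", "Hi! Been on any good hikes lately?")]},
        {"gap": "3 days later",
         "turns": [("speaker_2", "How was your shift at the hospital?")]},
    ],
}

def flatten_history(episode):
    """Concatenate every turn from all sessions into one context string."""
    return "\n".join(
        f"{speaker}: {utterance}"
        for session in episode["sessions"]
        for speaker, utterance in session["turns"]
    )
```

Note how a later session refers back to a detail ("nurse" → "hospital") that appears only in the other speaker's persona from an earlier session; this cross-session grounding is what the reengagement instructions produce.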

Because these summaries were collected to record the important points pertinent to one speaker or the other, they can also be seen as extensions of the originally given personas.

# Models

The baseline is a Transformer encoder-decoder initialized from a pretrained language model; this paper uses BlenderBot's 2.7B-parameter BST model.

1. There is a lot of context to store, and hence to retrieve from.
2. No processing has been done on that context, so the reading, retrieving, and combining operations required to generate an answer leave a lot of work for the model to do.
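A fixed-context baseline has no good answer to these problems: it can only truncate, keeping the most recent turns that fit in its context window and silently dropping everything older. A minimal sketch of that behavior (whitespace splitting stands in for the model's real subword tokenizer, and `truncate_context` is an illustrative helper, not code from the paper):

```python
def truncate_context(turns, max_tokens=128):
    """Keep only the most recent turns that fit in a fixed token budget.

    Walks the history backwards from the newest turn; older sessions are
    simply dropped once the budget is exhausted, which is exactly how a
    fixed-context model loses long-range conversational memory.
    """
    kept, used = [], 0
    for turn in reversed(turns):
        n = len(turn.split())  # crude whitespace token count
        if used + n > max_tokens:
            break
        kept.append(turn)
        used += n
    return list(reversed(kept))  # restore chronological order
```

Anything the speakers established in session 1 that falls outside the budget is invisible to the model, which motivates the paper's interest in summarizing past sessions rather than replaying them verbatim.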