WDC-Dialogue is a dataset built from the Chinese social media to train EVA. Specifically, conversations from various sources are gathered and a rigorous data cleaning pipeline is designed to enforce the quality of WDC-Dialogue.
The dataset mainly focuses on three categories of textual interaction data, i.e., repost on social media, comment / reply on various online forums and online question and answer (Q&A) exchanges. Each round of these textual interactions yields a dialogue session via well-designed parsing rules.
Paper | Code | Results | Date | Stars |
---|