WebBrain-Raw is a large-scale dataset built from English Wikipedia articles and their crawlable Wikipedia references. It comprises 153 zipped data chunks; each line in a chunk holds one Wikipedia page together with its reference articles.
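As a rough illustration of the one-record-per-line layout, the sketch below streams lines out of a single chunk without extracting it to disk. It assumes that "zipped" means a standard ZIP archive and that the contents are UTF-8 text; the function name `iter_chunk_lines` and the file name in the usage note are hypothetical, and the per-line record format is left unparsed because the description does not specify it.

```python
import io
import zipfile
from typing import Iterator


def iter_chunk_lines(chunk_path: str, encoding: str = "utf-8") -> Iterator[str]:
    """Yield each non-empty line from every file inside one zipped chunk.

    Assumption: the chunk is a standard ZIP archive of text files, where
    each line holds one Wikipedia page plus its reference articles.
    """
    with zipfile.ZipFile(chunk_path) as archive:
        for member in archive.infolist():
            if member.is_dir():
                continue  # skip directory entries, keep only files
            with archive.open(member) as raw:
                # Decode lazily so a large chunk is never fully loaded in memory.
                for line in io.TextIOWrapper(raw, encoding=encoding):
                    line = line.rstrip("\n")
                    if line:
                        yield line
```

With a hypothetical local file, `for record in iter_chunk_lines("chunk_000.zip"): ...` would visit one page record at a time. If the chunks turn out to use a different compression format, only the archive-opening step would need to change.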
Source: WebBrain: Learning to Generate Factually Correct Articles for Queries by Grounding on Large Web Corpus