LREC 2014 • Darina Benikova, Chris Biemann, Marc Reznicek
We describe our approach to creating annotation guidelines based on linguistic and semantic considerations, and how we iteratively refined and tested them in the early stages of annotation in order to arrive at the largest publicly available dataset for German NER, consisting of over 31,000 manually annotated sentences (over 591,000 tokens) from German Wikipedia and German online news.