Transcribed narrative interviews with people from East and West Berlin about the events of November 9. 282,000 tokens. TEI XML, lemma and POS. Normalized version also available.
German reference corpus. Ca. 100 million words from the 20th century. Searchable online. Part of the 'Digitales Wörterbuch der deutschen Sprache des 20. Jahrhunderts' project; a corpus of the BBAW and the basis of the DWDS.
Written German from 1920-39. 500,000 tokens in 392 texts. POS and lemma, TEI XML. Part of 'Das digitale Wörterbuch der deutschen Sprache des 20. Jahrhunderts'.
Articles from the 'Berliner Zeitung' online edition from 3 January 1994 to 31 December 2005. About 252 million tokens in 869,000 articles. Part of the DWDS project.
The C4 corpus is a joint effort of the project Digitales Wörterbuch der deutschen Sprache (DWDS), the Austrian Academy Corpus (AAC), the Korpus Südtirol and the Schweizer Textkorpus (CHTK). The corpus is composed of corpora from all four partner institutions.
Corpus of the weekly Die Zeit from 1946 to the present day (complete runs from 1996). Over 100 million words in 200,000 articles. Updated daily. Part of the DWDS project.