Diakorp v6: diachronic corpus of Czech
Please use the following text to cite this item:
Kučera, Karel; Řehořková, Anna and Stluka, Martin, 2015,
Diakorp v6: diachronic corpus of Czech, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11234/1-5413.
Authors
Kučera, Karel; Řehořková, Anna; Stluka, Martin
Item identifier
http://hdl.handle.net/11234/1-5413
Date issued
2015-12-18
Size
3,450,000 words
Language(s)
Czech
Description
Diakorp v6 is a diachronic corpus of Czech containing 3.45 million words (i.e. 4.1 million tokens). It comprises 116 texts from the 14th to the 20th century. The texts are transcribed, not transliterated. The corpus is provided in a CoNLL-U-like vertical format that serves as input to the Manatee query engine. The data thus correspond to the corpus available to registered users of the CNC through the KonText query interface at http://www.korpus.cz.
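The description gives both a word count and a token count, and names a vertical format as the distribution format. The following is a minimal Python sketch of how such a file might be read and counted. It assumes only the general vertical-file convention (one token per line, attributes separated by tabs, structural markup such as `<doc ...>` on lines of its own). The actual column layout of Diakorp v6 is not stated here, so the two-column sample is invented for illustration, and treating "tokens containing at least one letter or digit" as words is an assumption, not the CNC's definition.

```python
def read_vertical(lines):
    """Yield one list of tab-separated columns per token line.

    Assumes structural markup (e.g. <doc ...>, </s>) occupies whole lines
    wrapped in angle brackets; such lines are skipped.
    """
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("<") and line.endswith(">"):
            continue  # structural markup, not a token
        yield line.split("\t")


def count_tokens_and_words(lines):
    """Return (tokens, words); a word is a token with a letter or digit.

    The word definition is an illustrative assumption.
    """
    tokens = words = 0
    for columns in read_vertical(lines):
        tokens += 1
        if any(ch.isalnum() for ch in columns[0]):
            words += 1
    return tokens, words


# Hypothetical sample; the real attribute columns may differ.
sample = [
    '<doc id="example">\n',
    "<s>\n",
    "Byl\tbýt\n",
    "jeden\tjeden\n",
    "král\tkrál\n",
    ".\t.\n",
    "</s>\n",
    "</doc>\n",
]

print(count_tokens_and_words(sample))  # (4, 3)
```

To run this over a real file, pass an open file object instead of `sample`, for example `open(path, encoding="utf-8")`, where `path` is wherever the downloaded data was saved.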
Acknowledgement
Ministerstvo školství, mládeže a tělovýchovy (Ministry of Education, Youth and Sports)
Project code: LM2011023
Project name: Český národní korpus (Czech National Corpus)
This item is publicly available.