This is a new version of the repository. Do let us know (lindat-help at ufal.mff.cuni.cz) if you encounter any issues.

BulTreeBank Morphosyntactic Corpus

Please use the following text to cite this item or export to a predefined format:
2014, BulTreeBank Morphosyntactic Corpus, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), http://hdl.handle.net/11372/LRT-221.
Date issued
2014-07-30
Type
Language(s)
Description
Written, synchronic, general, manually annotated, 1 000 000 tokens divided in three sets: 215 000 tokens used in BulTreeBank HPSG Treebank (see below), additionally 300 000 checked second time, rest about 480 000 checked by the annotators. Morphosyntactic annotation with the BulTreeBank Tagset (http://www.bultreebank.org/TechRep/BTB-TR03.pdf), XML, annotation description in technical reports of BulTreeBank project http://www.bultreebank.org/TechRep