This is not the latest version of this item. The latest version can be found here.
MorfFlex CZ 160310
Please use the following text to cite this item:
Hajič, Jan and Hlaváčová, Jaroslava, 2016,
MorfFlex CZ 160310, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11234/1-1673.
Authors
Hajič, Jan; Hlaváčová, Jaroslava
Item identifier
http://hdl.handle.net/11234/1-1673
Project URL
Date issued
2016-03-10
Size
124259099 lexicalTypes
Language(s)
Description
Czech morphological dictionary developed originally by Jan Hajič as a spelling checker and lemmatization dictionary. Currently it contains full morphological information for each covered wordform, as well as some derivational, semantic and named entity information.
Subject(s)
Collections
Version History
This item is Publicly Available
and licensed under:
Files in this item
- Name
- morfflex-cz.2016-03-10.utf8.lemmaID_suff-tag-form.tab.csv.xz
- Size
- 198.87 MB
- Format
- application/x-xz
- Description
- Full (morphologically analyzed) wordlist for the Czech language, with lemma (which includes the sense suffix (-<number>) and semantic/syntactic suffixes and comments in PDT format), full positional tag in PDT format, and form (3 fields). Fields are tab-separated and always filled with a non-empty string; lines end with a linefeed only, and the encoding is UTF-8.
- MD5
- ddf87e245c7215c8528443ba793223ce
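
The three-field layout described above can be read with a few lines of Python. This is a minimal sketch, not an official reader: the field order (lemma, tag, form) is an assumption taken from the file name and the description, and the names `Entry`, `parse_line`, `iter_entries` and `open_wordlist` are hypothetical helpers introduced here for illustration.

```python
import lzma
from typing import Iterator, NamedTuple, TextIO


class Entry(NamedTuple):
    # Assumed field order, inferred from "lemmaID_suff-tag-form" in the file name.
    lemma: str  # lemma, including sense suffix and PDT-style suffixes/comments
    tag: str    # positional tag in PDT format
    form: str   # wordform


def parse_line(line: str) -> Entry:
    """Split one linefeed-terminated line into its three tab-separated fields."""
    fields = line.rstrip("\n").split("\t")
    # The description promises exactly three fields, none of them empty.
    if len(fields) != 3 or not all(fields):
        raise ValueError(f"expected 3 non-empty fields, got {fields!r}")
    return Entry(*fields)


def iter_entries(stream: TextIO) -> Iterator[Entry]:
    """Yield one Entry per line of an already opened text stream."""
    for line in stream:
        yield parse_line(line)


def open_wordlist(path: str) -> TextIO:
    """Open the .xz file as UTF-8 text, decompressing on the fly."""
    # newline="\n" keeps Python from treating other characters as line ends.
    return lzma.open(path, mode="rt", encoding="utf-8", newline="\n")
```

Streaming through `iter_entries(open_wordlist(path))` avoids decompressing the whole file to disk or holding it in memory, which matters for a list of this size.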

- Name
- morfflex-cz.2016-03-10.utf8.conll09.tab.csv.xz
- Size
- 281.88 MB
- Format
- application/x-xz
- Description
- Full (morphologically analyzed) wordlist for the Czech language, with form, lemma (without sense suffix and without semantic/syntactic suffixes), CoNLL-2009 Shared Task format major POS, and CoNLL-2009 Shared Task Word Features. Fields are tab-separated and always filled with a non-empty string; lines end with a linefeed only, and the encoding is UTF-8.
- MD5
- e97fa4bc0053f0dfa073a243dc756666
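
A downloaded file can be checked against the MD5 value listed for it. The following is a rough sketch using only the Python standard library; `md5_of_file` and `matches_listed_md5` are hypothetical helper names, and the path passed in is assumed to be wherever the download was saved.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path: str, listed: str) -> bool:
    """Compare a file's digest with the value shown in the file listing."""
    return md5_of_file(path) == listed.strip().lower()
```

For example, `matches_listed_md5("morfflex-cz.2016-03-10.utf8.conll09.tab.csv.xz", "e97fa4bc0053f0dfa073a243dc756666")` should return `True` for an intact download of the second file. The check detects a corrupted or truncated transfer; it is not a guarantee of authenticity.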


