MorfFlex CZ 160310

Please use the following text to cite this item:
Hajič, Jan and Hlaváčová, Jaroslava, 2016, MorfFlex CZ 160310, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), http://hdl.handle.net/11234/1-1673.
Date issued
2016-03-10
Size
124259099 lexicalTypes
Language(s)
Czech
Description
Czech morphological dictionary, originally developed by Jan Hajič as a spelling checker and lemmatization dictionary. It currently contains full morphological information for each covered wordform, as well as some derivational, semantic, and named-entity information.
This item is Publicly Available
and licensed under:
 Files in this item
Name
morfflex-cz.2016-03-10.utf8.lemmaID_suff-tag-form.tab.csv.xz
Size
198.87 MB
Format
application/x-xz
Description
Full (morphologically analyzed) wordlist for the Czech language, with three fields: lemma (which includes the sense suffix (-<number>) and the semantic/syntactic suffixes and comments in PDT format), full positional tag in PDT format, and form. Fields are tab-separated and always filled with a non-empty string, lines end with a linefeed only, and the encoding is UTF-8.
MD5
ddf87e245c7215c8528443ba793223ce
Preview
  File Preview
    • morfflex-cz.2016-03-10.utf8.lemmaID_suff-tag-form.tab.csv (6 GB)
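
As an illustration of the three-field layout described above, the following is a minimal Python sketch for streaming the compressed file line by line. The names `Entry`, `parse_line`, `split_lemma`, and `iter_entries` are hypothetical helpers, not part of any MorfFlex tooling. The lemma splitting assumes that the sense suffix is a trailing `-<number>` and that any semantic/syntactic suffixes and comments begin at the first `_` or backtick that follows; this assumption should be checked against the PDT lemma documentation before relying on it.

```python
import lzma
import re
from typing import Iterator, NamedTuple, Optional, Tuple


class Entry(NamedTuple):
    lemma: str  # lemma including sense suffix and PDT-style suffixes/comments
    tag: str    # full positional tag in PDT format
    form: str   # wordform


def parse_line(line: str) -> Entry:
    """Split one line into (lemma, tag, form); all three must be non-empty."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 3 or not all(fields):
        raise ValueError(f"expected 3 non-empty tab-separated fields: {line!r}")
    return Entry(*fields)


# ASSUMPTION: base lemma, optional "-<number>" sense suffix, then an optional
# remainder starting with "_" or "`" (semantic/syntactic suffixes, comments).
_LEMMA_RE = re.compile(r"^(?P<base>.+?)(?:-(?P<sense>\d+))?(?P<rest>(?:[_`].*)?)$")


def split_lemma(lemma: str) -> Tuple[str, Optional[int], str]:
    """Return (base lemma, sense number or None, remaining suffix string)."""
    match = _LEMMA_RE.match(lemma)
    if match is None:
        raise ValueError(f"cannot split lemma: {lemma!r}")
    sense = match.group("sense")
    return match.group("base"), int(sense) if sense else None, match.group("rest")


def iter_entries(path: str) -> Iterator[Entry]:
    """Stream entries from the .xz file without decompressing it to disk."""
    with lzma.open(path, "rt", encoding="utf-8", newline="\n") as handle:
        for line in handle:
            yield parse_line(line)
```

Because the uncompressed file is several gigabytes, iterating with `iter_entries` keeps memory use flat; collecting all entries into a list is best avoided.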
Name
morfflex-cz.2016-03-10.utf8.conll09.tab.csv.xz
Size
281.88 MB
Format
application/x-xz
Description
Full (morphologically analyzed) wordlist for the Czech language, with form, lemma (without the sense suffix and without the semantic/syntactic suffixes), CoNLL-2009 Shared Task format major POS, and CoNLL-2009 Shared Task Word Features. Fields are tab-separated and always filled with a non-empty string, lines end with a linefeed only, and the encoding is UTF-8.
MD5
e97fa4bc0053f0dfa073a243dc756666
Preview
  File Preview
    • morfflex-cz.2016-03-10.utf8.conll09.tab.csv (8 GB)
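
For the CoNLL-2009 variant, a line could be parsed along the following lines. This is a sketch under two assumptions that the description above does not spell out: that each line holds exactly four fields in the listed order (form, lemma, major POS, word features), and that the word features use the usual CoNLL-2009 `Name=Value|Name=Value` notation with `_` for an empty feature set. `Conll09Entry`, `parse_feats`, and `parse_conll09_line` are hypothetical names.

```python
from typing import Dict, NamedTuple


class Conll09Entry(NamedTuple):
    form: str
    lemma: str             # lemma without sense or semantic/syntactic suffixes
    pos: str               # CoNLL-2009 major part of speech
    feats: Dict[str, str]  # CoNLL-2009 word features as a name -> value map


def parse_feats(feats: str) -> Dict[str, str]:
    """Parse 'Name=Value|Name=Value' (ASSUMED notation) into a dict."""
    if feats == "_":
        return {}
    result: Dict[str, str] = {}
    for item in feats.split("|"):
        name, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"feature without '=': {item!r}")
        result[name] = value
    return result


def parse_conll09_line(line: str) -> Conll09Entry:
    """Split one line into its four fields (ASSUMED count and order)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 4 or not all(fields):
        raise ValueError(f"expected 4 non-empty tab-separated fields: {line!r}")
    form, lemma, pos, feats = fields
    return Conll09Entry(form, lemma, pos, parse_feats(feats))
```

If the real file turns out to use a different column count or feature syntax, only the field check and `parse_feats` would need adjusting.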