
MorfFlex CZ 2.1 (2024-12-23)

To cite this item, use the following text or export it to a predefined format:
Hajič, Jan; Hlaváčová, Jaroslava; Mikulová, Marie; Straka, Milan and Štěpánková, Barbora, 2024, MorfFlex CZ 2.1 (2024-12-23), LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), http://hdl.handle.net/11234/1-5833.
Date
2024-12-23
Size
126906921 entries
Languages
Description
MorfFlex CZ 2.1 is a Czech morphological dictionary, originally developed by Jan Hajič as a spelling checker and lemmatization dictionary. It is part of the PDT-C 2.0 release (https://hdl.handle.net/11234/1-5813) and a minor upgrade of MorfFlex CZ 2.0: the tagset is unchanged, but some additions and corrections were made for full compatibility with the PDT-C 2.0 morphological annotation. MorfFlex is a flat list of lemma-tag-wordform triples. For each wordform, the full inflectional information is encoded in a positional tag. Wordforms are organized into entries (paradigm instances, or paradigms for short) according to their formal morphological behavior. Each paradigm (a set of wordforms) is identified by a unique lemma. Apart from traditional morphological categories, the description also contains some semantic, stylistic and derivational information. For more details, see the comprehensive specification of the Czech morphological annotation: https://ufal.mff.cuni.cz/techrep/tr64.pdf
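Since the dictionary is a flat list of triples and each paradigm is identified by its lemma, paradigms can be recovered by grouping the triples on the lemma field. The following is only a minimal sketch of that idea: the function name `group_paradigms`, the `wanted` parameter and the sample values are hypothetical and not part of MorfFlex. It also makes no assumption about whether the entries of one lemma are stored contiguously in the file.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Set, Tuple

# One dictionary line as a (lemma, tag, wordform) triple.
TripleT = Tuple[str, str, str]


def group_paradigms(
    triples: Iterable[TripleT],
    wanted: Optional[Set[str]] = None,
) -> Dict[str, List[Tuple[str, str]]]:
    """Group (lemma, tag, form) triples into paradigms keyed by lemma.

    If `wanted` is given, only those lemmas are kept. That keeps memory
    use small when the input is the full dictionary.
    """
    paradigms: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for lemma, tag, form in triples:
        if wanted is not None and lemma not in wanted:
            continue
        paradigms[lemma].append((tag, form))
    return dict(paradigms)


if __name__ == "__main__":
    # Placeholder values, not real MorfFlex entries.
    sample = [
        ("lemma-1", "TAG_A", "form1"),
        ("lemma-2", "TAG_A", "other"),
        ("lemma-1", "TAG_B", "form2"),
    ]
    for lemma, members in group_paradigms(sample).items():
        print(lemma, members)
```

With 126906921 entries in the full dictionary, grouping everything in memory is likely impractical, which is why the sketch allows restricting the result to a set of lemmas of interest.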
Sponsors
This item is Publicly Available
and is licensed under:
Files in this item
Name
czech-morfflex-2.1.tsv.xz
Size
238.88 MB
Format
application/x-xz
Description
Morphological dictionary of the Czech language, consisting of triples: lemma (including the sense suffix (-<number>) and semantic/syntactic suffixes and comments in PDT-C format), full positional tag in PDT-C format, and form. Fields are tab-separated and always contain a non-empty string, lines end with a linefeed only, and the encoding is UTF-8.
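The file format described above (three tab-separated, non-empty fields per line, linefeed line endings, UTF-8, xz compression) can be read as a stream without unpacking the roughly 6 GB TSV to disk. The sketch below uses Python's standard `lzma` module. The names `Triple`, `parse_triples` and `read_morfflex` are illustrative choices, not part of any MorfFlex tooling, and the field order lemma, tag, form is taken from the description.

```python
import lzma
from typing import Iterable, Iterator, NamedTuple


class Triple(NamedTuple):
    lemma: str  # includes sense suffix and any semantic/syntactic suffixes
    tag: str    # full positional tag in PDT-C format
    form: str   # the wordform


def parse_triples(lines: Iterable[str]) -> Iterator[Triple]:
    """Parse lines of 'lemma<TAB>tag<TAB>form' and validate each one."""
    for line_no, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split("\t")
        # The description promises exactly three non-empty fields per line.
        if len(fields) != 3 or not all(fields):
            raise ValueError(f"line {line_no}: expected 3 non-empty fields")
        yield Triple(*fields)


def read_morfflex(path: str) -> Iterator[Triple]:
    """Stream triples from the xz-compressed TSV file."""
    # newline="\n" disables newline translation, matching "linefeed only".
    with lzma.open(path, "rt", encoding="utf-8", newline="\n") as handle:
        yield from parse_triples(handle)


if __name__ == "__main__":
    # Assumes the file has been downloaded to the current directory.
    for index, triple in enumerate(read_morfflex("czech-morfflex-2.1.tsv.xz")):
        print(triple)
        if index >= 4:  # show only the first few entries
            break
```

Separating `parse_triples` from the file handling keeps the validation logic usable on any iterable of lines, for example an already decompressed file.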
MD5
76b4753ab291d53f05a7139596d0be72
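The MD5 value can be used to check that a download is intact. The sketch below computes the digest in chunks with the standard `hashlib` module. It assumes that the published checksum refers to the compressed file `czech-morfflex-2.1.tsv.xz` as downloaded, which the record does not state explicitly. The names `md5_of_file` and `EXPECTED_MD5` are illustrative.

```python
import hashlib

# Checksum as listed in the file record (assumed to cover the .xz file).
EXPECTED_MD5 = "76b4753ab291d53f05a7139596d0be72"


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops at the first empty read (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    actual = md5_of_file("czech-morfflex-2.1.tsv.xz")
    print("OK" if actual == EXPECTED_MD5 else f"MISMATCH: {actual}")
```

MD5 is suitable here only as an integrity check against transfer errors, not as protection against deliberate tampering.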
Preview
  File preview
    • czech-morfflex-2.1.tsv (6 GB)