Czech OOV Inflection Dataset
Please use the following text to cite this item:
Sourada, Tomáš, 2024, Czech OOV Inflection Dataset, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), http://hdl.handle.net/11234/1-5471.
Date issued
2024
Size
6270880 entries
Description
Czech OOV Inflection Dataset is a Czech noun inflection dataset focused on evaluation in out-of-vocabulary (OOV) conditions. It consists of two parts: a standard lemma-disjoint train-dev-test split of a subset of noun paradigms from the existing morphological dictionary Czech MorfFlex 2.0 (files train, dev and test-MorfFlex); and a small set of neologisms from Čeština 2.0, annotated with their inflected forms (file test-neologisms).
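A lemma-disjoint split means that no lemma occurs in more than one of the train, dev and test parts. The following is a minimal sketch of how that property could be checked. The function name `shared_lemmas` is hypothetical, and the sketch deliberately takes ready-made sets of lemmas, because the layout of the dataset files is not described here. The toy lemmas are invented for illustration and are not taken from the dataset.

```python
from itertools import combinations


def shared_lemmas(splits):
    """Return lemmas that occur in more than one split.

    `splits` maps a split name to an iterable of lemmas. The result maps
    each pair of split names to the set of lemmas they have in common;
    pairs with no overlap are left out, so an empty dict means the
    splits are lemma-disjoint.
    """
    overlaps = {}
    for (name_a, lemmas_a), (name_b, lemmas_b) in combinations(splits.items(), 2):
        common = set(lemmas_a) & set(lemmas_b)
        if common:
            overlaps[(name_a, name_b)] = common
    return overlaps


# Toy lemmas invented for illustration; NOT taken from the dataset.
toy_splits = {
    "train": {"pes", "kočka", "dům"},
    "dev": {"strom", "řeka"},
    "test": {"hrad", "most"},
}

# An empty result means no lemma is shared between any two splits.
assert shared_lemmas(toy_splits) == {}
```

To run such a check on the real files, the lemmas would first have to be read from each file according to its actual format.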
This item is Publicly Available
and licensed under:
Files in this item
- Name: CzechOOVInflectionDataset.tar.xz
- Size: 17.08 MB
- Format: application/x-xz
- Description: xz Archive
- MD5: f768e0166d0e81535e8afb2555d3eca3
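
The MD5 value listed for the archive can be used to check a downloaded copy before unpacking it. The sketch below uses only the Python standard library (`hashlib`, `tarfile`, `pathlib`). The function names `md5_of_file` and `verify_and_extract` are hypothetical helpers, and the archive is assumed to have already been downloaded to a local path.

```python
import hashlib
import tarfile
from pathlib import Path

# MD5 checksum as given in the file listing of this item.
EXPECTED_MD5 = "f768e0166d0e81535e8afb2555d3eca3"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_and_extract(archive_path, target_dir, expected_md5=EXPECTED_MD5):
    """Check the archive's MD5, unpack it, and return the top-level names."""
    actual = md5_of_file(archive_path)
    if actual != expected_md5:
        raise ValueError(
            f"MD5 mismatch for {archive_path}: expected {expected_md5}, got {actual}"
        )
    Path(target_dir).mkdir(parents=True, exist_ok=True)
    # Use the safer "data" extraction filter where this Python version has it.
    extra = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive_path, mode="r:xz") as archive:
        archive.extractall(target_dir, **extra)
    return sorted(entry.name for entry in Path(target_dir).iterdir())
```

For example, `verify_and_extract("CzechOOVInflectionDataset.tar.xz", "data")` would raise a `ValueError` if the download is corrupted, and otherwise unpack the archive into a `data` directory (a directory name chosen here only for illustration).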


