This is not the latest version of this item. The latest version can be found here.
Derinet 2.2
Please use the following text to cite this item or export to a predefined format:
Svoboda, Emil; Vidra, Jonáš; Ševčíková, Magda and Žabokrtský, Zdeněk, 2024,
Derinet 2.2, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11234/1-5538.
Date issued
2024-06-25
Size
1039842 words,
1039842 entries
Description
DeriNet is a lexical network which models derivational and compositional relations in the lexicon of Czech. Nodes of the network correspond to Czech lexemes, while edges represent word-formational relations between a derived word and its base word / words.
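The node-and-edge model described above can be sketched in a few lines of Python. This is a minimal illustration only, not the DeriNet file format or any official API: the `Lexeme` class, its field names, the `derivation_chain` helper, and the part-of-speech labels are all hypothetical, and the Czech example words are typed in by hand rather than read from the dataset.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Lexeme:
    """One node of the network (hypothetical structure, for illustration)."""
    lemma: str
    pos: str
    # Edges point from a formed word to its base word(s):
    # one base for a derivation, several bases for a compound.
    bases: List["Lexeme"] = field(default_factory=list)
    relation: Optional[str] = None


def derivation_chain(lexeme: Lexeme) -> List[str]:
    """Follow single-base edges from a word back towards its root."""
    chain = [lexeme.lemma]
    while len(lexeme.bases) == 1:
        lexeme = lexeme.bases[0]
        chain.append(lexeme.lemma)
    return chain


# Hand-written examples: učit (to teach) -> učitel (teacher) -> učitelka (female teacher)
ucit = Lexeme("učit", "VERB")
ucitel = Lexeme("učitel", "NOUN", [ucit], "derivation")
ucitelka = Lexeme("učitelka", "NOUN", [ucitel], "derivation")

# A compound has more than one base: černý (black) + bílý (white) -> černobílý
cerny = Lexeme("černý", "ADJ")
bily = Lexeme("bílý", "ADJ")
cernobily = Lexeme("černobílý", "ADJ", [cerny, bily], "compounding")

print(derivation_chain(ucitelka))
print([base.lemma for base in cernobily.bases])
```

Storing the bases on the formed word keeps each edge in one place and lets a compound carry several bases without a separate edge table.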
The present version, DeriNet 2.2, contains:
- 1,040,127 lexemes (sampled from the MorfFlex CZ 2.0 dictionary), connected by
- 782,904 derivational,
- 50,511 orthographic variant,
- 6,336 compounding,
- 288 univerbation, and
- 135 conversion relations.
Compared to the previous version, version 2.2 contains an overhaul of the compounding annotation scheme, 4,384 extra compounds, 83 more affixoid lexemes serving as bases for compounding, more parts of speech serving as bases for compounding (adverbs, pronouns, numerals), and several minor corrections of derivational relations.
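The relation counts listed in the description can be tallied to see how the edge types are distributed. The snippet below is a small sketch that uses only the figures quoted on this page; the dictionary and variable names are illustrative.

```python
# Relation counts as quoted in the description of DeriNet 2.2.
relation_counts = {
    "derivational": 782_904,
    "orthographic variant": 50_511,
    "compounding": 6_336,
    "univerbation": 288,
    "conversion": 135,
}

# Total number of listed relations across all five types.
total = sum(relation_counts.values())

# Percentage share of each relation type, rounded to two decimals.
shares = {
    name: round(100 * count / total, 2)
    for name, count in relation_counts.items()
}

print(f"total relations: {total}")
for name, share in shares.items():
    print(f"{name}: {share}%")
```

The listed relations sum to 840,174, and derivational relations make up the large majority of them.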
This item is Publicly Available
and licensed under:
Files in this item
- Name: derinet-2-2.tsv
- Size: 428.53 MB
- Format: application/octet-stream
- Description: DeriNet 2.2
- MD5: c094f4270fdef364e52cad9854bb3a03
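The MD5 checksum listed for derinet-2-2.tsv can be used to check that a download is complete and uncorrupted. The following is a minimal sketch using Python's standard `hashlib` module; the function names `md5_of_file` and `verify_download` are illustrative, and the local file path is an assumption about where the file was saved.

```python
import hashlib

# Checksum published on this page for derinet-2-2.tsv.
EXPECTED_MD5 = "c094f4270fdef364e52cad9854bb3a03"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Reading in chunks avoids loading the whole 428 MB file into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()
```

Calling `verify_download("derinet-2-2.tsv")` on the downloaded file should return `True` if the file matches the published checksum. MD5 is suitable here for detecting transfer errors, not as a defence against deliberate tampering.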


