DeriNet 2.1

Please use the following text to cite this item:
Vidra, Jonáš; et al., 2021, DeriNet 2.1, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), http://hdl.handle.net/11234/1-3765.
Date issued
2021-07-25
Size
1,039,012 entries, 1,034,354 words
Language(s)
Czech
Description
DeriNet is a lexical network that models derivational relations in the Czech lexicon. Nodes of the network correspond to Czech lexemes, and edges represent word-formational relations between a derived word and its base word or words. The present version, DeriNet 2.1, contains 1,039,012 lexemes (sampled from the MorfFlex CZ 2.0 dictionary) connected by 782,814 derivational, 50,533 orthographic variant, 1,952 compounding, 295 univerbation and 144 conversion relations. Compared to the previous version, version 2.1 adds annotations of orthographic variants, full automatically generated annotation of affix morpheme boundaries (in addition to the roots annotated in 2.0), 202 affixoid lexemes serving as bases for compounding, corpus frequencies of lexemes, verbal conjugation classes, and a pilot annotation of univerbation. The set of part-of-speech tags was converted to the Universal POS tags of the Universal Dependencies project.
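As a quick sanity check on the figures in the description, the relation counts can be tallied in a few lines of Python. This is only an illustrative sketch: the dictionary keys are labels copied from the description above, not identifiers taken from the DeriNet file format, and the numbers are typed in by hand rather than read from the data file.

```python
# Figures copied from the DeriNet 2.1 description; nothing is read from the data file.
LEXEME_COUNT = 1_039_012

# Keys are descriptive labels (an assumption), not names used inside the TSV file.
RELATION_COUNTS = {
    "derivational": 782_814,
    "orthographic variant": 50_533,
    "compounding": 1_952,
    "univerbation": 295,
    "conversion": 144,
}

# Total number of relations of all five types.
total_relations = sum(RELATION_COUNTS.values())
print(f"lexemes:         {LEXEME_COUNT:,}")
print(f"total relations: {total_relations:,}")

# Share of each relation type among all relations.
for name, count in RELATION_COUNTS.items():
    share = 100 * count / total_relations
    print(f"{name:>22}: {count:>9,} ({share:.2f}%)")
```

The five relation types add up to 835,738 relations, the large majority of them derivational.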
Version History

Date (newest first)
2025-01-29
2024-06-25
2021-07-25 (version 6*)
2019-05-30
2018-09-24
2017-09-26
2016-10-20
2015-07-31
* Selected version
This item is Publicly Available
and licensed under:
 Files in this item
Name
derinet-2-1.tsv
Size
426.04 MB
Format
application/octet-stream
Description
Unknown
MD5
b2f4629c135ffc629e1c08eaeddf43d0
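The MD5 value listed above can be used to check that a download of derinet-2-1.tsv is complete. The following is a minimal sketch using Python's standard `hashlib` module; the function names `md5_of_file` and `verify_download` and the local file path are illustrative assumptions, not part of any repository tooling.

```python
import hashlib

# MD5 checksum as listed for derinet-2-1.tsv on this page.
EXPECTED_MD5 = "b2f4629c135ffc629e1c08eaeddf43d0"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reading avoids loading the whole ~426 MB file into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()
```

Assuming the file was saved under its listed name in the current directory, `verify_download("derinet-2-1.tsv")` should return `True` for an intact copy.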