This is not the latest version of this item. The latest version can be found here.
DeriNet 2.1
Please use the following text to cite this item or export to a predefined format:
Vidra, Jonáš; et al., 2021,
DeriNet 2.1, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11234/1-3765.
Authors
Vidra, Jonáš; et al.
Item identifier
Project URL
Date issued
2021-07-25
Size
1039012 entries,
1034354 words
Language(s)
Description
DeriNet is a lexical network that models derivational relations in the lexicon of Czech. Nodes of the network correspond to Czech lexemes, while edges represent word-formational relations between a derived word and its base word or words. The present version, DeriNet 2.1, contains 1,039,012 lexemes (sampled from the MorfFlex CZ 2.0 dictionary) connected by 782,814 derivational, 50,533 orthographic variant, 1,952 compounding, 295 univerbation and 144 conversion relations.
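The network structure described above can be pictured with a small in-memory sketch. This is only an illustration of the node-and-edge idea, not the layout of the distributed file or the API of any DeriNet tool: the `Lexeme` class, the `link` and `root_of` helpers, and the three example lexemes are hypothetical and were chosen for illustration, not read from the data. The relation counts are the ones quoted in the description.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Lexeme:
    """A node of the network (hypothetical class, for illustration only)."""
    lemma: str
    pos: str                                  # Universal POS tag, e.g. "VERB"
    relation: Optional[str] = None            # type of the edge to the base(s)
    bases: List["Lexeme"] = field(default_factory=list)  # one base, or several for compounds


def link(derived: Lexeme, bases: List[Lexeme], relation: str) -> None:
    """Add an edge from a derived lexeme to its base word or words."""
    derived.relation = relation
    derived.bases = list(bases)


def root_of(lexeme: Lexeme) -> Lexeme:
    """Follow the first base repeatedly until an underived lexeme is reached."""
    while lexeme.bases:
        lexeme = lexeme.bases[0]
    return lexeme


# Illustrative lexemes (assumed examples, not taken from the file).
ucit = Lexeme("učit", "VERB")
ucitel = Lexeme("učitel", "NOUN")
ucitelka = Lexeme("učitelka", "NOUN")
link(ucitel, [ucit], "derivation")
link(ucitelka, [ucitel], "derivation")

# Relation counts as stated in the description of DeriNet 2.1.
RELATION_COUNTS = {
    "derivation": 782_814,
    "orthographic variant": 50_533,
    "compounding": 1_952,
    "univerbation": 295,
    "conversion": 144,
}
total_relations = sum(RELATION_COUNTS.values())

print(root_of(ucitelka).lemma)
print(total_relations)
```

Storing the bases as a list rather than a single parent leaves room for compounds, which have more than one base word.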
Compared to the previous version, version 2.1 contains annotations of orthographic variants, a complete, automatically generated annotation of affix morpheme boundaries (in addition to the roots annotated in 2.0), 202 affixoid lexemes serving as bases for compounding, annotation of corpus frequency of lexemes, annotation of verbal conjugation classes and a pilot annotation of univerbation. The set of part-of-speech tags was converted to Universal POS from the Universal Dependencies project.
Acknowledgement
Ministerstvo školství, mládeže a tělovýchovy České republiky
Project code: LM2015071
Project name: LINDAT/CLARIN: Institut pro analýzu, zpracování a distribuci lingvistických dat
Grantová agentura České Republiky
Project code: 19-14534S
Project name: Popis slovotvorné struktury českých slov na základě jazykových dat
Charles University Grant Agency
Project code: 1176219
Project name: Developing derivational networks for multiple languages
Charles University
Project code: START/HUM/010
Project name: A data-based approach to competition in word-formation: selected semantic categories across seven languages
Collections
Version History
You are currently viewing version 6 of the item.
This item is Publicly Available
and licensed under:
Files in this item
- Name: derinet-2-1.tsv
- Size: 426.04 MB
- Format: application/octet-stream
- Description: Unknown
- MD5: b2f4629c135ffc629e1c08eaeddf43d0
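The MD5 value listed above can be used to check a downloaded copy of the file. The following is a minimal sketch using only the Python standard library. It assumes the file was saved under its listed name in the current directory; the helper name `md5_of_file` and the chunk size are illustrative choices, not part of the repository.

```python
import hashlib
from pathlib import Path

# Checksum as listed for derinet-2-1.tsv on this page.
EXPECTED_MD5 = "b2f4629c135ffc629e1c08eaeddf43d0"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read 1 MiB at a time so the 426 MB file is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Assumed local path; adjust to wherever the download was saved.
    local_copy = Path("derinet-2-1.tsv")
    if local_copy.exists():
        matches = md5_of_file(local_copy) == EXPECTED_MD5
        print("checksum OK" if matches else "checksum mismatch")
    else:
        print(f"{local_copy} not found")
```

A mismatch usually points to an incomplete or corrupted download, in which case fetching the file again is the simplest remedy.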


