dc.contributor.author Vidra, Jonáš
dc.contributor.author Žabokrtský, Zdeněk
dc.contributor.author Kyjánek, Lukáš
dc.contributor.author Ševčíková, Magda
dc.contributor.author Dohnalová, Šárka
dc.contributor.author Svoboda, Emil
dc.contributor.author Bodnár, Jan
dc.date.accessioned 2021-08-23T15:22:21Z
dc.date.available 2021-08-23T15:22:21Z
dc.date.issued 2021-07-25
dc.identifier.uri http://hdl.handle.net/11234/1-3765
dc.description DeriNet is a lexical network that models derivational relations in the lexicon of Czech. Nodes of the network correspond to Czech lexemes, while edges represent word-formational relations between a derived word and its base word or words. The present version, DeriNet 2.1, contains 1,039,012 lexemes (sampled from the MorfFlex CZ 2.0 dictionary) connected by 782,814 derivational, 50,533 orthographic variant, 1,952 compounding, 295 univerbation and 144 conversion relations. Compared to the previous version, version 2.1 contains annotations of orthographic variants, a full, automatically generated annotation of affix morpheme boundaries (in addition to the roots annotated in 2.0), 202 affixoid lexemes serving as bases for compounding, annotation of the corpus frequency of lexemes, annotation of verbal conjugation classes and a pilot annotation of univerbation. The set of part-of-speech tags was converted to the Universal POS tags of the Universal Dependencies project.
dc.language.iso ces
dc.publisher Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
dc.relation.replaces http://hdl.handle.net/11234/1-2995
dc.rights Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0)
dc.rights.uri http://creativecommons.org/licenses/by-nc-sa/3.0/
dc.source.uri https://ufal.mff.cuni.cz/derinet
dc.subject DeriNet
dc.subject derivation
dc.subject derivational morphology
dc.subject lexical network
dc.subject MorfFlex
dc.title DeriNet 2.1
dc.type lexicalConceptualResource
metashare.ResourceInfo#ContentInfo.mediaType text
metashare.ResourceInfo#ContentInfo.detailedType wordnet
dc.rights.label PUB
has.files yes
branding LINDAT / CLARIAH-CZ
demo.uri https://quest.ms.mff.cuni.cz/derisearch2/v2/databases/
contact.person Jonáš Vidra vidra@ufal.mff.cuni.cz Charles University in Prague, ÚFAL
sponsor Ministry of Education, Youth and Sports of the Czech Republic LM2015071 LINDAT/CLARIN: Institute for Analysis, Processing and Distribution of Linguistic Data nationalFunds
sponsor Czech Science Foundation 19-14534S Description of the word-formation structure of Czech words based on language data nationalFunds
sponsor Charles University Grant Agency 1176219 Developing derivational networks for multiple languages nationalFunds
sponsor Charles University START/HUM/010 A data-based approach to competition in word-formation: selected semantic categories across seven languages nationalFunds
size.info 1039012 entries
size.info 1034354 words
files.size 446734613
files.count 1


Files in this item

License category:
Publicly Available

Licence: Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0)
Distributed under Creative Commons: Attribution Required, Noncommercial, Share Alike
Name: derinet-2-1.tsv
Size: 426.04 MB
Format: Unknown
Description: A tab-separated-values version of DeriNet 2.1, encoded as UTF-8, with Unix line endings. See the project homepage for documentation of the columns.
MD5: b2f4629c135ffc629e1c08eaeddf43d0
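
The MD5 checksum and the files.size value listed in this record can be used to check a downloaded copy for corruption. The following is a minimal sketch, not part of the record itself: the local path derinet-2-1.tsv and the helper names md5_of_file and verify_download are assumptions chosen for illustration, while the expected digest and byte count are copied from the fields above.

```python
import hashlib
import os
from pathlib import Path

# Values copied from this record (MD5 and files.size fields).
EXPECTED_MD5 = "b2f4629c135ffc629e1c08eaeddf43d0"
EXPECTED_SIZE = 446734613  # bytes


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so a large file never has to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5=EXPECTED_MD5, expected_size=None):
    """Return True if the file matches the expected MD5 (and size, if given)."""
    if expected_size is not None and os.path.getsize(path) != expected_size:
        return False
    return md5_of_file(path) == expected_md5.lower()


if __name__ == "__main__":
    # Hypothetical location of the downloaded file; adjust as needed.
    local_copy = Path("derinet-2-1.tsv")
    if not local_copy.exists():
        print(f"{local_copy} not found; download it first")
    elif verify_download(local_copy, expected_size=EXPECTED_SIZE):
        print("checksum and size match the record")
    else:
        print("mismatch: the file may be incomplete or corrupted")
```

The size check runs first because it is cheap and catches truncated downloads before the whole file is hashed. MD5 here only guards against accidental corruption; it is not a defence against deliberate tampering.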