
 
dc.contributor.author Bejček, Eduard
dc.date.accessioned 2018-01-23T09:31:44Z
dc.date.available 2018-01-23T09:31:44Z
dc.date.issued 2017
dc.identifier.uri http://hdl.handle.net/11234/1-2603
dc.description Lexicon of Czech verbal multiword expressions (VMWEs) used in the Parseme Shared Task 2017 (https://typo.uni-konstanz.de/parseme/index.php/2-general/142-parseme-shared-task-on-automatic-detection-of-verbal-mwes).

The lexicon consists of 4785 VMWEs, classified into four categories according to the Parseme Shared Task (PST) typology: IReflV (inherently reflexive verbs), LVC (light verb constructions), ID (idiomatic expressions) and OTH (other VMWEs with other than verbal syntactic head).

Verbal multiword expressions, as well as deverbative variants of VMWEs, were annotated during the preparation phase of the PST. These data were published as http://hdl.handle.net/11372/LRT-2282. The Czech part includes 14,536 VMWE occurrences:

  1611 ID
  10000 IReflV
  2923 LVC
  2 OTH

This lexicon was created from the Czech data. Each lexicon entry is represented by one line of the form:

  type lemmas frequency PoS [used form 1; used form 2; ... ]

(columns are separated by tabs), where:

  type ... the type of the VMWE in the PST typology
  lemmas ... space-separated lemmatized forms of all words that constitute the VMWE
  frequency ... the absolute frequency of this item in the PST data
  PoS ... a space-separated list of the parts of speech of the individual words (in the same order as in "lemmas")

The final field contains a list of all (1 to 18) used forms found in the data (since Czech is an inflectional language).
dc.language.iso ces
dc.publisher Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
dc.relation.isreferencedby http://ceur-ws.org/Vol-1779/02bejeck.pdf
dc.rights Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
dc.rights.uri http://creativecommons.org/licenses/by-nc-sa/4.0/
dc.subject lexicon
dc.subject verbs
dc.subject multiword expressions
dc.subject forms
dc.subject lemmatization
dc.title Czech Verbal MWEs
dc.type lexicalConceptualResource
metashare.ResourceInfo#ContentInfo.mediaType text
metashare.ResourceInfo#ContentInfo.detailedType lexicon
dc.rights.label PUB
has.files yes
branding LINDAT / CLARIAH-CZ
contact.person Eduard Bejček bejcek@ufal.mff.cuni.cz Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
sponsor Institute of Formal and Applied Linguistics LD14117 Parseme CZ nationalFunds
size.info 4785 items
files.size 320985
files.count 1
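
The entry format given in dc.description (five tab-separated columns, the last one a semicolon-separated list of used forms) could be read along the following lines. This is only a minimal sketch based on that description, not a tool shipped with the resource: the names VmweEntry, parse_entry and load_lexicon are hypothetical, the sample line is invented for illustration and is not taken from the lexicon, and it is an assumption that the square brackets around the last column appear literally in the file (the code accepts the field with or without them).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VmweEntry:
    """One lexicon line (hypothetical container, not part of the resource)."""
    vmwe_type: str      # PST category, e.g. ID, IReflV, LVC or OTH
    lemmas: List[str]   # lemmatized words of the VMWE
    frequency: int      # absolute frequency in the PST data
    pos: List[str]      # parts of speech, same order as lemmas
    forms: List[str]    # used forms found in the data


def parse_entry(line: str) -> VmweEntry:
    """Parse one tab-separated line: type, lemmas, frequency, PoS, forms."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 5:
        raise ValueError(f"expected 5 tab-separated columns, got {len(fields)}")
    vmwe_type, lemmas, frequency, pos, forms = fields

    # Assumption: the brackets may or may not be literal; strip them if present.
    forms = forms.strip()
    if forms.startswith("[") and forms.endswith("]"):
        forms = forms[1:-1]

    return VmweEntry(
        vmwe_type=vmwe_type,
        lemmas=lemmas.split(),
        frequency=int(frequency),
        pos=pos.split(),
        forms=[f.strip() for f in forms.split(";") if f.strip()],
    )


def load_lexicon(path: str) -> List[VmweEntry]:
    """Read every non-empty line of the TSV file into a list of entries."""
    with open(path, encoding="utf-8") as handle:
        return [parse_entry(line) for line in handle if line.strip()]


# Invented sample line for illustration only; the PoS labels are assumed.
sample = "IReflV\tbát se\t3\tVERB PRON\t[bál se; bojí se]"
entry = parse_entry(sample)
print(entry.vmwe_type, entry.lemmas, entry.frequency, entry.forms)
```

With the downloaded file, load_lexicon("lexicon_czech_VMWEs.tsv") would be the natural entry point; if the file turns out to carry a header row or a different bracket convention, parse_entry would need adjusting.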


 Files in this item

Name lexicon_czech_VMWEs.tsv
Size 313.46 KB
Format Unknown
Description lexicon
MD5 65c5fa7b9391f8e33fbb81c8f42c1d15
