Prague Czech-English Dependency Treebank 2.0 - Russian translation
Please use the following text to cite this item:
Novák, Michal; Nedoluzhko, Anna and Schwarz (Khoroshkina), Anna, 2016,
Prague Czech-English Dependency Treebank 2.0 - Russian translation, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11234/1-1791.
Authors
Novák, Michal; Nedoluzhko, Anna; Schwarz (Khoroshkina), Anna
Item identifier
http://hdl.handle.net/11234/1-1791
Date issued
2016-09-30
Size
1127 sentences
Description
Prague Czech-English Dependency Treebank - Russian translation (PCEDT-R) is a project that translates a subset of the Prague Czech-English Dependency Treebank 2.0 (PCEDT 2.0) into Russian and linguistically annotates the Russian translations, with emphasis on coreference and the cross-lingual alignment of coreferential expressions. The corpus is currently being developed primarily to support cross-lingual comparison of the means of expressing coreference.
The current version, 0.5, is preliminary and contains the following (+ marks new features):
* complete PCEDT 2.0 documents "wsj_1900"-"wsj_1949"
* Czech-English word alignment of coreferential expressions annotated manually mainly on the t-layer
+ Russian translations of the original English sentences
+ automatic tokenization, part-of-speech tagging and morphological analysis for Russian
+ automatic word alignment between all Czech and Russian words
+ manual alignment between Russian and the other two languages on possessive pronouns
Acknowledgement
Czech Science Foundation
Project code: GA 16-05394S
Project name: Structure of coreferential chains in parallel language data
The Charles University Grant Agency
Project code: GAUK 338915
Project name: Cross-lingual approaches to coreference resolution
Ministry of Education, Youth and Sports of the Czech Republic
Project code: LH14011
Project name: Multilingual corpus annotation
Ministry of Education, Youth and Sports of the Czech Republic
Project code: LM2015071
Project name: LINDAT/CLARIN: Institute for Analysis, Processing and Distribution of Linguistic Data
Charles University (outside GAUK)
Project code: SVV 260 333
Project name: Specific university research
Files in this item
- Name: pcedt-r.zip
- Size: 6.88 MB
- Format: application/zip
- Description: data
- MD5: 6022c87f5ecd29e457341438fae73166
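The MD5 value listed above can be used to check that a downloaded copy of the archive is intact. The following is a minimal sketch in Python using only the standard library; the function names `md5_of_file` and `verify_download` are hypothetical helpers introduced here for illustration, and the sketch assumes the archive has been saved locally under its listed name, pcedt-r.zip.

```python
import hashlib

# MD5 listed for pcedt-r.zip in the file details above.
EXPECTED_MD5 = "6022c87f5ecd29e457341438fae73166"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 matches `expected` (case-insensitive)."""
    return md5_of_file(path) == expected.strip().lower()
```

Calling `verify_download("pcedt-r.zip")` in the directory that holds the download should return `True` for an uncorrupted copy. MD5 is adequate here for detecting transfer errors, but it is not a safeguard against deliberate tampering.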


