Prague Czech-English Dependency Treebank 2.0 - Russian translation

Please use the following text to cite this item:
Novák, Michal; Nedoluzhko, Anna and Schwarz (Khoroshkina), Anna, 2016, Prague Czech-English Dependency Treebank 2.0 - Russian translation, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), http://hdl.handle.net/11234/1-1791.
Date issued
2016-09-30
Size
1127 sentences
Language(s)
Czech, English, Russian
Description
Prague Czech-English Dependency Treebank - Russian translation (PCEDT-R) is a project of translating a subset of Prague Czech-English Dependency Treebank 2.0 (PCEDT 2.0) to Russian and linguistically annotating the Russian translations with emphasis on coreference and cross-lingual alignment of coreferential expressions. Cross-lingual comparison of coreference means is currently the purpose that drives development of this corpus.

The current version 0.5 is a preliminary version, which contains (+ denotes new features):
* complete PCEDT 2.0 documents "wsj_1900"-"wsj_1949"
* Czech-English word alignment of coreferential expressions annotated manually mainly on the t-layer
+ Russian translations of the original English sentences
+ automatic tokenization, part-of-speech tagging and morphological analysis for Russian
+ automatic word alignment between all Czech and Russian words
+ manual alignment between Russian and the other two languages on possessive pronouns
This item is Restricted Use
and licensed under:
 Files in this item
Name
pcedt-r.zip
Size
6.88 MB
Format
application/zip
Description
Zip
MD5
6022c87f5ecd29e457341438fae73166
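The MD5 value above can be used to check that a download of pcedt-r.zip arrived intact. The following is a minimal sketch, not part of the distribution: the function names `md5_of_file` and `verify_download` are hypothetical, and the local path `pcedt-r.zip` is an assumption about where the archive was saved.

```python
import hashlib
from pathlib import Path

# MD5 value as listed on this page for pcedt-r.zip.
EXPECTED_MD5 = "6022c87f5ecd29e457341438fae73166"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Compare the file's MD5 digest with the expected value (case-insensitive)."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    archive = Path("pcedt-r.zip")  # assumed local download location
    if archive.exists():
        print("checksum OK" if verify_download(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found")
```

A mismatch usually indicates an incomplete or corrupted download, in which case fetching the archive again is the simplest remedy.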
Preview
  • pcedt-r
    • DOCUMENTATION (10 kB)
    • README (3 kB)
    • data
      • wsj_1921.treex.gz (31 kB)
      • wsj_1918.treex.gz (48 kB)
      • wsj_1926.treex.gz (103 kB)
      • wsj_1934.treex.gz (139 kB)
      • wsj_1942.treex.gz (27 kB)
      • wsj_1939.treex.gz (292 kB)
      • wsj_1947.treex.gz (107 kB)
      • wsj_1901.treex.gz (60 kB)
      • wsj_1906.treex.gz (56 kB)
      • wsj_1914.treex.gz (45 kB)
      • wsj_1922.treex.gz (144 kB)
      • wsj_1930.treex.gz (161 kB)
      • wsj_1919.treex.gz (67 kB)
      • wsj_1927.treex.gz (284 kB)
      • wsj_1935.treex.gz (286 kB)
      • wsj_1943.treex.gz (95 kB)
      • wsj_1948.treex.gz (114 kB)
      • wsj_1902.treex.gz (43 kB)
      • wsj_1910.treex.gz (47 kB)
      • wsj_1907.treex.gz (46 kB)
      • wsj_1915.treex.gz (926 kB)
      • wsj_1923.treex.gz (72 kB)
      • wsj_1931.treex.gz (184 kB)
      • wsj_1928.treex.gz (315 kB)
      • wsj_1936.treex.gz (388 kB)
      • wsj_1944.treex.gz (176 kB)
      • wsj_1949.treex.gz (106 kB)
      • wsj_1903.treex.gz (281 kB)
      • wsj_1911.treex.gz (58 kB)
      • wsj_1908.treex.gz (34 kB)
      • wsj_1916.treex.gz (257 kB)
      • wsj_1924.treex.gz (217 kB)
      • wsj_1932.treex.gz (333 kB)
      • wsj_1929.treex.gz (287 kB)
      • wsj_1940.treex.gz (40 kB)
      • wsj_1937.treex.gz (294 kB)
      • wsj_1945.treex.gz (15 kB)
      • wsj_1904.treex.gz (42 kB)
      • wsj_1912.treex.gz (65 kB)
      • wsj_1909.treex.gz (41 kB)
      • wsj_1920.treex.gz (75 kB)
      • wsj_1917.treex.gz (54 kB)
      • wsj_1925.treex.gz (27 kB)
      • wsj_1933.treex.gz (20 kB)
      • wsj_1941.treex.gz (21 kB)
      • wsj_1938.treex.gz (59 kB)
      • wsj_1946.treex.gz (274 kB)
      • wsj_1900.treex.gz (48 kB)
      • wsj_1905.treex.gz (49 kB)
      • wsj_1913.treex.gz (52 kB)
    • resources
      • treex_subschema_t_layer.xml (26 kB)
      • treex_subschema_bbn.xml (3 kB)
      • treex_subschema_w_layer.xml (1 kB)
      • treex_schema.xml (4 kB)
      • treex_subschema_langcodes.xml (10 kB)
      • treex_subschema_n_layer.xml (1 kB)
      • treex_subschema_interset.xml (3 kB)
      • treex_subschema_a_layer.xml (11 kB)
      • treex_subschema_p_layer.xml (5 kB)
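
For orientation, the documents listed above could be enumerated and opened directly from the archive without unpacking it. This is only a sketch under stated assumptions: it assumes each `.treex.gz` member is a gzip-compressed XML file (as the schema files under `resources` suggest), and the helper names `list_treex_documents` and `load_treex_root` are hypothetical. It uses only the Python standard library and does not interpret the annotation layers; the DOCUMENTATION and README files in the archive remain the authoritative description of the format.

```python
import gzip
import xml.etree.ElementTree as ET
import zipfile


def list_treex_documents(zip_path):
    """Return the sorted names of all .treex.gz members in the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(
            name for name in archive.namelist() if name.endswith(".treex.gz")
        )


def load_treex_root(zip_path, member):
    """Decompress one member (assumed to be gzipped XML) and return its root element."""
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(member) as compressed:
            with gzip.open(compressed, "rb") as xml_stream:
                return ET.parse(xml_stream).getroot()


if __name__ == "__main__":
    zip_path = "pcedt-r.zip"  # assumed local download location
    documents = list_treex_documents(zip_path)
    print(f"{len(documents)} documents found")
    if documents:
        root = load_treex_root(zip_path, documents[0])
        print(documents[0], "root element:", root.tag)
```

Parsing with `xml.etree.ElementTree` only exposes the raw XML tree; working with the tectogrammatical and alignment annotation is more convenient with tools that understand the Treex schema.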