Show simple item record

 
dc.contributor.author Poláková, Lucie
dc.contributor.author Zikánová, Šárka
dc.contributor.author Mírovský, Jiří
dc.contributor.author Hajičová, Eva
dc.date.accessioned 2023-07-03T09:26:36Z
dc.date.available 2023-07-03T09:26:36Z
dc.date.issued 2023-06-30
dc.identifier.uri http://hdl.handle.net/11234/1-5174
dc.description The Czech RST Discourse Treebank 1.0 (CzRST-DT 1.0) is a dataset of 54 Czech journalistic texts manually annotated using Rhetorical Structure Theory (RST). Each text document in the treebank is represented as a single tree-like structure whose nodes (discourse units) are interconnected through hierarchical rhetorical relations. The dataset also contains concurrent annotations of five double-annotated documents. The original texts are part of the data annotated in the Prague Dependency Treebank, although the two projects are independent.
dc.language.iso ces
dc.publisher Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
dc.rights Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
dc.rights.uri http://creativecommons.org/licenses/by-nc-sa/4.0/
dc.source.uri https://ufal.mff.cuni.cz/czrst-dt1.0
dc.subject discourse
dc.subject discourse annotation
dc.subject annotated corpus
dc.title Czech RST Discourse Treebank 1.0
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
dc.rights.label PUB
has.files yes
branding LINDAT / CLARIAH-CZ
contact.person Lucie Poláková polakova@ufal.mff.cuni.cz Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
sponsor The Grant Agency of the Czech Republic 20-09853S Global Coherence of Czech Texts in the Corpus-Based Perspective nationalFunds
size.info 54 articles
size.info 901 sentences
size.info 14514 tokens
files.size 212249
files.count 2


 Files in this item

 Download all files in item (207.27 KB)
Name: CzRST-DT_1.0.zip
Size: 203.38 KB
Format: application/zip
Description: CzRST-DT 1.0 distribution
MD5: 93b2a2beab1ff13f7dd652fa5de74bfb
 File Preview  
  • CzRST-DT_1.0
    • README.TXT (3 kB)
    • data
      • IAA
        • edited_to_one_tree
          • ANNOT2
            • ln94203_145_one_tree.rs3 (7 kB)
            • mf930713_055_one_tree.rs3 (6 kB)
            • ln94203_43_one_tree.rs3 (4 kB)
            • ln94202_135_one_tree.rs3 (4 kB)
            • cmpr9415_032_one_tree.rs3 (4 kB)
          • ANNOT1
            • ln94203_145_one_tree.rs3 (7 kB)
            • mf930713_055_one_tree.rs3 (6 kB)
            • ln94202_135_one_tree.rs3 (4 kB)
            • ln94203_43_one_tree.rs3 (4 kB)
            • cmpr9415_032_one_tree.rs3 (4 kB)
        • original_annotations
          • ANNOT2
            • mf930713_055.rs3 (6 kB)
            • ln94203_43.rs3 (4 kB)
            • cmpr9415_032.rs3 (4 kB)
            • ln94203_145.rs3 (7 kB)
            • ln94202_135.rs3 (4 kB)
          • ANNOT1
            • mf930713_055.rs3 (6 kB)
            • cmpr9415_032.rs3 (4 kB)
            • ln94203_43.rs3 (4 kB)
            • ln94203_145.rs3 (7 kB)
            • ln94202_135.rs3 (4 kB)
      • RS3
        • ln94207_39.rs3 (3 kB)
        • mf920925_021.rs3 (4 kB)
        • lnd94103_003.rs3 (2 kB)
        • lnd94103_063.rs3 (11 kB)
        • cmpr9413_017.rs3 (7 kB)
        • ln95048_056.rs3 (8 kB)
        • ln95049_086.rs3 (5 kB)
        • ln94202_49.rs3 (3 kB)
        • ln94200_8.rs3 (2 kB)
        • mf930713_099.rs3 (5 kB)
        • ln94207_83.rs3 (11 kB)
        • mf930713_055.rs3 (6 kB)
        • ln95047_134.rs3 (5 kB)
        • ln94200_112.rs3 (5 kB)
        • ln95048_140.rs3 (4 kB)
        • ln94203_145.rs3 (7 kB)
        • ln94202_135.rs3 (4 kB)
        • mf920922_138.rs3 (3 kB)
        • cmpr9415_032.rs3 (4 kB)
        • cmpr9410_047.rs3 (12 kB)
        • ln94200_84.rs3 (3 kB)
        • ln95048_055.rs3 (4 kB)
        • ln94207_54.rs3 (11 kB)
        • cmpr9413_004.rs3 (4 kB)
        • lnd94103_129.rs3 (3 kB)
        • ln94200_167.rs3 (3 kB)
        • mf920925_087.rs3 (4 kB)
        • mf930713_110.rs3 (11 kB)
        • lnd94103_013.rs3 (4 kB)
        • lnd94103_145.rs3 (6 kB)
        • lnd94103_033.rs3 (3 kB)
        • ln95048_058.rs3 (5 kB)
        • mf930709_087.rs3 (8 kB)
        • mf920922_105.rs3 (9 kB)
        • mf920925_018.rs3 (5 kB)
        • ln94203_100.rs3 (5 kB)
        • lnd94103_053.rs3 (3 kB)
        • ln95049_100.rs3 (8 kB)
        • ln94210_147.rs3 (7 kB)
        • mf920925_114.rs3 (4 kB)
        • mf930709_083.rs3 (3 kB)
        • ln95048_122.rs3 (3 kB)
        • ln95049_019.rs3 (3 kB)
        • mf920922_133.rs3 (3 kB)
        • mf930713_013.rs3 (4 kB)
        • ln94209_45.rs3 (9 kB)
        • mf930709_058.rs3 (4 kB)
        • ln94207_16.rs3 (8 kB)
        • ln94203_43.rs3 (4 kB)
        • ln94200_170.rs3 (11 kB)
        • cmpr9413_026.rs3 (3 kB)
        • ln94208_143.rs3 (3 kB)
        • cmpr9413_034.rs3 (8 kB)
        • ln94206_47.rs3 (7 kB)
      • TXT
        • lnd94103_013.txt (1 kB)
        • lnd94103_145.txt (1 kB)
        • lnd94103_033.txt (733 B)
        • mf930709_087.txt (2 kB)
        • ln95048_058.txt (1 kB)
        • mf920922_105.txt (2 kB)
        • mf920925_018.txt (1 kB)
        • ln94203_100.txt (1 kB)
        • lnd94103_053.txt (1008 B)
        • ln95049_100.txt (1 kB)
        • ln94210_147.txt (2 kB)
        • mf930709_083.txt (944 B)
        • mf920925_114.txt (997 B)
        • ln95048_122.txt (622 B)
        • ln95049_019.txt (821 B)
        • mf930713_013.txt (1 kB)
        • mf920922_133.txt (595 B)
        • ln94209_45.txt (3 kB)
        • mf930709_058.txt (883 B)
        • ln94207_16.txt (2 kB)
        • ln94203_43.txt (1 kB)
        • ln94200_170.txt (3 kB)
        • cmpr9413_026.txt (709 B)
        • ln94208_143.txt (1 kB)
        • cmpr9413_034.txt (2 kB)
        • ln94206_47.txt (1 kB)
        • ln94207_39.txt (940 B)
        • lnd94103_003.txt (684 B)
        • mf920925_021.txt (1 kB)
        • lnd94103_063.txt (4 kB)
        • cmpr9413_017.txt (2 kB)
        • ln95049_086.txt (1 kB)
        • ln94202_49.txt (909 B)
        • ln95048_056.txt (2 kB)
        • ln94200_8.txt (735 B)
        • mf930713_099.txt (967 B)
        • ln94207_83.txt (3 kB)
        • mf930713_055.txt (2 kB)
        • ln95047_134.txt (1 kB)
        • ln94200_112.txt (1 kB)
        • ln95048_140.txt (1 kB)
        • ln94203_145.txt (2 kB)
        • ln94202_135.txt (1 kB)
        • mf920922_138.txt (801 B)
        • cmpr9415_032.txt (1 kB)
        • cmpr9410_047.txt (4 kB)
        • ln94200_84.txt (1 kB)
        • ln94207_54.txt (3 kB)
        • ln95048_055.txt (960 B)
        • cmpr9413_004.txt (1 kB)
        • lnd94103_129.txt (568 B)
        • mf920925_087.txt (1 kB)
        • ln94200_167.txt (1 kB)
        • mf930713_110.txt (3 kB)
Name: README.TXT
Size: 3.9 KB
Format: Text file
Description: CzRST-DT 1.0 README
MD5: 04ec03e6206fb2ba96141b2c1967eabc
 File Preview  
===============================================
Czech RST Discourse Treebank 1.0 (CzRST-DT 1.0)
===============================================


Authors
=======
Lucie Poláková (Charles University, Faculty of Mathematics and Physics),
Šárka Zikánová (Charles University, Faculty of Mathematics and Physics),
Jiří Mírovský (Charles University, Faculty of Mathematics and Physics),
Eva Hajičová (Charles University, Faculty of Mathematics and Physics)


Introduction
============

The Czech RST Discourse Treebank 1.0 (CzRST-DT 1.0, Poláková et al., 2023)
is a dataset of 54 Czech journalistic texts manually annotated using
the Rhetorical Structure Theory (RST; Mann and Thompson, 1988).
Each text document in the treebank is represented as a single tree-like
structure whose nodes (discourse units) are interconnected through
hierarchical rhetorical relations.
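
The tree structure described above can be illustrated with a minimal Python
sketch. It assumes the .rs3 files follow the layout commonly written by
RSTTool-style annotation tools: a <body> element holding <segment> and <group>
elements, each carrying "id", "parent" and "relname" attributes. That layout
is an assumption and is not confirmed by this README. The sample string is
hand-written for illustration and is not taken from the treebank; the function
names are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hand-written sample in the assumed rs3 layout (NOT treebank data).
SAMPLE_RS3 = """<rst>
  <header>
    <relations>
      <rel name="elaboration" type="rst"/>
    </relations>
  </header>
  <body>
    <segment id="1" parent="3" relname="span">First unit.</segment>
    <segment id="2" parent="1" relname="elaboration">Second unit.</segment>
    <group id="3" type="span"/>
  </body>
</rst>"""


def read_nodes(rs3_source):
    """Map each node id to its parent id, relation name and text (segments only)."""
    root = ET.fromstring(rs3_source)
    nodes = {}
    for elem in root.find("body"):
        if elem.tag not in ("segment", "group"):
            continue
        nodes[elem.get("id")] = {
            "parent": elem.get("parent"),    # None when the node has no parent
            "relname": elem.get("relname"),  # relation linking the node to its parent
            "text": (elem.text or "").strip() if elem.tag == "segment" else None,
        }
    return nodes


def children_index(nodes):
    """Invert the parent links: parent id -> list of child ids."""
    children = defaultdict(list)
    for node_id, info in nodes.items():
        if info["parent"] is not None:
            children[info["parent"]].append(node_id)
    return children


def find_roots(nodes):
    """Nodes without a parent; a single-tree document should yield exactly one."""
    return [node_id for node_id, info in nodes.items() if info["parent"] is None]


nodes = read_nodes(SAMPLE_RS3)
children = children_index(nodes)
roots = find_roots(nodes)
```

To try the sketch on a treebank document, pass the contents of a file from
data/RS3/ to read_nodes in place of SAMPLE_RS3. A document annotated as a
single tree should then produce exactly one root.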

The dataset also contains concurrent annotations of five double-annotated
documents.

The original texts are a part of the data annotated . . .
                                            
