Prague Dependency Treebank 3.0 (PDT 3.0)
Overview
The Prague Dependency Treebank 3.0 (PDT 3.0) annotates the same texts as PDT 2.0 (Hajič et al. 2006), PDT 2.5 (Bejček et al. 2011), and the Prague Discourse Treebank 1.0 (PDiT 1.0, Poláková et al. 2012).
The annotation on all four layers has been corrected and improved in various respects. In addition, new information was added to the data with each release:
- from PDT 2.0 to PDT 2.5
  - Multiword expressions
  - Pair/group meaning
  - Clause segmentation
- from PDT 2.5 to PDiT 1.0
  - Extended textual coreference
  - Bridging anaphora
  - Discourse relations marked by explicit connectives
- from PDiT 1.0 to PDT 3.0
  - Revision of several grammatemes
  - Revision of sentence modality annotation
  - Replacement of t_lemma #Benef
  - Genres of documents
  - Pronominal textual coreference of 1st and 2nd person
  - Updated discourse relations marked by explicit connectives
All of the additional annotation except clause segmentation was performed on the tectogrammatical trees and is technically part of the underlying syntax layer of the PDT. Clause segmentation was annotated on the analytical layer. Numerous errors were fixed across all layers of annotation.
Bejček, Eduard, Panevová, Jarmila, Popelka, Jan, Smejkalová, Lenka, Straňák, Pavel, Ševčíková, Magda, Štěpánek, Jan, Toman, Josef, Žabokrtský, Zdeněk, Hajič, Jan. 2011. Prague Dependency Treebank 2.5. Data/software, Charles University in Prague, MFF, ÚFAL, Praha, Czechia, Dec 2011 (http://ufal.mff.cuni.cz/pdt2.5/)
Hajič, Jan, Panevová, Jarmila, Hajičová, Eva, Sgall, Petr, Pajas, Petr, Štěpánek, Jan, Havelka, Jiří, Mikulová, Marie, Žabokrtský, Zdeněk, Ševčíková-Razímová, Magda. 2006. Prague Dependency Treebank 2.0. Software prototype, Linguistic Data Consortium, Philadelphia, PA, USA, ISBN 1-58563-370-4, www.ldc.upenn.edu, Jul 2006 (http://ufal.mff.cuni.cz/pdt2.0/)
Poláková, Lucie, Jínová, Pavlína, Zikánová, Šárka, Hajičová, Eva, Mírovský, Jiří, Nedoluzhko, Anna, Rysová, Magdaléna, Pavlíková, Veronika, Zdeňková, Jana, Pergler, Jiří, Ocelák, Radek. 2012. Prague Discourse Treebank 1.0. Data/software, Charles University in Prague, MFF, ÚFAL, Praha, Czechia, Nov 2012 (http://ufal.mff.cuni.cz/discourse/)