Human Label Variation in Attribution and Discourse (Hlava AD)
Please use the following text to cite this item:
Zikánová, Šárka; et al., 2024,
Human Label Variation in Attribution and Discourse (Hlava AD), LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11234/1-5819.
Authors
Zikánová, Šárka; et al.
Date issued
2024-12-13
Size
512 entries
Description
Human Label Variation in Attribution and Discourse (Hlava AD) is a collection of commented multiple annotations (5 annotators) of inter-sentential explicit discourse relations between complex sentences containing verbs of attribution (saying, thinking) and following sentences in Czech. The main aim of the annotation is to capture how often the following sentence is seen as a follow-up of the direct/reported speech OR the author's speech. The dataset contains fillers (complex sentences with other types of verbs).
Please visit https://ufal.mff.cuni.cz/hvar/hlava-ad for detailed and updated information about the corpus.
Acknowledgement
Grantová agentura České republiky
Project code: 24-11132S
Project name: Disagreement in corpus annotation and variation in human understanding of text
This item is publicly available.
Files in this item
- Name: README.TXT
- Size: 9.59 KB
- Format: text/plain
- Description: Text
- MD5: 256417db1cbd9d8070d711916abe10a5

=============================================================
Human Label Variation in Attribution and Discourse (Hlava AD)
=============================================================

Authors
=======

Šárka Zikánová (Charles University, Faculty of Mathematics and Physics),
Jiří Mírovský (Charles University, Faculty of Mathematics and Physics),
Anna Nedoluzhko (Charles University, Faculty of Mathematics and Physics),
Eva Hajičová (Charles University, Faculty of Mathematics and Physics),
Šárka Dohnalová (Charles University, Faculty of Arts),
Anna Kmječová,
Eliška Nodlová (Charles University, Faculty of Arts),
Dominik Teska (Charles University, Faculty of Arts)

Introduction
============

Human Label Variation in Attribution and Discourse (Hlava AD) is a collection of commented multiple annotations (5 annotators) of inter-sentential explicit discourse relations between complex sentences containing verbs of attribution (saying, thinking) and following sentences in Czech. The main aim of the annotation is to capture how often the following sentence is seen as a follow-up of the direct/reported speech OR the author's speech. The dataset contains fillers (complex sentences with other types of verbs).

Please visit https://ufal.mff.cuni.cz/hvar/hlava-ad for detailed and updated information about the corpus.

Data Description
================

Hlava AD comprises 512 sentence pairs (221 attributions, 291 fillers), each accompanied by the preceding context. Each pair is annotated independently by five annotators in parallel. The annotators' decisions are documented in the "Commentary" column. Additionally, each annotator indicates whether they required the previous context to make their annotation.

The texts come from the Prague Dependency Treebank - Consolidated 1.0 (Hajič et al., 2020).

Data Source and Format
======================

After downloading the corpus from http://hdl.handle.net/11234/1-5819, the annotations can be found in directory data. The annotation . . .
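As an illustration of what five parallel annotations per sentence pair make possible, the following minimal Python sketch computes the share of annotators choosing each label for one pair. It is not part of the corpus distribution: the label names ("reported_speech", "author_speech") and the example judgements are invented for illustration, and the sketch does not assume anything about the actual file layout or label inventory of Hlava AD.

```python
from collections import Counter


def label_distribution(labels):
    """Return the share of annotators choosing each label for one sentence pair."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


def majority_label(labels):
    """Return (label, share) for the most frequent label.

    With tied counts, the label encountered first in the input wins.
    """
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)


# Hypothetical judgements of five annotators for a single sentence pair;
# these label names are assumptions, not the corpus's real labels.
example = [
    "reported_speech",
    "reported_speech",
    "author_speech",
    "reported_speech",
    "author_speech",
]

distribution = label_distribution(example)
top_label, top_share = majority_label(example)
```

Keeping the full distribution rather than only the majority label preserves the disagreement among annotators, which is the phenomenon the corpus is designed to document.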


