Human Label Variation in Coreference (Hlava Cor)
Please use the following text to cite this item:
Nedoluzhko, Anna; et al., 2026,
Human Label Variation in Coreference (Hlava Cor), LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11234/1-6131.
Authors
Nedoluzhko, Anna ; et al.
Date issued
2026-03-31
Size
1024 entries
Description
Human Label Variation in Coreference (Hlava Cor) is a collection of coreference annotations in Czech in which each case was annotated by three annotators, who also commented on their decisions. Coreference here means the relation between expressions that refer to the same extra-linguistic entity, concept, or situation. Given an anaphoric expression, annotators were instructed to identify a coreferential expression in the preceding context (if one exists) and to comment on their decision. The main aim of the annotation is to capture variation in how readers interpret coreference. The dataset includes both written and spoken contexts. For detailed and up-to-date information about the corpus, please visit: https://ufal.mff.cuni.cz/hvar/hlava-cor
Acknowledgement
Grantová agentura České republiky
Project code: 24-11132S
Project name: Disagreement in corpus annotation and variation in human understanding of text
This item is Publicly Available
and licensed under:
Files in this item
- Name: README.TXT
- Size: 7.97 KB
- Format: text/plain
- Description: Hlava Cor README
- MD5: b4fc2596ec4a8c4d98442c1a5eda9f72

================================================
Human Label Variation in Coreference (Hlava Cor)
================================================

Authors
=======
Anna Nedoluzhko (Charles University, Faculty of Mathematics and Physics)
Jiří Mírovský (Charles University, Faculty of Mathematics and Physics)
Šárka Zikánová (Charles University, Faculty of Mathematics and Physics)
Eva Hajičová (Charles University, Faculty of Mathematics and Physics)
Bianca Chuffartová
Šárka Dohnalová (Charles University, Faculty of Arts)
Lucie Hartmanová (Charles University, Faculty of Arts)
Eliška Nodlová (Charles University, Faculty of Arts)
Dominik Teska (Charles University, Faculty of Arts)
Františka Zikánová

Introduction
============
Human Label Variation in Coreference (Hlava Cor) is a collection of commented multiple annotations (three annotators) of coreferential relations in Czech, i.e. the annotation of expressions that refer to the same extra-linguistic entity, concept, or situation. Given an anaphoric expression, annotators were instructed to identify a coreferential expression in the preceding context (if one exists) and to comment on their decision. The main aim of the annotation is to capture variation in the interpretation of coreference among readers. The dataset includes both written and spoken contexts.

For detailed and up-to-date information about the corpus, please visit: https://ufal.mff.cuni.cz/hvar/hlava-cor

Data Description
================
Hlava Cor comprises 1,024 cases (a sentence + adjacent and distant contexts), each annotated independently by three annotators in parallel. The annotators' decisions are documented in the "Commentary" columns.

The texts come from three sources:
- the Prague Dependency Treebank - Consolidated 1.0 (Hajič et al., 2020) (this is the main resource)
- Czech Academic Corpus 2.0 (Hladká et al., 2008)
- iRozhlas archive (https://www.irozhlas.cz/zpravy-archiv)

Data Source and Format
======================
After d . . .


