
Human Label Variation in Coreference (Hlava Cor)

Please use the following text to cite this item:
Nedoluzhko, Anna; et al., 2026, Human Label Variation in Coreference (Hlava Cor), LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), http://hdl.handle.net/11234/1-6131.
Date issued
2026-03-31
Size
1024 entries
Language(s)
Czech
Description
Human Label Variation in Coreference (Hlava Cor) is a collection of commented multiple annotations (three annotators) of coreferential relations in Czech, i.e. the annotation of expressions that refer to the same extra-linguistic entity, concept, or situation. Given an anaphoric expression, annotators were instructed to identify a coreferential expression in the preceding context (if one exists) and to comment on their decision. The main aim of the annotation is to capture variation in the interpretation of coreference among readers. The dataset includes both written and spoken contexts. For detailed and up-to-date information about the corpus, please visit: https://ufal.mff.cuni.cz/hvar/hlava-cor
 Files in this item
Name
Hlava_Cor.zip
Size
3.48 MB
Format
application/zip
Description
Hlava Cor distribution
MD5
ac6b0085bc2db9310e12ade4b78ad7fd
Preview
  • Hlava_Cor
    • README.TXT (7 kB)
    • data
      • Hlava_Cor.odb (1 MB)
      • Hlava_Cor.tsv (1 MB)
      • Hlava_Cor.xlsx (790 kB)
      • Hlava_Cor.ods (853 kB)
    • doc
      • annotation_guidelines.pdf (172 kB)
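
The MD5 value listed for Hlava_Cor.zip can be used to check that a download arrived intact. The following is a minimal sketch, not part of the distribution: the function names `md5_of_file` and `verify_download` are made up for illustration, and the local file path is an assumption.

```python
import hashlib

# Checksum listed on the item page for Hlava_Cor.zip.
EXPECTED_MD5 = "ac6b0085bc2db9310e12ade4b78ad7fd"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so the archive need not fit in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches `expected`."""
    return md5_of_file(path) == expected.lower()
```

Assuming the archive was saved in the current directory, `verify_download("Hlava_Cor.zip")` should return True for an uncorrupted copy. The same check works for README.TXT if its own listed checksum is passed as `expected`.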
Name
README.TXT
Size
7.97 KB
Format
text/plain
Description
Hlava Cor README
MD5
b4fc2596ec4a8c4d98442c1a5eda9f72
Preview
    ================================================
    Human Label Variation in Coreference (Hlava Cor)
    ================================================
    
    
    Authors
    =======
    
    Anna Nedoluzhko (Charles University, Faculty of Mathematics and Physics)
    Jiří Mírovský (Charles University, Faculty of Mathematics and Physics)
    Šárka Zikánová (Charles University, Faculty of Mathematics and Physics)
    Eva Hajičová (Charles University, Faculty of Mathematics and Physics)
    Bianca Chuffartová 
    Šárka Dohnalová (Charles University, Faculty of Arts)
    Lucie Hartmanová (Charles University, Faculty of Arts)  
    Eliška Nodlová (Charles University, Faculty of Arts)
    Dominik Teska (Charles University, Faculty of Arts)
    Františka Zikánová
    
    
    Introduction
    ============
    
    Human Label Variation in Coreference (Hlava Cor) is a collection
    of commented multiple annotations (three annotators) of coreferential
    relations in Czech, i.e. the annotation of expressions that refer
    to the same extra-linguistic entity, concept, or situation.
    Given an anaphoric expression, annotators were instructed to identify
    a coreferential expression in the preceding context (if one exists)
    and to comment on their decision. The main aim of the annotation
    is to capture variation in the interpretation of coreference among
    readers. The dataset includes both written and spoken contexts.
    For detailed and up-to-date information about the corpus,
    please visit: https://ufal.mff.cuni.cz/hvar/hlava-cor
    
    
    Data Description
    ================
    
    Hlava Cor comprises 1,024 cases. Each case consists of a sentence
    together with its adjacent and distant contexts and was annotated
    independently by three annotators. The annotators' comments on
    their decisions are recorded in the "Commentary" columns.
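
A minimal sketch of how the tab-separated version of the data might be read and its "Commentary" columns located. It assumes that the file has a header row and that the commentary columns contain the word "Commentary" in their names. The exact header names are not given here, so the sample header below is made up, as are the helper names `read_cases` and `commentary_columns`.

```python
import csv
import io


def read_cases(tsv_file):
    """Read a tab-separated file object; return (header names, list of row dicts)."""
    reader = csv.DictReader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = list(reader)
    return list(reader.fieldnames or []), rows


def commentary_columns(fieldnames):
    """Return the column names that contain 'commentary' (case-insensitive)."""
    return [name for name in fieldnames if "commentary" in name.lower()]


# Made-up sample standing in for the real file; actual column names may differ.
sample = io.StringIO(
    "Sentence\tCommentary A\tCommentary B\n"
    "example sentence\tfirst note\tsecond note\n"
)
fields, cases = read_cases(sample)
print(commentary_columns(fields))  # ['Commentary A', 'Commentary B']
print(len(cases))                  # 1
```

For the distributed file, the in-memory sample could be replaced by `open("Hlava_Cor/data/Hlava_Cor.tsv", encoding="utf-8", newline="")`, where the UTF-8 encoding is an assumption. If the assumptions hold, `len(cases)` should equal the 1,024 cases reported above.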
    
    The texts come from three sources:
    - the Prague Dependency Treebank - Consolidated 1.0
      (Hajič et al., 2020) (this is the main resource)
    - Czech Academic Corpus 2.0 (Hladká et al., 2008)
    - iRozhlas archive (https://www.irozhlas.cz/zpravy-archiv)
    
    
    Data Source and Format
    ======================
    
    After d . . .