Czech Named Entity Corpus 2.0

Please use the following text to cite this item:
Ševčíková, Magda; Žabokrtský, Zdeněk; Straková, Jana and Straka, Milan, 2014, Czech Named Entity Corpus 2.0, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), http://hdl.handle.net/11858/00-097C-0000-0023-1B22-8.
Date issued
2014-01-09
Size
8993 sentences
Language(s)
Czech
Description
Czech Named Entity Corpus 2.0 is a corpus of 8993 Czech sentences containing 35220 manually annotated named entities, classified according to a two-level hierarchy of 46 named entity types.
This item is Publicly Available
and licensed under:
 Files in this item
Name
Czech_Named_Entity_Corpus_2.0.zip
Size
13.29 MB
Format
application/zip
Description
Zip
MD5
e4962225af8aea82bdcb8ac9bdad6c3b
Preview
  File Preview
  • cnec2.0
    • LICENSE (21 kB)
    • README (3 kB)
    • data
      • xml
        • named_ent_train.xml (1 MB)
        • named_ent_etest.xml (190 kB)
        • named_ent_dtest.xml (188 kB)
        • named_ent.xml (1 MB)
      • html
        • named_ent_train.html (1 MB)
        • named_ent.html (2 MB)
        • named_ent_dtest.html (247 kB)
        • named_ent_etest.html (249 kB)
      • plain
        • named_ent_train.txt (1 MB)
        • named_ent_etest.txt (137 kB)
        • named_ent_dtest.txt (137 kB)
        • named_ent.txt (1 MB)
      • treex
        • named_ent.treex (56 MB)
        • named_ent_train.treex (44 MB)
        • named_ent_dtest.treex (5 MB)
        • named_ent_etest.treex (5 MB)
    • tools
      • statistics.pl (509 B)
      • Treex
      • compare_ne_outputs_v3.pl (14 kB)
      • namedent_annotations_to_html.pl (3 kB)
      • namedent_annotations_to_xml_simple.pl (559 B)
    • doc
      • techrep-ne-2007.pdf (600 kB)
      • doc.pdf (162 kB)
      • statistics.txt (746 B)
      • ne-type-hierarchy.pdf (53 kB)
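
The MD5 value listed above for Czech_Named_Entity_Corpus_2.0.zip can be used to check that a download is intact. The following is a minimal sketch in Python using only the standard library. The helper names (`md5_of_file`, `matches_published_md5`) and the assumed download location in the current directory are illustrative and not part of the corpus distribution.

```python
import hashlib
from pathlib import Path

# MD5 published on this page for Czech_Named_Entity_Corpus_2.0.zip
EXPECTED_MD5 = "e4962225af8aea82bdcb8ac9bdad6c3b"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_md5(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 equals the expected checksum."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumption: the archive was saved under its original name
    # in the current working directory.
    archive = Path("Czech_Named_Entity_Corpus_2.0.zip")
    if archive.exists():
        ok = matches_published_md5(archive)
        print("checksum OK" if ok else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

Reading the file in chunks keeps memory use low, which also matters for the larger files inside the archive, such as the 56 MB named_ent.treex.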