Czech Named Entity Corpus 1.1
Please use the following text to cite this item:
Ševčíková, Magda; Žabokrtský, Zdeněk; Straková, Jana and Straka, Milan, 2014,
Czech Named Entity Corpus 1.1, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11858/00-097C-0000-0023-1B04-C.
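Authors
Ševčíková, Magda; Žabokrtský, Zdeněk; Straková, Jana; Straka, Milan
Item identifier
http://hdl.handle.net/11858/00-097C-0000-0023-1B04-C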
Date issued
2014-01-09
Size
5868 sentences
Language(s)
Czech
Description
Czech Named Entity Corpus 1.1 fixes several issues in Czech Named Entity Corpus 1.0: misannotated entities are corrected, all formats now contain the same data, the tmt format is replaced with the treex format, and every format is split into training, development, and testing portions.
Acknowledgement
Charles University in Prague (outside GAUK)
Project code: SVV 267 314
Project name: Theoretical Foundations of Informatics and Computational Linguistics
Ministry of Education, Youth and Sports of the Czech Republic
Project code: LM2010013
Project name: LINDAT/CLARIN: Institute for Analysis, Processing and Distribution of Linguistic Data
Grant Agency of the Czech Republic
Project code: GPP406/12/P175
Project name: Selected Derivational Relations for Automatic Processing of Czech
Charles University in Prague (outside GAUK)
Project code: PRVOUK
Project name: PRVOUK
This item is Publicly Available.
Files in this item
- Name: Czech_Named_Entity_Corpus_1.1.zip
- Size: 10.48 MB
- Format: application/zip
- Description: Zip
- MD5: 9457d49807c494a23a5f029f88fa09a6
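
The MD5 value listed above can be used to check a downloaded copy of the archive. The following is a minimal sketch, assuming the zip has been saved locally; the function names `md5_of_file` and `verify_archive` and the local path in the usage note are illustrative and not part of the corpus distribution.

```python
import hashlib

# MD5 value listed for Czech_Named_Entity_Corpus_1.1.zip in this record.
EXPECTED_MD5 = "9457d49807c494a23a5f029f88fa09a6"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected MD5 (case-insensitive)."""
    return md5_of_file(path) == expected.lower()
```

For example, `verify_archive("Czech_Named_Entity_Corpus_1.1.zip")` should return `True` for an intact download stored under that (assumed) file name in the current directory.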

- cnec1.1
  - LICENSE (21 kB)
  - README (3 kB)
  - data
    - xml
      - named_ent_train.xml (1 MB)
      - named_ent_etest.xml (156 kB)
      - named_ent_dtest.xml (153 kB)
      - named_ent.xml (1 MB)
    - html
      - named_ent_train.html (1 MB)
      - named_ent.html (1 MB)
      - named_ent_dtest.html (207 kB)
      - named_ent_etest.html (212 kB)
    - plain
      - named_ent_train.txt (835 kB)
      - named_ent_etest.txt (106 kB)
      - named_ent_dtest.txt (105 kB)
      - named_ent.txt (1 MB)
    - treex
      - named_ent.treex (43 MB)
      - named_ent_train.treex (34 MB)
      - named_ent_dtest.treex (4 MB)
      - named_ent_etest.treex (4 MB)
  - xml
  - tools
  - doc
    - techrep-ne-2007.pdf (600 kB)
    - doc.pdf (151 kB)
    - statistics.txt (923 B)
    - ne-type-hierarchy.pdf (54 kB)
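
Every format in the listing follows the same naming pattern for the full data and its three portions, so the path to a given file can be built from the format and the portion. The sketch below assumes the archive has been extracted so that the format directories sit under `data`, as the listing suggests. Reading `dtest` as the development portion and `etest` as the testing (evaluation) portion is an assumption based on the file names, and `corpus_file` is a hypothetical helper, not a tool shipped with the corpus.

```python
from pathlib import Path

# Directory name -> file extension, as shown in the archive listing.
FORMATS = {"xml": "xml", "html": "html", "plain": "txt", "treex": "treex"}

# File-name suffix per portion. Mapping "dev" to "_dtest" and "test" to
# "_etest" is an assumption inferred from the file names in the listing.
PORTIONS = {"train": "_train", "dev": "_dtest", "test": "_etest", "all": ""}


def corpus_file(root, fmt, portion):
    """Build the path to one corpus file under the extracted archive root."""
    ext = FORMATS[fmt]        # raises KeyError for an unknown format
    suffix = PORTIONS[portion]  # raises KeyError for an unknown portion
    return Path(root) / "data" / fmt / f"named_ent{suffix}.{ext}"
```

For example, `corpus_file("cnec1.1", "plain", "dev")` points at `named_ent_dtest.txt` in the `plain` directory, and `corpus_file("cnec1.1", "xml", "all")` at the unsplit `named_ent.xml`.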

