dc.contributor.author | Šebesta, Karel |
dc.contributor.author | Bedřichová, Zuzanna |
dc.contributor.author | Šormová, Kateřina |
dc.contributor.author | Štindlová, Barbora |
dc.contributor.author | Hrdlička, Milan |
dc.contributor.author | Hrdličková, Tereza |
dc.contributor.author | Hana, Jiří |
dc.contributor.author | Petkevič, Vladimír |
dc.contributor.author | Jelínek, Tomáš |
dc.contributor.author | Škodová, Svatava |
dc.contributor.author | Janeš, Petr |
dc.contributor.author | Lundáková, Kateřina |
dc.contributor.author | Skoumalová, Hana |
dc.contributor.author | Sládek, Šimon |
dc.contributor.author | Pierscieniak, Piotr |
dc.contributor.author | Toufarová, Dagmar |
dc.contributor.author | Straka, Milan |
dc.contributor.author | Rosen, Alexandr |
dc.contributor.author | Náplava, Jakub |
dc.contributor.author | Poláčková, Marie |
dc.date.accessioned | 2017-05-03T08:08:33Z |
dc.date.available | 2017-05-03T08:08:33Z |
dc.date.issued | 2017-04-30 |
dc.identifier.uri | http://hdl.handle.net/11234/1-2143 |
dc.description | CzeSL-GEC is a corpus of sentence pairs, each consisting of the original and the corrected version of a Czech sentence, collected from essays written by non-native learners of Czech and by Czech pupils with a Romani background. The corpus was built from the unreleased CzeSL-man corpus (http://utkl.ff.cuni.cz/learncorp/). All sentences in the corpus are word-tokenized. |
dc.language.iso | ces |
dc.publisher | Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) |
dc.relation.isreplacedby | http://hdl.handle.net/11234/1-3057 |
dc.rights | Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0) |
dc.rights.uri | http://creativecommons.org/licenses/by-sa/3.0/ |
dc.subject | natural language correction |
dc.subject | grammatical error correction |
dc.title | CzeSL Grammatical Error Correction Dataset (CzeSL-GEC) |
dc.type | corpus |
metashare.ResourceInfo#ContentInfo.mediaType | text |
dc.rights.label | PUB |
has.files | yes |
branding | LINDAT / CLARIAH-CZ |
contact.person | Milan Straka straka@ufal.mff.cuni.cz Charles University, UFAL |
contact.person | Jakub Náplava naplava@ufal.mff.cuni.cz Charles University, UFAL |
sponsor | Ministry of Education, Youth and Sports of the Czech Republic LM2015071 LINDAT/CLARIN: Institute for Analysis, Processing and Distribution of Linguistic Data nationalFunds |
sponsor | Czech Science Foundation (GAČR) 16-10185S Non-native Czech from the Theoretical and Computational Perspective nationalFunds |
size.info | 108067 sentences |
size.info | 48 files |
files.size | 5326473 |
files.count | 1 |
Files in this item
This item is Publicly Available and licensed under: Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)
- Name: 2017-czesl-gec.zip
- Size: 5.08 MB
- Format: application/zip
- Description: corpus data and metadata, zipped
- MD5: 49dba121e7bf8deb180e673693410cc9
- word2simword
  - a1_targets_train.txt (524 kB)
  - a1_targets_test.txt (35 kB)
  - a2_targets_train.txt (326 kB)
  - a1_targets_dev.txt (33 kB)
  - a2_inputs_train.txt (324 kB)
  - a1_inputs_test.txt (34 kB)
  - a2_targets_dev.txt (33 kB)
  - a1_inputs_dev.txt (32 kB)
  - a2_inputs_test.txt (34 kB)
  - a2_inputs_dev.txt (32 kB)
  - a1_inputs_train.txt (521 kB)
  - a2_targets_test.txt (35 kB)
- word2words
  - a1_targets_train.txt (1 MB)
  - a1_targets_test.txt (73 kB)
  - a2_targets_train.txt (667 kB)
  - a1_targets_dev.txt (70 kB)
  - a2_inputs_train.txt (647 kB)
  - a1_inputs_test.txt (71 kB)
  - a2_targets_dev.txt (70 kB)
  - a1_inputs_dev.txt (69 kB)
  - a2_inputs_test.txt (71 kB)
  - a2_inputs_dev.txt (69 kB)
  - a1_inputs_train.txt (1 MB)
  - a2_targets_test.txt (73 kB)
- word2word
  - a1_targets_train.txt (598 kB)
  - a1_targets_test.txt (37 kB)
  - a2_targets_train.txt (368 kB)
  - a1_targets_dev.txt (38 kB)
  - a2_inputs_train.txt (366 kB)
  - a1_inputs_test.txt (37 kB)
  - a2_targets_dev.txt (38 kB)
  - a1_inputs_dev.txt (38 kB)
  - a2_inputs_test.txt (37 kB)
  - a2_inputs_dev.txt (38 kB)
  - a1_inputs_train.txt (593 kB)
  - a2_targets_test.txt (37 kB)
- sent2sent
  - a1_targets_train.txt (1 MB)
  - a1_targets_test.txt (79 kB)
  - a2_targets_train.txt (653 kB)
  - a1_targets_dev.txt (71 kB)
  - a2_inputs_train.txt (638 kB)
  - a1_inputs_test.txt (78 kB)
  - a2_targets_dev.txt (71 kB)
  - a1_inputs_dev.txt (70 kB)
  - a2_inputs_test.txt (78 kB)
  - a2_inputs_dev.txt (70 kB)
  - a1_inputs_train.txt (1 MB)
  - a2_targets_test.txt (79 kB)
- README.md (2 kB)
- LICENSE.txt (21 kB)
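
The record describes the corpus as pairs of original and corrected sentences, and the archive lists matching `*_inputs_*` and `*_targets_*` files. A minimal Python sketch for reading one such pair of files is given below. It rests on assumptions this record does not confirm: that line i of an inputs file corresponds to line i of the matching targets file, that the files are UTF-8 encoded, and that the paths are relative to the extracted archive. The function name `read_pairs` is hypothetical; the README.md shipped in the archive should be checked for the actual file format.

```python
from itertools import zip_longest


def read_pairs(inputs_path, targets_path):
    """Yield (original, corrected) sentence pairs from two text files.

    Assumption (not confirmed by the record): both files are UTF-8 and
    line-aligned, i.e. line i of one file pairs with line i of the other.
    """
    with open(inputs_path, encoding="utf-8") as src, \
            open(targets_path, encoding="utf-8") as tgt:
        # zip_longest pads the shorter file with None, which lets us
        # detect a length mismatch instead of silently truncating.
        for line_no, (original, corrected) in enumerate(
                zip_longest(src, tgt), start=1):
            if original is None or corrected is None:
                raise ValueError(
                    f"files differ in length at line {line_no}")
            yield original.rstrip("\n"), corrected.rstrip("\n")
```

Under the same assumptions, it could be called as `read_pairs("sent2sent/a1_inputs_train.txt", "sent2sent/a1_targets_train.txt")`, for example to keep only the pairs where the two sentences differ. Raising an error on a length mismatch is a deliberate choice here, since a silent truncation would hide a misaligned pair of files.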