dc.contributor.author |
Šebesta, Karel |
dc.contributor.author |
Bedřichová, Zuzanna |
dc.contributor.author |
Šormová, Kateřina |
dc.contributor.author |
Štindlová, Barbora |
dc.contributor.author |
Hrdlička, Milan |
dc.contributor.author |
Hrdličková, Tereza |
dc.contributor.author |
Hana, Jiří |
dc.contributor.author |
Petkevič, Vladimír |
dc.contributor.author |
Jelínek, Tomáš |
dc.contributor.author |
Škodová, Svatava |
dc.contributor.author |
Janeš, Petr |
dc.contributor.author |
Lundáková, Kateřina |
dc.contributor.author |
Skoumalová, Hana |
dc.contributor.author |
Sládek, Šimon |
dc.contributor.author |
Pierscieniak, Piotr |
dc.contributor.author |
Toufarová, Dagmar |
dc.contributor.author |
Straka, Milan |
dc.contributor.author |
Rosen, Alexandr |
dc.contributor.author |
Náplava, Jakub |
dc.contributor.author |
Poláčková, Marie |
dc.date.accessioned |
2017-05-03T08:08:33Z |
dc.date.available |
2017-05-03T08:08:33Z |
dc.date.issued |
2017-04-30 |
dc.identifier.uri |
http://hdl.handle.net/11234/1-2143 |
dc.description |
CzeSL-GEC is a corpus of sentence pairs, each consisting of an original Czech sentence and its corrected version. The sentences were collected from essays written by non-native learners of Czech and by Czech pupils with a Romani background. The corpus was built from the unreleased CzeSL-man corpus (http://utkl.ff.cuni.cz/learncorp/). All sentences in the corpus are word-tokenized. |
dc.language.iso |
ces |
dc.publisher |
Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) |
dc.relation.isreplacedby |
http://hdl.handle.net/11234/1-3057 |
dc.rights |
Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0) |
dc.rights.uri |
http://creativecommons.org/licenses/by-sa/3.0/ |
dc.subject |
natural language correction |
dc.subject |
grammatical error correction |
dc.title |
CzeSL Grammatical Error Correction Dataset (CzeSL-GEC) |
dc.type |
corpus |
metashare.ResourceInfo#ContentInfo.mediaType |
text |
dc.rights.label |
PUB |
has.files |
yes |
branding |
LINDAT / CLARIAH-CZ |
contact.person |
Milan Straka straka@ufal.mff.cuni.cz Charles University, UFAL |
contact.person |
Jakub Náplava naplava@ufal.mff.cuni.cz Charles University, UFAL |
sponsor |
Ministerstvo školství, mládeže a tělovýchovy České republiky LM2015071 LINDAT/CLARIN: Institut pro analýzu, zpracování a distribuci lingvistických dat nationalFunds |
sponsor |
Grantová agentura České republiky GAČR 16-10185S Čeština nerodilých mluvčích z pohledu teoretického a komputačního / Non-native Czech from the Theoretical and Computational Perspective nationalFunds |
size.info |
108067 sentences |
size.info |
48 files |
files.size |
5326473 |
files.count |
1 |
Files in this item
- Name
- 2017-czesl-gec.zip
- Size
- 5.08 MB
- Format
- application/zip
- Description
- corpus data and metadata, zipped
- MD5
- 49dba121e7bf8deb180e673693410cc9
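The MD5 value above can be used to check a downloaded copy of the archive. The following is a minimal sketch using only Python's standard `hashlib` module. The helper names `md5_of_file` and `archive_is_intact` are hypothetical, and the local file path is an assumption about where the archive was saved.

```python
import hashlib

# Checksum copied from the MD5 field of this record.
EXPECTED_MD5 = "49dba121e7bf8deb180e673693410cc9"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path, expected=EXPECTED_MD5):
    """Compare the file's digest with the expected checksum (case-insensitive)."""
    return md5_of_file(path) == expected.lower()
```

A call such as `archive_is_intact("2017-czesl-gec.zip")` would return `True` only if the local file matches the published checksum. MD5 is suitable here for detecting a corrupted download, not as a security guarantee.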
Preview
- word2simword
  - a1_targets_train.txt (524 kB)
  - a1_targets_test.txt (35 kB)
  - a2_targets_train.txt (326 kB)
  - a1_targets_dev.txt (33 kB)
  - a2_inputs_train.txt (324 kB)
  - a1_inputs_test.txt (34 kB)
  - a2_targets_dev.txt (33 kB)
  - a1_inputs_dev.txt (32 kB)
  - a2_inputs_test.txt (34 kB)
  - a2_inputs_dev.txt (32 kB)
  - a1_inputs_train.txt (521 kB)
  - a2_targets_test.txt (35 kB)
- word2words
  - a1_targets_train.txt (1 MB)
  - a1_targets_test.txt (73 kB)
  - a2_targets_train.txt (667 kB)
  - a1_targets_dev.txt (70 kB)
  - a2_inputs_train.txt (647 kB)
  - a1_inputs_test.txt (71 kB)
  - a2_targets_dev.txt (70 kB)
  - a1_inputs_dev.txt (69 kB)
  - a2_inputs_test.txt (71 kB)
  - a2_inputs_dev.txt (69 kB)
  - a1_inputs_train.txt (1 MB)
  - a2_targets_test.txt (73 kB)
- word2word
  - a1_targets_train.txt (598 kB)
  - a1_targets_test.txt (37 kB)
  - a2_targets_train.txt (368 kB)
  - a1_targets_dev.txt (38 kB)
  - a2_inputs_train.txt (366 kB)
  - a1_inputs_test.txt (37 kB)
  - a2_targets_dev.txt (38 kB)
  - a1_inputs_dev.txt (38 kB)
  - a2_inputs_test.txt (37 kB)
  - a2_inputs_dev.txt (38 kB)
  - a1_inputs_train.txt (593 kB)
  - a2_targets_test.txt (37 kB)
- sent2sent
  - a1_targets_train.txt (1 MB)
  - a1_targets_test.txt (79 kB)
  - a2_targets_train.txt (653 kB)
  - a1_targets_dev.txt (71 kB)
  - a2_inputs_train.txt (638 kB)
  - a1_inputs_test.txt (78 kB)
  - a2_targets_dev.txt (71 kB)
  - a1_inputs_dev.txt (70 kB)
  - a2_inputs_test.txt (78 kB)
  - a2_inputs_dev.txt (70 kB)
  - a1_inputs_train.txt (1 MB)
  - a2_targets_test.txt (79 kB)
- README.md (2 kB)
- LICENSE.txt (21 kB)
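
The file names in the listing follow the pattern `<a1|a2>_<inputs|targets>_<train|dev|test>.txt` inside each of the four directories. Below is a minimal sketch of how an inputs file could be paired with its targets file. It rests on assumptions this record does not confirm: that the files are UTF-8 text, that each line holds one item, and that line N of an inputs file corresponds to line N of the matching targets file. The function name `read_pairs` is hypothetical, and the README.md in the archive remains the authoritative description of the format.

```python
from pathlib import Path


def _read_lines(path):
    """Read a text file as a list of lines without trailing newlines."""
    # Assumption: the corpus files are UTF-8 encoded.
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle]


def read_pairs(directory, prefix="a1", split="train"):
    """Return (original, corrected) pairs from one inputs/targets file pair.

    `directory` is one of the extracted folders, e.g. "sent2sent".
    Assumption: the two files are aligned line by line.
    """
    base = Path(directory)
    inputs = _read_lines(base / f"{prefix}_inputs_{split}.txt")
    targets = _read_lines(base / f"{prefix}_targets_{split}.txt")
    if len(inputs) != len(targets):
        # A length mismatch means the line-alignment assumption does not hold.
        raise ValueError(
            f"{len(inputs)} input lines but {len(targets)} target lines"
        )
    return list(zip(inputs, targets))
```

For example, `read_pairs("sent2sent", prefix="a2", split="dev")` would load `a2_inputs_dev.txt` and `a2_targets_dev.txt` from an extracted `sent2sent` directory. Because the sentences are already word-tokenized, splitting each string on whitespace is likely enough to obtain tokens.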