
AKCES 5 (CzeSL-SGT) Release 2

Please use the following text to cite this item:
Šebesta, Karel; et al., 2014, AKCES 5 (CzeSL-SGT) Release 2, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), http://hdl.handle.net/11234/1-162.
Date issued
2014-07-27
Size
958000 words
Language(s)
Description
Essays written by non-native learners of Czech, a part of AKCES/CLAC – Czech Language Acquisition Corpora. CzeSL-SGT stands for Czech as a Second Language with Spelling, Grammar and Tags. The corpus extends the “foreign” (ciz) part of AKCES 3 (CzeSL-plain) with texts collected in 2013. Original forms and automatic corrections are tagged, lemmatized and assigned error labels. Most texts have metadata attributes (30 items) about the author and the text. In addition to fixing a few minor bugs, Release 2 fixes a critical issue in Release 1: native speakers of Ukrainian (s_L1="uk") were wrongly labelled as speakers of "other European languages" (s_L1_group="IE") instead of speakers of a Slavic language (s_L1_group="S"). The file is now a regular XML document, with all annotation represented as XML attributes.
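Since the annotation is stored as XML attributes, the Release 1 labelling issue described above can be checked for mechanically. The following is a minimal sketch, not part of the corpus distribution: it assumes only that the attributes s_L1 and s_L1_group appear together on the same element, and it does not assume any particular element name. The function name find_mislabelled and the element names in the sample data (corpus, doc) are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def find_mislabelled(source, l1="uk", expected_group="S"):
    """Return (tag, group) pairs for elements whose s_L1 equals `l1`
    but whose s_L1_group differs from `expected_group`.

    `source` is a file name or a binary file object. The document is
    streamed so that a large corpus file need not fit in memory.
    """
    mislabelled = []
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            # Attributes are already available when the start tag is seen.
            if elem.get("s_L1") == l1 and elem.get("s_L1_group") != expected_group:
                mislabelled.append((elem.tag, elem.get("s_L1_group")))
        else:
            # Free the element once it has been fully read.
            elem.clear()
    return mislabelled


# Hypothetical sample data; the real element names may differ.
SAMPLE = (
    b'<corpus>'
    b'<doc s_L1="uk" s_L1_group="IE"/>'
    b'<doc s_L1="uk" s_L1_group="S"/>'
    b'<doc s_L1="ru" s_L1_group="S"/>'
    b'</corpus>'
)

if __name__ == "__main__":
    print(find_mislabelled(io.BytesIO(SAMPLE)))
```

On the Release 2 data the function would be expected to return an empty list for s_L1="uk", provided the attribute layout matches the assumption above.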
This item is Publicly Available
and licensed under:
 Files in this item
Name
2014-czesl-sgt-en-all-v2.zip
Size
14.94 MB
Format
application/zip
Description
corpus data and metadata, zipped
MD5
e8a81fa41fb911af47ec5e29640546ab
Preview
    • 2014-czesl-sgt-en-all-v2 (136 MB)
Name
2014-czesl-sgt-en-v2.pdf
Size
104.48 KB
Format
application/pdf
Description
description of the corpus
MD5
2ec4d8f27b22b73f324814577fe097aa
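The MD5 values listed for the two files can be used to confirm that a download is intact. A minimal sketch, assuming both files were saved under their listed names in the current directory; the helper names md5_of and verify are illustrative only:

```python
import hashlib
from pathlib import Path

# File names and MD5 values as listed in this record.
EXPECTED_MD5 = {
    "2014-czesl-sgt-en-all-v2.zip": "e8a81fa41fb911af47ec5e29640546ab",
    "2014-czesl-sgt-en-v2.pdf": "2ec4d8f27b22b73f324814577fe097aa",
}


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory="."):
    """Map each listed file name to True (match), False (mismatch)
    or None (file not present in `directory`)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = (md5_of(path) == expected) if path.is_file() else None
    return results


if __name__ == "__main__":
    labels = {True: "OK", False: "MISMATCH", None: "not found"}
    for name, ok in verify().items():
        print(f"{name}: {labels[ok]}")
```

MD5 is adequate here for detecting a corrupted or truncated download; it is not a guarantee against deliberate tampering.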