Synthetic part of CzEng 2.0
Please use the following text to cite this item:
Popel, Martin, 2020,
Synthetic part of CzEng 2.0, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11234/1-4774.
Date issued
2020-07-06
Size
131537252 sentences
Description
CzEng is a sentence-parallel Czech-English corpus compiled at the Institute of Formal and Applied Linguistics (ÚFAL). The full CzEng 2.0 is freely available for non-commercial research purposes from the project website (https://ufal.mff.cuni.cz/czeng). This release contains only the originally monolingual news text (csmono with 53M sentences and enmono with 79M sentences), together with automatic (synthetic) translations produced by CUBBITT.
See the attached README for additional details such as the file format.
Acknowledgement
Ministry of Education, Youth and Sports of the Czech Republic
Project code: LM2018101
Project name: LINDAT/CLARIAH-CZ: Digital Research Infrastructure for Language Technologies, Arts and Humanities
Ministry of Education, Youth and Sports of the Czech Republic
Project code: LM2015071
Project name: LINDAT/CLARIN: Institute for Analysis, Processing and Distribution of Linguistic Data
Czech Science Foundation (Grant Agency of the Czech Republic)
Project code: GX20-16819X
Project name: LUSyD – Language Understanding: from Syntax to Discourse
This item is Publicly Available
and licensed under:
Files in this item
- Name: README
- Size: 2.99 KB
- Format: application/octet-stream
- Description: Unknown
- MD5: ab2d71950b2e51acdc461bec8674b164

- Name: czeng20-csmono.gz
- Size: 4.31 GB
- Format: application/x-gzip
- Description: gzip Archive
- MD5: b80333bef7cc9db8610daaae0e2186ea

- Name: czeng20-enmono.gz
- Size: 7.61 GB
- Format: application/x-gzip
- Description: gzip Archive
- MD5: bf5941d6de35af9cbd7f0f0efd190e1f
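
Because the archives are several gigabytes each, it may be worth checking a download against the MD5 values listed above before using it. The following is a minimal Python sketch, not part of the original release: the function names `md5_of_file` and `verify_downloads` are hypothetical, and it assumes the files were saved under their listed names in a single directory.

```python
import hashlib
from pathlib import Path

# MD5 checksums copied from the file listing of this item.
EXPECTED_MD5 = {
    "README": "ab2d71950b2e51acdc461bec8674b164",
    "czeng20-csmono.gz": "b80333bef7cc9db8610daaae0e2186ea",
    "czeng20-enmono.gz": "bf5941d6de35af9cbd7f0f0efd190e1f",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each expected file name to True if it exists and its MD5 matches.

    Assumption: files are stored in `directory` under the names listed above.
    """
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = path.is_file() and md5_of_file(path) == expected
    return results


if __name__ == "__main__":
    for name, ok in verify_downloads().items():
        print(f"{name}: {'OK' if ok else 'missing or checksum mismatch'}")
```

Reading in fixed-size chunks keeps memory use constant regardless of archive size. A file reported as missing or mismatched most likely indicates an incomplete download and should be fetched again.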


