FAUST cs-en 0.5
Please use the following text to cite this item:
Hajič, Jan; et al., 2021,
FAUST cs-en 0.5, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11234/1-3775.
Authors
Hajič, Jan ; et al.
Item identifier
http://hdl.handle.net/11234/1-3775
Date issued
2021-09-20
Size
2223 sentences
Description
This machine translation test set contains 2223 Czech sentences collected within the FAUST project (https://ufal.mff.cuni.cz/grants/faust, http://hdl.handle.net/11234/1-3308).
Each original (noisy) sentence was normalized (clean1 and clean2) and translated to English independently by two translators.
Acknowledgement
Ministerstvo školství, mládeže a tělovýchovy České republiky (Ministry of Education, Youth and Sports of the Czech Republic)
Project code: LM2018101
Project name: LINDAT/CLARIAH-CZ: Digital Research Infrastructure for Language Technologies, Arts and Humanities
Grantová agentura České republiky (Czech Science Foundation)
Project code: GX20-16819X
Project name: LUSyD – Language Understanding: from Syntax to Discourse
This item is Publicly Available
and licensed under:
Files in this item
- Name: faust-csen.zip
- Size: 895.51 KB
- Format: application/zip
- Description: Zip
- MD5: ddb9093027913f1883d25dfafc1ecb1a
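The MD5 value listed above can be used to check that a downloaded copy of the archive is intact. The following is a minimal sketch, assuming the archive was saved as `faust-csen.zip` in the current directory; the local path and the helper name `md5_of_file` are illustrative and not part of the item.

```python
import hashlib
from pathlib import Path

# MD5 listed on this page for faust-csen.zip
EXPECTED_MD5 = "ddb9093027913f1883d25dfafc1ecb1a"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    archive = Path("faust-csen.zip")  # assumed local download location
    if archive.exists():
        actual = md5_of_file(archive)
        print("OK" if actual == EXPECTED_MD5 else f"MISMATCH: {actual}")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch usually means the download was truncated or altered, in which case fetching the file again is the simplest remedy.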

- scripts
  - faust-extract-tmx.pl (1 kB)
  - faust-merge-tsv.pl (1 kB)
- original-tmx
  - faust-csen-rs.tmx (1 MB)
  - faust-csen-mu.tmx (1 MB)
- faust-csen-noisy-cs.txt (160 kB)
- README.txt (979 B)
- faust-csen-noisy-en.txt (338 kB)
- faust-csen-clean2-cs.txt (159 kB)
- faust-csen-clean1-cs.txt (159 kB)
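The plain-text files in the listing can be paired up for evaluation roughly as follows. This is only a sketch under several assumptions that this page does not confirm: that the `.txt` files are UTF-8 encoded, contain one sentence per line, and are aligned line by line with each other. README.txt in the archive is the authoritative description of the format. The helper names `read_lines` and `pair_sentences` are hypothetical.

```python
from pathlib import Path


def read_lines(path):
    """Read a text file as a list of lines without trailing newlines (UTF-8 assumed)."""
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\r\n") for line in handle]


def pair_sentences(source_path, target_path, expected_count=None):
    """Pair line i of the source file with line i of the target file.

    Assumes both files are line-aligned; raises ValueError if the
    line counts differ or do not match expected_count.
    """
    source = read_lines(source_path)
    target = read_lines(target_path)
    if len(source) != len(target):
        raise ValueError(f"line counts differ: {len(source)} vs {len(target)}")
    if expected_count is not None and len(source) != expected_count:
        raise ValueError(f"expected {expected_count} lines, found {len(source)}")
    return list(zip(source, target))


if __name__ == "__main__":
    cs_file = Path("faust-csen-noisy-cs.txt")  # file names taken from the listing
    en_file = Path("faust-csen-noisy-en.txt")
    if cs_file.exists() and en_file.exists():
        # No expected_count here: the number of lines per file is not stated on this page.
        pairs = pair_sentences(cs_file, en_file)
        print(f"{len(pairs)} aligned lines")
    else:
        print("extract faust-csen.zip into the current directory first")
```

If the line counts differ, the alignment assumption does not hold for that pair of files, and the TMX files under original-tmx may be the better starting point.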

