
WMT16 APE Shared Task Data

Please use the following text to cite this item:
Turchi, Marco; Chatterjee, Rajen and Negri, Matteo, 2016, WMT16 APE Shared Task Data, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), http://hdl.handle.net/11372/LRT-1632.
Date issued
2016-02-21
Size
1294 kb
Language(s)
Description
Training, development and test data (the same used for the Sentence-level Quality Estimation task) consist of English-German triplets (source, target and post-edit) belonging to the IT domain and already tokenized. The training and development sets contain 12,000 and 1,000 triplets respectively, while the test set contains 2,000 instances. All data is provided by the EU project QT21 (http://www.qt21.eu/).
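As a rough illustration of the triplet structure described above, the sketch below reads three parallel files into (source, target, post-edit) records. The file layout is an assumption, not something stated in this record: it supposes one segment per line, UTF-8 encoding, whitespace-separated tokens, and three line-aligned files whose paths are passed in by the caller. The names `Triplet` and `read_triplets` are hypothetical.

```python
from itertools import zip_longest
from typing import Iterator, List, NamedTuple


class Triplet(NamedTuple):
    """One APE instance: source, machine-translated target, human post-edit."""
    source: List[str]
    target: List[str]
    post_edit: List[str]


def read_triplets(src_path: str, mt_path: str, pe_path: str) -> Iterator[Triplet]:
    """Yield triplets from three line-aligned files.

    Assumption: one segment per line in each file, already tokenized,
    with tokens separated by whitespace.
    """
    with open(src_path, encoding="utf-8") as src, \
         open(mt_path, encoding="utf-8") as mt, \
         open(pe_path, encoding="utf-8") as pe:
        for number, lines in enumerate(zip_longest(src, mt, pe), start=1):
            if None in lines:
                # One file ended before the others: the files are not aligned.
                raise ValueError(f"files differ in length at line {number}")
            s, t, p = (line.split() for line in lines)
            yield Triplet(source=s, target=t, post_edit=p)
```

Under these assumptions, counting the yielded triplets for each split gives a quick check against the sizes stated in the description (12,000 training, 1,000 development, 2,000 test).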
Acknowledgement
This item is Publicly Available
and licensed under:
Files in this item
Name
test_pe.zip
Size
65.46 KB
Format
application/zip
Description
Gold standard sentences
MD5
574b990a048e714014254c8a7d8c960d
Name
TrainDev.zip
Size
1.26 MB
Format
application/zip
Description
Training and development data, WMT16 APE Task
MD5
720a9ccb8b790aa87fafbb7a656ecd4f
Name
Test.zip
Size
117.19 KB
Format
application/zip
Description
Test data, WMT16 APE Task
MD5
5e6ee45a3b3418b75ff6fa90fb780016
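The MD5 values listed for the three archives can be used to check that a download is intact. The following is a minimal sketch using only the Python standard library; it assumes the archives were saved under the names shown in the listing, and the helper names `md5_of` and `verify_downloads` are hypothetical.

```python
import hashlib
from pathlib import Path

# MD5 checksums copied from the file listing in this record.
EXPECTED_MD5 = {
    "test_pe.zip": "574b990a048e714014254c8a7d8c960d",
    "TrainDev.zip": "720a9ccb8b790aa87fafbb7a656ecd4f",
    "Test.zip": "5e6ee45a3b3418b75ff6fa90fb780016",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each archive name to True (match), False (mismatch) or None (missing)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = (md5_of(path) == expected) if path.is_file() else None
    return results


if __name__ == "__main__":
    labels = {True: "OK", False: "MISMATCH", None: "not found"}
    for name, status in verify_downloads().items():
        print(f"{name}: {labels[status]}")
```

MD5 is suitable here only as an integrity check against accidental corruption, which is the purpose the listing appears to serve.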