MCSQ Translation Models (en-de) (v1.0)
Please use the following text to cite this item:
Variš, Dušan, 2022,
MCSQ Translation Models (en-de) (v1.0), LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11234/1-4680.
Authors
Variš, Dušan
Item identifier
http://hdl.handle.net/11234/1-4680
Date issued
2022-03-15
Type
Description
English-German (en-de) translation models exported for TensorFlow Serving and available in the Lindat translation service (https://lindat.mff.cuni.cz/services/translation/).
The models were trained on the MCSQ social surveys dataset (available at https://repo.clarino.uib.no/xmlui/bitstream/handle/11509/142/mcsq_v3.zip).
They are intended mainly for in-domain translation of social surveys.
The models are compatible with Tensor2Tensor version 1.6.6.
For details about the model training (data, model hyper-parameters), please contact the archive maintainer.
Evaluation on the MCSQ test set (BLEU):
en->de: 67.5 (trained on genuine in-domain MCSQ data only)
de->en: 75.0 (trained with additional in-domain back-translated MCSQ data)
(Evaluated using MultEval: https://github.com/jhclark/multeval)
Acknowledgement
European Union
Project code: EC/H2020/823782
Project name: SSHOC - Social Sciences & Humanities Open Cloud
Collections
This item is Publicly Available
and licensed under:
Files in this item
- Name: mcsq.de-en.zip
- Size: 663.55 MB
- Format: application/zip
- Description: Zip
- MD5: b11a7d80a17127637e6862bc80e0f748

- mcsq.de-en
  - vocab.ende.32768 (127 kB)
  - export
    - Servo
      - 1647265195
        - saved_model.pbtxt (7 MB)
        - variables
          - variables.data-00000-of-00001 (715 MB)
          - variables.index (10 kB)
- Name: mcsq.en-de.zip
- Size: 671.32 MB
- Format: application/zip
- Description: Zip
- MD5: f38e69751aadf0ea687b8d574b59223e

- mcsq.en-de
  - vocab.ende.32768 (152 kB)
  - export
    - Servo
      - 1647265087
        - saved_model.pbtxt (7 MB)
        - variables
          - variables.data-00000-of-00001 (723 MB)
          - variables.index (10 kB)
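
The MD5 checksums listed for the two archives can be used to check that a download completed without corruption. The following is a minimal sketch using only the Python standard library. The file names and checksums are taken from the listing in this item; the helper names (`md5_of_file`, `verify_archive`) and the assumption that the archives sit in the current working directory are illustrative only.

```python
import hashlib
from pathlib import Path

# Checksums as listed for the two archives in this item.
EXPECTED_MD5 = {
    "mcsq.de-en.zip": "b11a7d80a17127637e6862bc80e0f748",
    "mcsq.en-de.zip": "f38e69751aadf0ea687b8d574b59223e",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat for archives of several hundred MB.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 matches the checksum listed for its name."""
    path = Path(path)
    if path.name not in expected:
        raise KeyError(f"no checksum listed for {path.name}")
    return md5_of_file(path) == expected[path.name].lower()


if __name__ == "__main__":
    # Assumption: the archives were downloaded into the current directory.
    for name in EXPECTED_MD5:
        if Path(name).exists():
            status = "OK" if verify_archive(name) else "MISMATCH"
        else:
            status = "not found"
        print(f"{name}: {status}")
```

A mismatch usually indicates an interrupted or corrupted download, in which case fetching the archive again is the simplest remedy.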

