TMODS:ENG-CZE -- query translation
Please use the following text to cite this item:
Tamchyna, Aleš and Bojar, Ondřej, 2015,
TMODS:ENG-CZE -- query translation, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11234/1-1581.
Date issued
2015
Description
AMALACH project component TMODS:ENG-CZE: machine translation of queries from Czech to English. This archive contains models for the Moses decoder (binarized and pruned to allow real-time translation) and configuration files for the MTMonkey toolkit. The aim of this package is to provide a complete Czech->English translation service that can easily be used as a component in a larger software solution. (The required tools are freely available, and an installation guide is included in the package.)
The translation models were trained on the CzEng 1.0 corpus and Europarl. The monolingual data used for language model (LM) estimation additionally contains the WMT news crawls up to 2013.
Acknowledgement
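Ministry of Culture of the Czech Republic (Ministerstvo kultury České republiky)
Project code: DF12P01OVV022
Project name: Zpřístupnění rozsáhlého video archivu kulturního dědictví pomocí metod automatického rozpoznávání mluvené řeči a strojového překladu (AMALACH); in English: Making a large video archive of cultural heritage accessible using automatic speech recognition and machine translation methods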
This item is Publicly Available
and licensed under:
Files in this item
- Name
- TMODS-ENG-CZE-query.tar.gz
- Size
- 2.74 GB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 41246e0b67c36c6b872c8d4bc9281e36

- Name
- README.txt
- Size
- 1.28 KB
- Format
- text/plain
- Description
- Text
- MD5
- 92d655ca32f3f8ec76d77d3c3d3f27b0
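
The MD5 digests listed above can be used to check that a download is intact before unpacking it. The following is a minimal sketch, assuming a bash shell and the GNU coreutils `md5sum` tool (on other systems the tool may have a different name); the function name `verify_md5` is hypothetical and not part of the package.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the MD5 digest of a file
# matches the expected value. Assumes GNU coreutils md5sum.
verify_md5() {
    local file="$1" expected="$2" actual
    # md5sum prints "<digest>  <name>"; keep only the digest.
    actual=$(md5sum < "$file" 2>/dev/null | cut -d' ' -f1)
    [ "$actual" = "$expected" ]
}

# Usage, run in the directory that holds the downloaded files:
#   verify_md5 TMODS-ENG-CZE-query.tar.gz 41246e0b67c36c6b872c8d4bc9281e36 && echo "archive OK"
#   verify_md5 README.txt 92d655ca32f3f8ec76d77d3c3d3f27b0 && echo "README OK"
```

A mismatch (non-zero exit status) usually means an incomplete or corrupted download, in which case fetching the file again is the simplest remedy.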

TMODS:ENG-CZE -- query translation component
============================================

Installation:

1) Download and compile the Moses decoder:
   (Requires Boost, libXML and CMPH.)

   git clone https://github.com/moses-smt/mosesdecoder.git
   cd mosesdecoder
   ./bjam --with-xmlrpc=<path-to-libxml> --max-kenlm-order=12 --with-cmph=<path-to-cmph-library>
   cd ..

2) Download and configure MTMonkey:
   (See MTMonkey documentation for installation requirements.)

   git clone https://github.com/ufal/mtmonkey
   cd mtmonkey
   git checkout chimera_preprocessing # the tested commit ID is d0e3175ee112a3fdd4790ccd1b0ff4e5d90c8d04
   cd ..

3) Install MorphoDita and its Python bindings. Follow the steps described here:
   http://ufal.mff.cuni.cz/morphodita/install#python_installation

4) Download Morphodita model for Czech:
   http://hdl.handle.net/11858/00-097C-0000-0023-68D8-1

   Copy the model in the MTMonkey directory:

   cp czech-morfflex-pdt-131112.tagger-fast mtmonkey

5) Start the Moses server:

   cd model
   nohup ../mosesdecoder/bin/server -f moses.ini &>mosesserver.log &

6) Start the MTMonkey application server and worker:

   nohup ../mtmonkey/appserver/src/appserver.py -c `pwd`/appserver.cfg &> appserver.log &
   nohup ../mtmonkey/worker/src/worker.py -c worker.cfg &> worker.log &
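
Once the servers from steps 5 and 6 are running, a Czech query can be sent to the MTMonkey application server as a JSON request over HTTP. The sketch below is illustrative only and rests on several assumptions: the endpoint `http://localhost:8888/translate` is hypothetical (the real host, port and path come from `appserver.cfg`), the field names `action`, `sourceLang`, `targetLang` and `text` and the language codes `cs` and `en` follow the request format as recalled from the MTMonkey documentation and should be checked against it, and the function names are made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical endpoint: take the real host, port and path from appserver.cfg.
MTMONKEY_URL="${MTMONKEY_URL:-http://localhost:8888/translate}"

# Escape backslashes and double quotes so the query fits in a JSON string.
# Control characters (newlines, tabs) are not handled in this sketch.
json_escape() {
    local s="$1" bs='\' dq='"'
    s=${s//"$bs"/"$bs$bs"}   # backslashes first, so added ones are not doubled
    s=${s//"$dq"/"$bs$dq"}
    printf '%s' "$s"
}

# Build the request body. Field names and language codes are assumptions
# based on the MTMonkey documentation, not taken from this package.
build_request() {
    printf '{"action":"translate","sourceLang":"cs","targetLang":"en","text":"%s"}' \
        "$(json_escape "$1")"
}

# Send one Czech query to the application server and print the raw JSON reply.
translate_query() {
    curl -s -H 'Content-Type: application/json' -X POST \
         -d "$(build_request "$1")" "$MTMONKEY_URL"
}

# Usage (requires the running servers):
#   translate_query 'židovská čtvrť v Praze'
```

Keeping the request construction in its own function makes it possible to inspect the payload without a running server, which helps when the worker or Moses server is still loading its models.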

