TMODS:ENG-CZE -- query translation

Please use the following text to cite this item:
Tamchyna, Aleš and Bojar, Ondřej, 2015, TMODS:ENG-CZE -- query translation, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), http://hdl.handle.net/11234/1-1581.
Date issued
2015
Language(s)
Description
AMALACH project component TMODS:ENG-CZE; machine translation of queries from Czech to English. This archive contains models for the Moses decoder (binarized and pruned to allow real-time translation) and configuration files for the MTMonkey toolkit. The aim of this package is to provide a complete Czech-to-English translation service that can easily be used as a component in a larger software solution. (The required tools are freely available, and an installation guide is included in the package.) The translation models were trained on the CzEng 1.0 corpus and Europarl. The monolingual data used for language model (LM) estimation additionally includes the WMT news crawls up to 2013.
Acknowledgement
Files in this item
Name
TMODS-ENG-CZE-query.tar.gz
Size
2.74 GB
Format
application/x-gzip
Description
gzip Archive
MD5
41246e0b67c36c6b872c8d4bc9281e36
Name
README.txt
Size
1.28 KB
Format
text/plain
Description
Text
MD5
92d655ca32f3f8ec76d77d3c3d3f27b0
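Since the archive is 2.74 GB, it may be worth checking the download against the MD5 value listed above before unpacking it. The following is a minimal sketch, assuming GNU `md5sum` is available; `verify_md5` is a hypothetical helper name and is not part of the package.

```shell
#!/usr/bin/env bash
# verify_md5 FILE EXPECTED_MD5
# Hypothetical helper: compares the MD5 of FILE with the expected value.
# Returns 0 on match, 1 on mismatch, 2 if the file does not exist.
verify_md5() {
  local file="$1" expected="$2" actual
  if [ ! -f "$file" ]; then
    echo "MISSING: $file" >&2
    return 2
  fi
  # md5sum prints "<hash>  <filename>"; keep only the hash.
  actual="$(md5sum "$file" | cut -d' ' -f1)"
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file (got $actual, expected $expected)" >&2
    return 1
  fi
}

# Check the archive only if it is present in the current directory.
archive="TMODS-ENG-CZE-query.tar.gz"
if [ -f "$archive" ]; then
  verify_md5 "$archive" 41246e0b67c36c6b872c8d4bc9281e36
fi
```

After a successful check, `tar -tzf TMODS-ENG-CZE-query.tar.gz` lists the archive contents without extracting them.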
Preview
  File Preview
    TMODS:ENG-CZE -- query translation component
    ============================================
    
    Installation:
    
    1) Download and compile the Moses decoder:
    
      (Requires Boost, xmlrpc-c and CMPH.)
    
      git clone https://github.com/moses-smt/mosesdecoder.git
      cd mosesdecoder
      ./bjam --with-xmlrpc-c=<path-to-xmlrpc-c> --max-kenlm-order=12 --with-cmph=<path-to-cmph-library>
      cd ..
    
    2) Download and configure MTMonkey:
      
      (See MTMonkey documentation for installation requirements.)
    
      git clone https://github.com/ufal/mtmonkey
      cd mtmonkey
      git checkout chimera_preprocessing
      # the tested commit ID is d0e3175ee112a3fdd4790ccd1b0ff4e5d90c8d04
      cd ..
    
    3) Install MorphoDiTa and its Python bindings.
    
      Follow the steps described here:
    
      http://ufal.mff.cuni.cz/morphodita/install#python_installation
    
    4) Download the MorphoDiTa model for Czech:
    
      http://hdl.handle.net/11858/00-097C-0000-0023-68D8-1
    
      Copy the model into the MTMonkey directory:
    
      cp czech-morfflex-pdt-131112.tagger-fast mtmonkey
    
    5) Start the Moses server:
    
      cd model
      nohup ../mosesdecoder/bin/server -f moses.ini &>mosesserver.log &
    
    6) Start the MTMonkey application server and worker:
    
      nohup ../mtmonkey/appserver/src/appserver.py -c `pwd`/appserver.cfg &> appserver.log &
      nohup ../mtmonkey/worker/src/worker.py -c worker.cfg &> worker.log &
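The commands in steps 4 to 6 use relative paths, so they appear to assume that `mosesdecoder`, `mtmonkey` and `model` sit side by side in one directory, with the configuration files inside `model`. Before starting the servers, that layout can be checked with a small script. This is only a sketch under that assumption: `check_layout` is a hypothetical helper, and the path list is inferred from the commands above, not from the archive itself.

```shell
#!/usr/bin/env bash
# check_layout [BASE_DIR]
# Hypothetical helper: reports every path referenced by steps 4-6 that is
# missing under BASE_DIR (default: current directory).
# Returns 0 if all paths exist, 1 otherwise.
check_layout() {
  local base="${1:-.}" missing=0 path
  for path in \
    mosesdecoder/bin/server \
    mtmonkey/appserver/src/appserver.py \
    mtmonkey/worker/src/worker.py \
    mtmonkey/czech-morfflex-pdt-131112.tagger-fast \
    model/moses.ini \
    model/appserver.cfg \
    model/worker.cfg
  do
    if [ ! -e "$base/$path" ]; then
      echo "missing: $path" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example (run from the directory that contains mosesdecoder, mtmonkey, model):
#   check_layout . && echo "layout looks complete"
```

If anything is reported missing, the corresponding installation step is the place to look. Once the servers are running, `mosesserver.log`, `appserver.log` and `worker.log` in the `model` directory should show whether each process started cleanly.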