dc.contributor.author | Tamchyna, Aleš |
dc.contributor.author | Bojar, Ondřej |
dc.date.accessioned | 2015-11-26T11:22:41Z |
dc.date.available | 2015-11-26T11:22:41Z |
dc.date.issued | 2015 |
dc.identifier.uri | http://hdl.handle.net/11234/1-1581 |
dc.description | AMALACH project component TMODS:ENG-CZE; machine translation of queries from Czech to English. This archive contains models for the Moses decoder (binarized and pruned to allow real-time translation) and configuration files for the MTMonkey toolkit. The aim of this package is to provide a full service for Czech->English translation that can easily be used as a component in a larger software solution. (The required tools are freely available and an installation guide is included in the package.) The translation models were trained on the CzEng 1.0 corpus and Europarl. The monolingual data for LM estimation additionally contain WMT news crawls up to 2013. |
dc.language.iso | ces |
dc.language.iso | eng |
dc.publisher | Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) |
dc.rights | Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/4.0/ |
dc.source.uri | http://ufal.mff.cuni.cz/grants/amalach |
dc.subject | machine translation |
dc.subject | query translation |
dc.title | TMODS:ENG-CZE -- query translation |
dc.type | toolService |
metashare.ResourceInfo#ResourceComponentType#ToolServiceInfo.languageDependent | true |
metashare.ResourceInfo#ContentInfo.detailedType | suiteOfTools |
dc.rights.label | PUB |
has.files | yes |
branding | LINDAT / CLARIAH-CZ |
contact.person | Aleš Tamchyna tamchyna@ufal.mff.cuni.cz Charles University in Prague, UFAL |
sponsor | Ministry of Culture of the Czech Republic; DF12P01OVV022; Making a large video archive of cultural heritage accessible using automatic speech recognition and machine translation methods (AMALACH); nationalFunds |
files.size | 2942964173 |
files.count | 2 |
Files in this item
This item is Publicly Available and licensed under: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
- Name
- TMODS-ENG-CZE-query.tar.gz
- Size
- 2.74 GB
- Format
- application/x-gzip
- Description
- Model files for TMODS:ENG-CZE query translation
- MD5
- 41246e0b67c36c6b872c8d4bc9281e36
- TMODS-ENG-CZE-query
- README.txt (1 kB)
- model
- biglm.trie (1 GB)
- worker.cfg (175 B)
- moses.ini (844 B)
- appserver.cfg (120 B)
- ttable.lemmas.minphr (933 MB)
- smalllm.trie (250 MB)
- ttable.minphr (975 MB)
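The MD5 digests listed in this record can be used to check that a download arrived intact. The following is a minimal sketch, not part of the package: `verify_md5` is a hypothetical helper name, and it assumes the GNU coreutils `md5sum` tool is available (other systems may ship a differently named tool).

```shell
#!/bin/sh
# Hypothetical helper (not part of the package): compare a file's MD5
# digest with an expected value. Assumes GNU coreutils `md5sum` is installed.
verify_md5() {
    file=$1
    expected=$2
    # md5sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(md5sum "$file" | cut -d ' ' -f 1)
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got $actual)" >&2
        return 1
    fi
}

# Usage with the digests listed in this record (run after downloading):
#   verify_md5 TMODS-ENG-CZE-query.tar.gz 41246e0b67c36c6b872c8d4bc9281e36
#   verify_md5 README.txt 92d655ca32f3f8ec76d77d3c3d3f27b0
```

A mismatch makes the function return a non-zero status, so it can be chained in a script before unpacking the archive.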
- Name
- README.txt
- Size
- 1.28 KB
- Format
- Text file
- MD5
- 92d655ca32f3f8ec76d77d3c3d3f27b0
TMODS:ENG-CZE -- query translation component
============================================

Installation:

1) Download and compile the Moses decoder:
(Requires Boost, libXML and CMPH.)

  git clone https://github.com/moses-smt/mosesdecoder.git
  cd mosesdecoder
  ./bjam --with-xmlrpc=<path-to-libxml> --max-kenlm-order=12 --with-cmph=<path-to-cmph-library>
  cd ..

2) Download and configure MTMonkey:
(See MTMonkey documentation for installation requirements.)

  git clone https://github.com/ufal/mtmonkey
  cd mtmonkey
  git checkout chimera_preprocessing # the tested commit ID is d0e3175ee112a3fdd4790ccd1b0ff4e5d90c8d04
  cd ..

3) Install MorphoDita and its Python bindings. Follow the steps described here:
http://ufal.mff.cuni.cz/morphodita/install#python_installation

4) Download Morphodita model for Czech:
http://hdl.handle.net/11858/00-097C-0000-0023-68D8-1

Copy the model in the MTMonkey directory:

  cp czech-morfflex-pdt-131112.tagger-fast mtmonkey

5) Start t . . .