Type: toolService - LINDAT/CLARIAH-CZ Catalog Search Results

241. Tools for Catalan and Spanish corpus processing

Publisher:: IULA, Universitat Pompeu Fabra
Type:: toolService
Subject:: corpus processing
Description:: A package of tools for Catalan and Spanish corpus processing. It includes a text handling module and a probabilistic POS tagger. It also allows consulting POS tagger dictionary data.
Rights:: Not specified

242. Totoli corpus

Publisher:: Sprachwissenschaftliches Institut, Universität Bochum
Type:: toolService
Description:: Documentation of the Totoli project (DoBeS project)
Rights:: Code of conduct

243. Translation Equivalents Extractor

Creator:: Tufiş, Dan, Ion, Radu, and Barbu, Ana-Maria
Publisher:: Research Institute for Artificial Intelligence, Romanian Academy of Sciences
Type:: toolService
Description:: TREQ exploits the knowledge embedded in parallel corpora and produces a set of translation equivalents (a translation lexicon), based on a 1:1 mapping hypothesis. The program uses almost no linguistic knowledge, relying on statistical evidence and some simplifying assumptions. The extraction process is based on a testing approach: it first generates a list of translation equivalent candidates and then successively extracts the most likely translation equivalence pairs. It does not require a pre-existing bilingual lexicon for the considered languages. If such a lexicon exists, however, it can be used to eliminate spurious candidate pairs, which speeds up the process and increases its accuracy. The algorithm relies on some pre-processing of the bitext: a sentence aligner, a tokeniser (MtSeg, http://www.lpl.univaix.fr/projects/multext/MtSeg), a collocation extractor (unaware of translation equivalence), a POS tagger, and a lemmatiser. More detailed descriptions are available in the following papers (http://www.racai.ro/~tufis/papers/):
-- Dan Tufiş and Ana-Maria Barbu (2002). Revealing translators knowledge: statistical methods in constructing practical translation lexicons for language and speech processing. In International Journal of Speech Technology, volume 5, pp. 199-209. Kluwer Academic Publishers, November 2002. ISSN 1381-2416.
-- Dan Tufiş (2002). A cheap and fast way to build useful translation lexicons. In Proceedings of the 19th International Conference on Computational Linguistics (COLING 2002), pp. 1030-1036, Taipei, Taiwan, August 2002. ISBN 1-55860-894.
-- Dan Tufiş and Ana Maria Barbu (2001). Automatic Construction of Translation Lexicons. In V.V.Kluew, C.E. D'Attellis, and N.E. Mastorakis (eds.), Advances in Automation, Multimedia and Video Systems, and Modern Computer Science, pp. 156-161. WSES Press, December 2001. ISSN 1790-5117.
-- Dan Tufiş and Ana Maria Barbu (2001). Extracting Multilingual Lexicons from Parallel Corpora. In Proceedings of the ACH-ALLC conference (ACH-ALLC 2001), New York, USA, June 2001.
-- Dan Tufiş and Ana Maria Barbu (2001). Accurate Automatic Extraction of Translation Equivalents from Parallel Corpora. In Paul Rayson, Andrew Wilson, Tony McEnery, Andrew Hardie, and Shereen Khoja (eds.), Proceedings of the Corpus Linguistics 2001 Conference (CL 2001), pp. 581-586, Lancaster, UK, March 2001. Lancaster University, Computing Department. ISBN 1-86220-107-2.
Rights:: Not specified
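
The two-step procedure described for TREQ (build a candidate list, then successively take the most likely pairs under a 1:1 mapping hypothesis) can be illustrated with a small sketch. This is not TREQ's code: the function name `extract_equivalents`, the `min_score` threshold, and the use of the Dice coefficient as the association score are all assumptions made for illustration, and the optional filtering with a pre-existing lexicon is not modelled.

```python
from collections import Counter
from itertools import product


def extract_equivalents(bitext, min_score=0.5):
    """Greedy 1:1 extraction of translation pairs from aligned sentences.

    bitext: list of (source_tokens, target_tokens) pairs.
    Returns a list of (source_word, target_word, score) tuples.
    Hypothetical helper; the scoring function is an assumption.
    """
    src_freq, tgt_freq, pair_freq = Counter(), Counter(), Counter()
    for src, tgt in bitext:
        src_set, tgt_set = set(src), set(tgt)
        src_freq.update(src_set)
        tgt_freq.update(tgt_set)
        # Every source/target word pair in an aligned sentence co-occurs once.
        pair_freq.update(product(src_set, tgt_set))

    # Step 1: candidate list, scored with the Dice coefficient (assumed here).
    candidates = [
        (2 * n / (src_freq[s] + tgt_freq[t]), s, t)
        for (s, t), n in pair_freq.items()
    ]
    # Best score first; ties are broken alphabetically for determinism.
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))

    # Step 2: accept the best remaining pair repeatedly. Under the 1:1
    # hypothesis each word may appear in at most one accepted pair.
    used_src, used_tgt, lexicon = set(), set(), []
    for score, s, t in candidates:
        if score < min_score:
            break
        if s in used_src or t in used_tgt:
            continue
        lexicon.append((s, t, score))
        used_src.add(s)
        used_tgt.add(t)
    return lexicon


# Toy sentence-aligned, tokenised bitext (invented example data).
bitext = [
    ("the house".split(), "das haus".split()),
    ("the cat".split(), "die katze".split()),
    ("a house".split(), "ein haus".split()),
    ("a cat".split(), "eine katze".split()),
]
lexicon = extract_equivalents(bitext)
```

On the toy data the strongest pairs are those whose words always co-occur, so `house`/`haus` and `cat`/`katze` are accepted first. Lower-scoring pairs such as the articles depend on tie-breaking and would need more data or a better association measure to be resolved reliably.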

244. Translation Models (en-de) (v1.0)

Creator:: Variš, Dušan
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: tool and toolService
Subject:: machine translation, neural machine translation, and transformer
Language:: English and German
Description:: En-De translation models, exported via TensorFlow Serving, available in the Lindat translation service (https://lindat.mff.cuni.cz/services/translation/). Models are compatible with Tensor2tensor version 1.6.6. For details about the model training (data, model hyper-parameters), please contact the archive maintainer. Evaluation on newstest2020 (BLEU): en->de: 25.9 de->en: 33.4 (Evaluated using multeval: https://github.com/jhclark/multeval)
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB

245. Translation Models (en-ru) (v1.0)

Creator:: Variš, Dušan
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: tool and toolService
Subject:: machine translation, neural machine translation, and transformer
Language:: English and Russian
Description:: En-Ru translation models, exported via TensorFlow Serving, available in the Lindat translation service (https://lindat.mff.cuni.cz/services/translation/). Models are compatible with Tensor2tensor version 1.6.6. For details about the model training (data, model hyper-parameters), please contact the archive maintainer. Evaluation on newstest2020 (BLEU): en->ru: 18.0 ru->en: 30.4 (Evaluated using multeval: https://github.com/jhclark/multeval)
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB

246. Translog 2006

Publisher:: Copenhagen Business School
Type:: toolService
Description:: Translog 2006 is the leading tool for analysing human text production processes. It was originally designed for translation process research, but can be used for a variety of personal learning, teaching, and research purposes.
Rights:: Not specified

247. TrEd

Creator:: Pajas, Petr
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: toolService
Subject:: annotation, tree, editor, XML, and PML
Description:: Tree Editor TrEd is a fully customizable and programmable graphical editor and viewer for tree-like structures. Among other projects, it was used as the main annotation tool for syntactical and tectogrammatical annotations in The Prague Dependency Treebank, as well as for decision-tree based morphological annotation of The Prague Arabic Dependency Treebank.
Rights:: GNU General Public License, version 2, http://www.gnu.org/licenses/gpl-2.0.html, and PUB

248. TrEdVoice

Publisher:: University of Western Bohemia, Pilsen and Charles University
Type:: toolService
Description:: The TrEdVoice module is an add-on for the TrEd annotation editor that enables voice control of its functions.
Rights:: Not specified

249. TreeTagger

Publisher:: University of Stuttgart
Type:: toolService
Subject:: POS tagger
Language:: Bulgarian, Dutch, English, French, German, Modern Greek (1453-), Italian, Portuguese, Russian, Spanish, and Swahili (macrolanguage)
Description:: A part-of-speech tagger and lemmatizer for several languages.
Rights:: Not specified

250. Treex::Web

Creator:: Sedlák, Michal
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: toolService and service
Subject:: Treex, Perl, REST, web service, and machine translation
Language:: English and Czech
Description:: Treex::Web is a web frontend for running Treex applications from your browser. Treex (formerly TectoMT) is a highly modular NLP framework implemented in the Perl programming language. It is primarily aimed at machine translation and makes use of the ideas and technology created during the Prague Dependency Treebank project.
Rights:: Not specified
