Search Results
2. CoNLL 2017 Shared Task - UDPipe Baseline Models and Supplementary Materials
- Creator:
- Straka, Milan
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text, mlmodel, and languageDescription
- Subject:
- CoNLL 2017, tokenizer, POS tagger, lemmatization, tagger, parser, dependency parser, morphology, and treebank
- Language:
- Multiple languages
- Description:
- Baseline UDPipe models for the CoNLL 2017 Shared Task in UD Parsing, and supplementary material. The models require UDPipe version 1.1 or later and are evaluated using the official evaluation script. The models are trained on a slightly different split of the official UD 2.0 CoNLL 2017 training data, the so-called baselinemodel split, in order to allow comparison of models even during the shared task. This baselinemodel split of the UD 2.0 CoNLL 2017 training data is available for download. Furthermore, we also provide UD 2.0 CoNLL 2017 training data with automatically predicted morphology: we apply the baseline models to the development data and perform 10-fold jack-knifing (each fold is predicted with a model trained on the remaining folds) on the training data. Finally, we supply all the data and hyperparameter values needed to replicate the baseline models.
- Rights:
- Licence Universal Dependencies v2.0, https://lindat.mff.cuni.cz/repository/xmlui/page/licence-UD-2.0, and PUB
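The 10-fold jack-knifing mentioned in the description above can be sketched roughly as follows. This is a minimal illustration, not the procedure shipped in the archive: the function name `jackknife`, the `train` and `predict` callbacks, and the assignment of sentences to folds by index modulo the fold count are all assumptions made for the example.

```python
from typing import Callable, List, Sequence, TypeVar

S = TypeVar("S")  # a sentence
M = TypeVar("M")  # a trained model
P = TypeVar("P")  # a prediction for one sentence


def jackknife(
    sentences: Sequence[S],
    train: Callable[[List[S]], M],
    predict: Callable[[M, S], P],
    folds: int = 10,
) -> List[P]:
    """Predict every sentence with a model that never saw it in training.

    Fold membership (index modulo `folds`) is an assumption of this sketch.
    """
    predictions: List[P] = [None] * len(sentences)  # type: ignore[list-item]
    for k in range(folds):
        held_out = set(range(k, len(sentences), folds))
        # Train on all sentences outside the current fold.
        rest = [s for i, s in enumerate(sentences) if i not in held_out]
        model = train(rest)
        # Predict only the sentences of the current fold.
        for i in sorted(held_out):
            predictions[i] = predict(model, sentences[i])
    return predictions
```

The point of the scheme is that the predicted morphology on the training data then has roughly the same error profile as predictions on unseen data, which is what a downstream parser will see at test time.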
3. CoNLL 2018 Shared Task - UDPipe Baseline Models and Supplementary Materials
- Creator:
- Straka, Milan
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text, mlmodel, and languageDescription
- Subject:
- CoNLL 2018, tokenizer, POS tagger, lemmatization, tagger, parser, dependency parser, morphology, and treebank
- Language:
- Multiple languages
- Description:
- Baseline UDPipe models for the CoNLL 2018 Shared Task in UD Parsing, and supplementary material. The models require UDPipe version 1.2 or later and are evaluated using the official evaluation script. The models were trained using a custom data split for treebanks where no development data is provided. We also trained an additional "Mixed" model, which uses 200 sentences from each training set. All information needed to replicate the model training (hyperparameters, modified train-dev split, and pre-computed word embeddings for the parser) is included in the archive. Additionally, we provide UD 2.2 CoNLL 2018 training data with automatically predicted morphology: we apply the baseline models to the development data and perform 10-fold jack-knifing (each fold is predicted with a model trained on the remaining folds) on the training data.
- Rights:
- Licence Universal Dependencies v2.2, https://lindat.mff.cuni.cz/repository/xmlui/page/licence-UD-2.2, and PUB
4. Czech PDT-C 1.0 Model for UDPipe 2 (2023-11-16)
- Creator:
- Straka, Milan
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- tool and toolService
- Subject:
- tokenizer, POS tagger, lemmatization, parser, dependency parser, MorfFlex CZ 2.0, and PDT-C 1.0
- Language:
- Czech
- Description:
- Tokenizer, POS Tagger, Lemmatizer, and Parser model based on the PDT-C 1.0 treebank (https://hdl.handle.net/11234/1-3185). The model documentation including performance can be found at https://ufal.mff.cuni.cz/udpipe/2/models#czech_pdtc1.0_model . To use these models, you need UDPipe version 2.1, which you can download from https://ufal.mff.cuni.cz/udpipe/2 .
- Rights:
- Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
5. MSTperl parser
- Creator:
- Rosa, Rudolf
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- toolService and tool
- Subject:
- parser, NLP, Treex, parsing, and dependency
- Language:
- Czech and English
- Description:
- MSTperl is a Perl reimplementation of the MST parser of Ryan McDonald (http://www.seas.upenn.edu/~strctlrn/MSTParser/MSTParser.html). MST parser (Maximum Spanning Tree parser) is a state-of-the-art natural language dependency parser -- a tool that takes a sentence and returns its dependency tree. In MSTperl, only some functionality was implemented; the limitations include the following: the parser is a non-projective one, currently with no possibility of enforcing the requirement of projectivity of the parse trees; only first-order features are supported, i.e. no second-order or third-order features are possible; the implementation of MIRA is that of a single-best MIRA, with a closed-form update instead of using quadratic programming. On the other hand, the parser supports several advanced features: parallel features, i.e. enriching the parser input with a word-aligned sentence in another language; adding large-scale information, i.e. the feature set enriched with features corresponding to pointwise mutual information of word pairs in a large corpus (CzEng). The MSTperl parser is tuned for parsing Czech. Trained models are available for Czech, English and German. We can train the parser for other languages on demand, or you can train it yourself -- the guidelines are part of the documentation. The parser, together with detailed documentation, is available on CPAN (http://search.cpan.org/~rur/Treex-Parser-MSTperl/). The research has been supported by the EU Seventh Framework Programme under grant agreement 247762 (Faust), and by the grants GAUK116310 and GA201/09/H057.
- Rights:
- Artistic License 2.0, http://opensource.org/licenses/Artistic-2.0, and PUB
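The "single-best MIRA with a closed-form update" named in the description can be illustrated with the standard 1-best update rule, in which the step size has an explicit formula and no quadratic program needs to be solved. The sketch below is in Python rather than Perl and is not taken from MSTperl: the function name `mira_update`, the sparse feature dictionaries, and the optional `cap` on the step size are assumptions made for the example.

```python
from collections import Counter
from typing import Dict


def mira_update(
    weights: Dict[str, float],
    gold_feats: Dict[str, int],
    pred_feats: Dict[str, int],
    loss: float,
    cap: float = float("inf"),
) -> float:
    """Apply one single-best MIRA step in place and return the step size tau.

    tau = max(0, (loss - margin) / ||f(gold) - f(pred)||^2), where
    margin = w . (f(gold) - f(pred)). `cap` is a hypothetical upper bound.
    """
    diff = Counter(gold_feats)
    diff.subtract(pred_feats)  # f(gold) - f(pred), may contain negatives
    sq_norm = sum(v * v for v in diff.values())
    if sq_norm == 0:
        return 0.0  # identical feature vectors: nothing to update
    margin = sum(weights.get(f, 0.0) * v for f, v in diff.items())
    tau = min(cap, max(0.0, (loss - margin) / sq_norm))
    for f, v in diff.items():
        if v:
            weights[f] = weights.get(f, 0.0) + tau * v
    return tau
```

Starting from empty weights, with the gold tree firing feature `a`, the predicted tree firing feature `b`, and a loss of 1, the update yields a step size of 0.5; afterwards the gold tree outscores the prediction by exactly the loss, so repeating the same update changes nothing.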
6. MSTperl parser (2015-05-19)
- Creator:
- Rosa, Rudolf
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- toolService and tool
- Subject:
- parser, NLP, Treex, parsing, and dependency
- Language:
- Czech and English
- Description:
- MSTperl is a Perl reimplementation of the MST parser of Ryan McDonald (http://www.seas.upenn.edu/~strctlrn/MSTParser/MSTParser.html). MST parser (Maximum Spanning Tree parser) is a state-of-the-art natural language dependency parser -- a tool that takes a sentence and returns its dependency tree. In MSTperl, only some functionality was implemented; the limitations include the following: the parser is a non-projective one, currently with no possibility of enforcing the requirement of projectivity of the parse trees; only first-order features are supported, i.e. no second-order or third-order features are possible; the implementation of MIRA is that of a single-best MIRA, with a closed-form update instead of using quadratic programming. On the other hand, the parser supports several advanced features: parallel features, i.e. enriching the parser input with a word-aligned sentence in another language; adding large-scale information, i.e. the feature set enriched with features corresponding to pointwise mutual information of word pairs in a large corpus (CzEng); weighted/unweighted parser model interpolation; combination of several instances of the MSTperl parser (through MST algorithm); combination of several existing parses from any parsers (through MST algorithm). The MSTperl parser is tuned for parsing Czech. Trained models are available for Czech, English and German. We can train the parser for other languages on demand, or you can train it yourself -- the guidelines are part of the documentation. The parser, together with detailed documentation, is available on CPAN (http://search.cpan.org/~rur/Treex-Parser-MSTperl/). The research has been supported by the EU Seventh Framework Programme under grant agreement 247762 (Faust), and by the grants GAUK116310 and GA201/09/H057.
- Rights:
- Artistic License 2.0, http://opensource.org/licenses/Artistic-2.0, and PUB
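The pointwise mutual information of word pairs referred to in the description has a standard definition, PMI(x, y) = log2(p(x, y) / (p(x) p(y))). The sketch below computes it from a list of observed pairs. How MSTperl extracts pairs from CzEng or turns the scores into features is not described in the record, so the function name `pmi_table` and the input format are assumptions made for the example.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def pmi_table(pairs: List[Tuple[str, str]]) -> Dict[Tuple[str, str], float]:
    """Return PMI in bits for every distinct (left, right) pair observed.

    Probabilities are plain relative frequencies over `pairs`; no smoothing
    is applied, so rare pairs get unreliable scores.
    """
    n = len(pairs)
    pair_counts = Counter(pairs)
    left_counts = Counter(left for left, _ in pairs)
    right_counts = Counter(right for _, right in pairs)
    return {
        (x, y): math.log2(c * n / (left_counts[x] * right_counts[y]))
        for (x, y), c in pair_counts.items()
    }
```

A pair whose members always occur together scores above zero, while a pair whose members co-occur exactly as often as chance predicts scores zero.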
7. Parsito
- Creator:
- Straka, Milan
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- toolService and tool
- Subject:
- parser and dependency parser
- Language:
- English
- Description:
- Parsito is a fast open-source dependency parser written in C++. Parsito is based on greedy transition-based parsing; it has very high accuracy and achieves a throughput of 30K words per second. Parsito can be trained on any input data without feature engineering, because it uses an artificial neural network classifier. Trained models for all treebanks from the Universal Dependencies project are available (37 treebanks as of Dec 2015). Parsito is free software under the Mozilla Public License 2.0 (http://www.mozilla.org/MPL/2.0/), and the linguistic models are free for non-commercial use and distributed under the CC BY-NC-SA (http://creativecommons.org/licenses/by-nc-sa/4.0/) license, although for some models the original data used to create the model may impose additional licensing conditions. The Parsito website http://ufal.mff.cuni.cz/parsito contains download links for both the released packages and the trained models, hosts documentation and offers an online demo. The Parsito development repository http://github.com/ufal/parsito is hosted on GitHub.
- Rights:
- Mozilla Public License 2.0, http://opensource.org/licenses/MPL-2.0, and PUB
8. UDPipe
- Creator:
- Straka, Milan and Straková, Jana
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- tool and toolService
- Subject:
- tokenizer, POS tagger, tagger, lemmatization, parser, dependency parser, and CoNLL-U
- Language:
- English
- Description:
- UDPipe is a trainable pipeline for tokenization, tagging, lemmatization and dependency parsing of CoNLL-U files. UDPipe is language-agnostic and can be trained given only annotated data in CoNLL-U format. Trained models are provided for nearly all UD treebanks. UDPipe is available as a binary, as a library for C++, Python, Perl, Java and C#, and as a web service. UDPipe is free software under the Mozilla Public License 2.0 (http://www.mozilla.org/MPL/2.0/), and the linguistic models are free for non-commercial use and distributed under the CC BY-NC-SA (http://creativecommons.org/licenses/by-nc-sa/4.0/) license, although for some models the original data used to create the model may impose additional licensing conditions. UDPipe is versioned using Semantic Versioning (http://semver.org/). The UDPipe website http://ufal.mff.cuni.cz/udpipe contains download links for both the released packages and the trained models, hosts documentation and offers an online demo. The UDPipe development repository http://github.com/ufal/udpipe is hosted on GitHub.
- Rights:
- Mozilla Public License 2.0, http://opensource.org/licenses/MPL-2.0, and PUB
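The CoNLL-U format that UDPipe reads and writes is a plain-text format with one token per line, ten tab-separated columns, comment lines starting with `#`, and a blank line after each sentence. A minimal reader, independent of UDPipe itself, might look like the sketch below; the function name `read_conllu` and the lower-case field names are assumptions made for the example, and values are kept as raw strings.

```python
from typing import Dict, List

# The ten CoNLL-U columns, in order.
CONLLU_FIELDS = [
    "id", "form", "lemma", "upos", "xpos",
    "feats", "head", "deprel", "deps", "misc",
]


def read_conllu(text: str) -> List[List[Dict[str, str]]]:
    """Split CoNLL-U text into sentences, each a list of token dicts."""
    sentences: List[List[Dict[str, str]]] = []
    current: List[Dict[str, str]] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment, ignored in this sketch
        columns = line.split("\t")
        if len(columns) != len(CONLLU_FIELDS):
            raise ValueError(f"expected 10 columns, got {len(columns)}")
        current.append(dict(zip(CONLLU_FIELDS, columns)))
    if current:
        sentences.append(current)
    return sentences
```

Multiword-token ranges such as `1-2` and empty nodes such as `1.1` are returned like any other line; a real consumer would usually filter or handle them separately.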
9. UDPipe 2
- Creator:
- Straka, Milan and Straková, Jana
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- tool and toolService
- Subject:
- POS tagger, tagger, lemmatization, parser, dependency parser, and CoNLL-U
- Description:
- UDPipe 2 is a POS tagger, lemmatizer and dependency parser. Compared to UDPipe 1, UDPipe 2 is Python-only and tested only on Linux; is meant as a research tool, not as a user-friendly UDPipe 1 replacement; achieves much better performance, but requires a GPU to run at a reasonable speed; and does not perform tokenization by itself, using UDPipe 1 for that. UDPipe 2 is available in the udpipe-2 branch of the UDPipe repository at https://github.com/ufal/udpipe/tree/udpipe-2. It is free software under the Mozilla Public License 2.0 (http://www.mozilla.org/MPL/2.0/), and the models are free for non-commercial use and distributed under the CC BY-NC-SA (http://creativecommons.org/licenses/by-nc-sa/4.0/) license, although for some models the original data used to create the model may impose additional licensing conditions. UDPipe 2 is also available as a REST service running at https://lindat.mff.cuni.cz/services/udpipe. To interact with it, you can use the script at https://github.com/ufal/udpipe/blob/udpipe-2/udpipe2_client.py.
- Rights:
- Not specified
10. Universal Dependencies 1.2 Models for Parsito
- Creator:
- Straka, Milan
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- tool and toolService
- Subject:
- parser and dependency parser
- Language:
- English
- Description:
- Parsing models for all Universal Dependencies 1.2 treebanks, created solely using UD 1.2 data (http://hdl.handle.net/11234/1-1548). To use these models, you need the Parsito binary, which you can download from http://hdl.handle.net/11234/1-1584.
- Rights:
- Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB