Newly added
lexicalConceptualResource

Description:
The valency lexicon PDT-Vallex 4.0 has been built in close connection with the annotation of the Prague Dependency Treebank project (PDT) and its successors (mainly the Prague Czech-English Dependency Treebank project, ...
This record contains 1 file (1.61 MB).
Publicly Available




corpus

Description:
The Sequoia corpus is a set of 3,099 linguistically annotated French sentences, originating from four sources (Europarl, European Agency Reports, the French regional newspaper L'Est Républicain, and French Wikipedia). Several ...
This record contains 1 file (4.37 MB).
Publicly Available


lexicalConceptualResource

Description:
MorfFlex CZ 2.0 is the Czech morphological dictionary developed originally by Jan Hajič as a spelling checker and lemmatization dictionary. MorfFlex is a flat list of lemma-tag-wordform triples. For each wordform, full ...
This record contains 1 file (234.84 MB).
Publicly Available




Most visited records
In the past week
corpus

Description:
Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and ...
This record contains 3 files (479.93 MB).
Publicly Available


languageDescription

Description:
Automatic segmentation, tokenization and morphological and syntactic annotations of raw texts in 45 languages, generated by UDPipe (http://ufal.mff.cuni.cz/udpipe), together with word embeddings of dimension 100 computed ...
This record contains 47 files (629.67 GB).
Publicly Available




toolService

Description:
Tokenizer, POS Tagger, Lemmatizer and Parser models for 94 treebanks of 61 languages of Universal Dependencies 2.5 Treebanks, created solely using UD 2.5 data (http://hdl.handle.net/11234/1-3105). The model documentation ...
This record contains 96 files (2.61 GB).
Publicly Available



