Newly added
toolService

Description:
Software for corpus linguists and text/data mining enthusiasts. The CorpusExplorer combines over 45 interactive visualizations under a user-friendly interface. Routine tasks such as text acquisition, cleaning or tagging ...
This record contains no files.
corpus

Description:
The corpus presented consists of job ads in Spanish related to Engineering positions in Peru.
The documents were preprocessed and annotated for POS tagging, NER, and topic modeling tasks.
The corpus is divided in two ...
This record contains 1 file (10.99 MB).
Publicly Available


corpus

Description:
Corpus of texts in 12 languages. For each language, we provide one training, one development and one testing set acquired from Wikipedia articles. Moreover, each language dataset contains (substantially larger) training ...
This record contains 13 files (17.37 GB).
Publicly Available




Most visited records
In the past week
languageDescription

Description:
Automatic segmentation, tokenization and morphological and syntactic annotations of raw texts in 45 languages, generated by UDPipe (http://ufal.mff.cuni.cz/udpipe), together with word embeddings of dimension 100 computed ...
This record contains 46 files (629.66 GB).
Publicly Available




corpus

Description:
Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and ...
This record contains 4 files (399.22 MB).
Publicly Available


corpus

Description:
Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and ...
This record contains 3 files (274.16 MB).
Publicly Available

