Recently added

toolService
Description:
Software for corpus linguists and text/data mining enthusiasts. The CorpusExplorer combines over 45 interactive visualizations under a user-friendly interface. Routine tasks such as text acquisition, cleaning or tagging ...
This record contains no files.
corpus
Description:
The corpus presented consists of job ads in Spanish related to Engineering positions in Peru. The documents were preprocessed and annotated for POS tagging, NER, and topic modeling tasks. The corpus is divided in two ...
This record contains 1 file (10.99 MB).
Publicly Available, Distributed under Creative Commons, Attribution Required
corpus
Description:
Corpus of texts in 12 languages. For each language, we provide one training, one development and one testing set acquired from Wikipedia articles. Moreover, each language dataset contains (substantially larger) training ...
This record contains 13 files (17.37 GB).
Publicly Available, Distributed under Creative Commons, Attribution Required, Noncommercial, Share Alike

Most visited records

In the past week
languageDescription
Description:
Automatic segmentation, tokenization and morphological and syntactic annotations of raw texts in 45 languages, generated by UDPipe (http://ufal.mff.cuni.cz/udpipe), together with word embeddings of dimension 100 computed ...
This record contains 46 files (629.66 GB).
Publicly Available, Distributed under Creative Commons, Attribution Required, Noncommercial, Share Alike
corpus
Description:
A slightly modified version of the Czech WordNet. This is the version used to annotate "The Lexico-Semantic Annotation of PDT using Czech WordNet": http://hdl.handle.net/11858/00-097C-0000-0001-487A-4 The Czech WordNet ...
This record contains 1 file (440.85 KB).
Publicly Available, Distributed under Creative Commons, Attribution Required, Noncommercial, Share Alike
corpus
Description:
Three additional Czech reference translations of the whole WMT 2011 data set (http://www.statmt.org/wmt11/test.tgz), translated from the German originals. The original segmentation of the WMT 2011 data is preserved.
This record contains 1 file (527.44 KB).
Publicly Available, Distributed under Creative Commons, Attribution Required, Noncommercial, Share Alike