Recently added
corpus

Description:
Deep Universal Dependencies is a collection of treebanks derived semi-automatically from Universal Dependencies (http://hdl.handle.net/11234/1-3424). It contains additional deep-syntactic and semantic annotations. Version ...
This record contains 1 file (1.09 GB).
Publicly Available


corpus

Description:
The database contains annotated reflective sentences, which fall into the categories of reflective writing according to Ullmann's (2019) model. The dataset is ready to replicate these categories' prediction using machine ...
This record contains 2 files (2.97 MB).
Publicly Available


corpus

Description:
Data
-------
Malayalam Visual Genome (MVG for short) 1.0 has goals similar to those of Hindi Visual Genome (HVG) 1.1: to support the Malayalam language. Malayalam Visual Genome 1.0 is the first multi-modal dataset in Malayalam ...
This record contains 6 files (4.33 GB).
Publicly Available




Most visited records
In the past week
corpus

Description:
Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and ...
This record contains 3 files (479.93 MB).
Publicly Available


languageDescription

Description:
Automatic segmentation, tokenization and morphological and syntactic annotations of raw texts in 45 languages, generated by UDPipe (http://ufal.mff.cuni.cz/udpipe), together with word embeddings of dimension 100 computed ...
This record contains 47 files (629.67 GB).
Publicly Available




toolService

Description:
Tokenizer, POS Tagger, Lemmatizer and Parser models for 94 treebanks of 61 languages of Universal Dependencies 2.5 Treebanks, created solely using UD 2.5 data (http://hdl.handle.net/11234/1-3105). The model documentation ...
This record contains 96 files (2.61 GB).
Publicly Available



