Bavaria's Dialects Online (BDO) is the digital language information system of the three projects "Bavarian Dictionary", "Franconian Dictionary", and "Dialectological Information System of Bavarian Swabia". The database brings together the results of dialect research and presents dictionary articles as well as research data in a freely accessible online tool.
BDO is aimed not only at scholars but also at members of the general public with an interest in language. Here, the vocabulary of all dialects of Bavaria is collected in one place and made accessible. The system presents the richness of Bavaria's dialects side by side: with the new database, one will be able to compare the dialect vocabulary of Old Bavaria, Franconia and Swabia. Authentic dialect evidence is used to illustrate the dialect words in their variety of meanings and regional distribution, as well as to show their use in idioms, proverbs, and much more. BDO offers a new perspective on the vocabulary of the dialects of all parts of the state of Bavaria.
This resource is an Italian morphological dictionary for content words, encoded as a text file in JSON Lines format. It contains correspondences between surface forms and lexical forms of words, followed by grammatical features. The surface word forms have been generated algorithmically using stable phonological and morphological rules of the Italian language. Particular attention has been given to the generation of verbs, for which rules have been extracted from A. L. and G. Lepschy, La lingua italiana. With its broad coverage, the dictionary is particularly useful when used together with the Italian Function Words (http://hdl.handle.net/11372/LRT-2288) for tasks such as POS tagging or syntactic parsing.
This resource is the second version of an Italian morphological dictionary for content words, encoded as a text file in JSON Lines format. It contains correspondences between surface forms and lexical forms of words, followed by standard grammatical properties. Compared to the first release, this version has a better JSON structure. The surface word forms have been generated algorithmically using stable phonological and morphological rules of the Italian language. Particular attention has been given to the generation of verbs, for which rules have been extracted from A. L. and G. Lepschy, La lingua italiana. With its broad coverage, the dictionary is particularly useful when used together with the Italian Function Words v2 (http://hdl.handle.net/11372/LRT-2629) for tasks such as POS tagging or syntactic parsing.
This resource is the third version of the Italian morphological dictionary for content words (http://hdl.handle.net/11372/LRT-2630), encoded in JSON Lines format. Compared to the previous version, it contains some minor improvements.
This dictionary is a curated list of Italian function words, encoded as a text file in JSON Lines format and particularly useful for tasks such as POS tagging or syntactic parsing. It contains 999 single-word forms and 2501 multi-word forms. Each entry may have the following grammatical features: lemma, pos, mood, tense, person, number, gender, case, degree.
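As a rough illustration of how a JSON Lines dictionary of this kind might be consumed, the following Python sketch reads one JSON object per line and counts entries per part of speech. The record layout is an assumption: the key "form" and the tag values are hypothetical, and only feature names such as "lemma" and "pos" come from the description above. The sample entries are invented for the example and are not taken from the dictionary.

```python
import io
import json
from collections import Counter


def read_jsonl(stream):
    """Yield one parsed JSON value per non-empty line of the stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


def count_by_pos(entries):
    """Count entries per 'pos' value; entries without 'pos' are counted under None."""
    return Counter(entry.get("pos") for entry in entries)


# Hypothetical sample: the real file's key names and tag values may differ.
SAMPLE = io.StringIO(
    '{"form": "di", "lemma": "di", "pos": "ADP"}\n'
    '{"form": "il", "lemma": "il", "pos": "DET", "number": "Sing", "gender": "Masc"}\n'
    '{"form": "a causa di", "lemma": "a causa di", "pos": "ADP"}\n'
)

entries = list(read_jsonl(SAMPLE))
pos_counts = count_by_pos(entries)

# Multi-word forms are identified here simply by the presence of a space.
multiword = [e["form"] for e in entries if " " in e["form"]]
```

To run this against the actual resource, SAMPLE would be replaced by an open file handle in UTF-8, and the key names adjusted to whatever the file really uses.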
This dictionary is the second version of 11372/LRT-2288, a curated list of Italian function words, encoded as a text file in JSON Lines format and particularly useful for tasks such as POS tagging or syntactic parsing. It contains 999 single-word forms and 2501 multi-word forms. Each entry may have the following grammatical features: lemma, pos, mood, tense, person, number, gender, case, degree. Compared to the first release, this version has a clearer JSON structure.
This dictionary is the third version of 11372/LRT-2288, a curated list of Italian function words, encoded as a text file in JSON Lines format and particularly useful for tasks such as part-of-speech tagging or syntactic parsing. Compared to the previous release, this version includes some minor improvements.
NomVallex 2.0 is a manually annotated valency lexicon of Czech nouns and adjectives, created in the theoretical framework of the Functional Generative Description and based on corpus data (the SYN series of corpora from the Czech National Corpus and the Araneum Bohemicum Maximum corpus). In total, NomVallex comprises 1027 lexical units contained in 570 lexemes, covering the following parts of speech and derivational categories: deverbal or deadjectival nouns, and deverbal, denominal, deadjectival or primary adjectives. Valency properties of a lexical unit are captured in a valency frame (modeled as a sequence of valency slots, each supplemented with a list of morphemic forms) and documented by corpus examples. To make it possible to study the relationship between the valency behavior of base words and that of their derivatives, lexical units of nouns and adjectives in NomVallex are linked to their respective base lexical units (contained either in NomVallex itself or, in the case of verbs, in the VALLEX lexicon), linking up to three parts of speech (i.e., noun – verb, adjective – verb, noun – adjective, and noun – adjective – verb).
In order to facilitate comparison, this submission also contains abbreviated entries of the base verbs of these nouns and adjectives from the VALLEX lexicon and simplified entries of the covered nouns and adjectives from the PDT-Vallex lexicon.
The NomVallex I. lexicon describes the valency of Czech deverbal nouns belonging to three semantic classes, i.e. Communication (dotaz 'question'), Mental Action (plán 'plan') and Psych State (nenávist 'hatred'). It covers both stem-nominals and root-nominals (dotazování se 'asking' and dotaz 'question', respectively). In total, the lexicon includes 505 lexical units in 248 lexemes. Valency properties are captured in the form of valency frames, specifying valency slots and their morphemic forms, and are exemplified by corpus examples.
In order to facilitate comparison, this submission also contains abbreviated entries of the source verbs of these nouns from the Vallex lexicon and simplified entries of the covered nouns from the PDT-Vallex lexicon.
ParaDi 2.0 is a dictionary of single-verb paraphrases of Czech verbal multiword expressions, namely light verb constructions and idiomatic verb constructions. In addition, it provides an elaborate set of morphological, syntactic and semantic features, including information on aspectual counterparts of verbs and on the paraphrasability conditions of given verbs.
The format of ParaDi has been designed for both human and machine readability: the dictionary is represented as a plain table in TSV, a flexible and language-independent data format.
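To illustrate what machine readability of a TSV table means in practice, the Python sketch below loads a tab-separated table with the standard csv module and builds a lookup from multiword expression to single-verb paraphrase. The presence of a header row, the column names "mwe", "paraphrase" and "type", and the sample rows are all assumptions made for the example; they are not taken from ParaDi itself.

```python
import csv
import io


def read_tsv(stream):
    """Read a tab-separated table with a header row into a list of dicts."""
    # QUOTE_NONE treats quote characters as ordinary text, as is usual for plain TSV.
    reader = csv.DictReader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    return list(reader)


# Hypothetical sample: column names and rows are invented for illustration.
SAMPLE = io.StringIO(
    "mwe\tparaphrase\ttype\n"
    "dát radu\tporadit\tLVC\n"
    "mít radost\tradovat se\tLVC\n"
)

rows = read_tsv(SAMPLE)

# Map each multiword expression to its single-verb paraphrase.
paraphrases = {row["mwe"]: row["paraphrase"] for row in rows}
```

For the real file, SAMPLE would be replaced by a file opened with encoding="utf-8" and newline="", and the column names adapted to the actual header.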