Search Results
32. Frequency list: Early Modern Finnish
- Publisher:
- The Research Institute for the Languages of Finland
- Type:
- toolService
- Subject:
- word frequencies
- Language:
- Finnish
- Description:
- Frequency list of the Corpus of Early Modern Finnish, 4 862 190 words
- Rights:
- Not specified
33. Frequency list: Old Literary Finnish
- Publisher:
- The Research Institute for the Languages of Finland
- Type:
- toolService
- Language:
- Finnish
- Description:
- Frequency list of the Corpus of Old Literary Finnish, 3 425 382 words
- Rights:
- Not specified
34. Glossarium Latinitatis Medii Aevi Finlandicae
- Creator:
- Hakamies, Reino
- Type:
- text and dictionaries
- Subject:
- Latin, dictionaries, Medieval Latin, and language dictionaries
- Language:
- Finnish
- Rights:
- unknown
35. HamleDT 2.0
- Creator:
- Zeman, Daniel, Mareček, David, Mašek, Jan, Popel, Martin, Ramasamy, Loganathan, Rosa, Rudolf, Štěpánek, Jan, and Žabokrtský, Zdeněk
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- treebank, Stanford dependencies, Prague dependencies, harmonization, common annotation style, and Interset
- Language:
- Arabic, Bulgarian, Bengali, Catalan, Czech, Danish, German, Modern Greek (1453-), English, Spanish, Estonian, Basque, Persian, Finnish, Ancient Greek (to 1453), Hindi, Hungarian, Italian, Japanese, Latin, Dutch, Portuguese, Romanian, Russian, Slovak, Slovenian, Swedish, Tamil, Telugu, and Turkish
- Description:
- HamleDT 2.0 is a collection of 30 existing treebanks harmonized into a common annotation style, the Prague Dependencies, and further transformed into Stanford Dependencies, a treebank annotation style that became popular recently. We use the newest basic Universal Stanford Dependencies, without added language-specific subtypes.
- Rights:
- HamleDT 2.0 Licence Agreement, https://lindat.mff.cuni.cz/repository/xmlui/page/licence-hamledt-2.0, and ACA
36. HamleDT 3.0
- Creator:
- Zeman, Daniel, Mareček, David, Mašek, Jan, Popel, Martin, Ramasamy, Loganathan, Rosa, Rudolf, Štěpánek, Jan, and Žabokrtský, Zdeněk
- Publisher:
- Charles University
- Type:
- text and corpus
- Subject:
- annotated corpus, morphology, syntax, dependency, treebank, harmonized annotation, and common annotation style
- Language:
- Arabic, Basque, Bengali, Bulgarian, Catalan, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Modern Greek (1453-), Ancient Greek (to 1453), Hebrew, Hindi, Hungarian, Indonesian, Irish, Italian, Japanese, Latin, Persian, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Spanish, Swedish, Tamil, Telugu, and Turkish
- Description:
- HamleDT (HArmonized Multi-LanguagE Dependency Treebank) is a compilation of existing dependency treebanks (or dependency conversions of other treebanks), transformed so that they all conform to the same annotation style. This version uses Universal Dependencies as the common annotation style. Update (November 2017): for a current collection of harmonized dependency treebanks, we recommend using the Universal Dependencies (UD). All of the corpora that are distributed in HamleDT in full are also part of the UD project; only some corpora from the Patch group (where HamleDT provides only the harmonizing scripts but not the full corpus data) are available in HamleDT but not in UD.
- Rights:
- HamleDT 3.0 License Terms, https://lindat.mff.cuni.cz/repository/xmlui/page/licence-hamledt-3.0, and PUB
37. Intas corpus
- Publisher:
- Department of Languages, University of Jyväskylä
- Type:
- corpus
- Language:
- Dutch, Finnish, and Russian
- Description:
- A corpus of spontaneous discussions and read-aloud performances from native speakers of different ages. Parallel corpus in Russian, Finnish, and Dutch.
- Rights:
- Not specified
38. Itäisen Keski-Euroopan etninen järjestely toisen maailmansodan jälkeen
- Creator:
- Halmesvirta, Anssi
- Subject:
- migration; national minorities; national transfers; Hungarian Germans; international politics; world history from 1945 to the present; migration, emigration, colonization; and nationalities, relations between nationalities and national movements
- Language:
- Finnish
- Description:
- The ethnic reorganisation of Eastern Central Europe after World War II.
- Rights:
- unknown
39. IWPT 2020 Shared Task Data and System Outputs
- Creator:
- Zeman, Daniel, Bouma, Gosse, and Seddah, Djamé
- Publisher:
- Universal Dependencies Consortium
- Type:
- text and corpus
- Subject:
- treebank, dependency, syntax, enhanced universal dependencies, shared task, and parsing
- Language:
- Arabic, Bulgarian, Czech, Dutch, English, Estonian, Finnish, French, Italian, Latvian, Lithuanian, Polish, Russian, Slovak, Swedish, Tamil, and Ukrainian
- Description:
- This package contains data used in the IWPT 2020 shared task. It contains training, development and test (evaluation) datasets. The data is based on a subset of Universal Dependencies release 2.5 (http://hdl.handle.net/11234/1-3105) but some treebanks contain additional enhanced annotations. Moreover, not all of these additions became part of Universal Dependencies release 2.6 (http://hdl.handle.net/11234/1-3226), which makes the shared task data unique and worth a separate release to enable later comparison with new parsing algorithms. The package also contains a number of Perl and Python scripts that have been used to process the data during preparation and during the shared task. Finally, the package includes the official primary submission of each team participating in the shared task.
- Rights:
- Licence Universal Dependencies v2.5, https://lindat.mff.cuni.cz/repository/xmlui/page/licence-UD-2.5, and PUB
40. IWPT 2021 Shared Task Data and System Outputs
- Creator:
- Zeman, Daniel, Bouma, Gosse, and Seddah, Djamé
- Publisher:
- Universal Dependencies Consortium
- Type:
- text and corpus
- Subject:
- treebank, dependency, syntax, enhanced universal dependencies, shared task, and parsing
- Language:
- Arabic, Bulgarian, Czech, Dutch, English, Estonian, Finnish, French, Italian, Latvian, Lithuanian, Polish, Russian, Slovak, Swedish, Tamil, and Ukrainian
- Description:
- This package contains data used in the IWPT 2021 shared task. It contains training, development and test (evaluation) datasets. The data is based on a subset of Universal Dependencies release 2.7 (http://hdl.handle.net/11234/1-3424) but some treebanks contain additional enhanced annotations. Moreover, not all of these additions became part of Universal Dependencies release 2.8 (http://hdl.handle.net/11234/1-3687), which makes the shared task data unique and worth a separate release to enable later comparison with new parsing algorithms. The package also contains a number of Perl and Python scripts that have been used to process the data during preparation and during the shared task. Finally, the package includes the official primary submission of each team participating in the shared task.
- Rights:
- Licence Universal Dependencies v2.7, https://lindat.mff.cuni.cz/repository/xmlui/page/license-ud-2.7, and PUB