Harvested from: LINDAT/CLARIAH-CZ repository - LINDAT/CLARIAH-CZ Catalog Search Results

41. AKCES 5 (CzeSL-SGT) Release 2

Creator:: Šebesta, Karel, Bedřichová, Zuzanna, Šormová, Kateřina, Štindlová, Barbora, Hrdlička, Milan, Hrdličková, Tereza, Hana, Jiří, Petkevič, Vladimír, Jelínek, Tomáš, Škodová, Svatava, Poláčková, Marie, Janeš, Petr, Lundáková, Kateřina, Skoumalová, Hana, Sládek, Šimon, Pierscieniak, Piotr, Toufarová, Dagmar, Richter, Michal, Straka, Milan, and Rosen, Alexandr
Publisher:: Charles University
Type:: text and corpus
Subject:: learner corpus, Czech as a foreign language, Czech language acquisition corpora, AKCES, non-native speakers, and second language acquisition
Language:: Czech
Description:: Essays written by non-native learners of Czech, a part of AKCES/CLAC – Czech Language Acquisition Corpora. CzeSL-SGT stands for Czech as a Second Language with Spelling, Grammar and Tags. The corpus extends the “foreign” (ciz) part of AKCES 3 (CzeSL-plain) with texts collected in 2013. Original forms and automatic corrections are tagged, lemmatized and assigned error labels. Most texts have metadata attributes (30 items) about the author and the text. Release 2 fixes a few minor bugs and one critical issue in Release 1: the native speakers of Ukrainian (s_L1:"uk") were wrongly labelled as speakers of "other European languages" (s_L1_group="IE"), instead of speakers of a Slavic language (s_L1_group="S"). The file is now a regular XML document, with all annotation represented as XML attributes.
Rights:: Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0), http://creativecommons.org/licenses/by-sa/3.0/, and PUB
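
The Release 1 labelling issue described above can be checked for mechanically. The following is a minimal sketch only: it assumes that `s_L1` and `s_L1_group` appear as XML attributes on some element, and the `<corpus>` and `<doc>` element names in the usage note are hypothetical, since the record does not describe the file's element structure.

```python
import xml.etree.ElementTree as ET


def find_mislabelled(xml_text: str, l1: str = "uk", expected_group: str = "S"):
    """Return elements whose s_L1 equals `l1` but whose s_L1_group differs.

    Assumption: s_L1 and s_L1_group are attributes of the same element.
    """
    root = ET.fromstring(xml_text)
    return [
        element
        for element in root.iter()  # includes the root element itself
        if element.get("s_L1") == l1
        and element.get("s_L1_group") != expected_group
    ]
```

For a made-up input such as `<corpus><doc s_L1="uk" s_L1_group="IE"/><doc s_L1="uk" s_L1_group="S"/></corpus>`, the function would return only the first `doc` element.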

42. AKCES-GEC Grammatical Error Correction Dataset for Czech

Creator:: Šebesta, Karel, Bedřichová, Zuzanna, Šormová, Kateřina, Štindlová, Barbora, Hrdlička, Milan, Hrdličková, Tereza, Hana, Jiří, Petkevič, Vladimír, Jelínek, Tomáš, Škodová, Svatava, Janeš, Petr, Lundáková, Kateřina, Skoumalová, Hana, Sládek, Šimon, Pierscieniak, Piotr, Toufarová, Dagmar, Straka, Milan, Rosen, Alexandr, Náplava, Jakub, and Poláčková, Marie
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: natural language correction, grammatical error correction, and gec
Language:: Czech
Description:: AKCES-GEC is a grammatical error correction corpus for Czech generated from a subset of AKCES. It contains train, dev and test files annotated in M2 format. Note that, in comparison to the CZESL-GEC dataset, this dataset contains separated edits together with their type annotations in M2 format and has twice as many sentences. If you use this dataset, please use the following citation: @article{naplava2019wnut, title={Grammatical Error Correction in Low-Resource Scenarios}, author={N{\'a}plava, Jakub and Straka, Milan}, journal={arXiv preprint arXiv:1910.00353}, year={2019} }
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
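
For readers unfamiliar with M2, the sketch below reads the general M2 layout: an `S` line holding the tokenized sentence, followed by `A` lines of the form `start end|||type|||correction|||...|||annotator`, with blank lines between sentences. The class and function names are illustrative, and the sample in the usage note is made up; the error type labels actually used in AKCES-GEC are not listed in this record.

```python
from dataclasses import dataclass, field


@dataclass
class Edit:
    start: int        # token offset where the edit begins
    end: int          # token offset where the edit ends (exclusive)
    error_type: str   # type label, copied verbatim from the file
    correction: str   # replacement text
    annotator: int    # annotator id (last field of the A line)


@dataclass
class Sentence:
    tokens: list[str]
    edits: list[Edit] = field(default_factory=list)


def parse_m2(text: str) -> list[Sentence]:
    """Parse M2-formatted text into sentences with their edits."""
    sentences: list[Sentence] = []
    for block in text.strip().split("\n\n"):
        current = None
        for line in block.splitlines():
            if line.startswith("S "):
                current = Sentence(tokens=line[2:].split())
                sentences.append(current)
            elif line.startswith("A ") and current is not None:
                fields = line[2:].split("|||")
                start, end = map(int, fields[0].split())
                current.edits.append(
                    Edit(start, end, fields[1], fields[2], int(fields[-1]))
                )
    return sentences
```

Given the made-up block `S This are a test .` followed by `A 1 2|||TYPE|||is|||REQUIRED|||-NONE-|||0`, `parse_m2` yields one sentence with five tokens and a single edit replacing token 1 with `is`.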

46. AlbMoRe Movie Reviews in Albanian

Creator:: Çano, Erion
Publisher:: University of Vienna
Type:: text and corpus
Subject:: sentiment analysis, under-resourced language, and albanian language
Language:: Albanian
Description:: AlbMoRe is a sentiment analysis corpus of movie reviews in Albanian, consisting of 800 records in CSV format. Each record includes a text review retrieved from IMDb and translated into Albanian by the author. It also contains a 0 (negative) or 1 (positive) label added by the author. The corpus is fully balanced, consisting of 400 positive and 400 negative reviews about 67 movies of different genres. The AlbMoRe corpus is released under the CC-BY license (https://creativecommons.org/licenses/by/4.0/). If using the data, please cite the following paper: Çano Erion. AlbMoRe: A Corpus of Movie Reviews for Sentiment Analysis in Albanian. CoRR, abs/2306.08526, 2023. URL https://arxiv.org/abs/2306.08526.
Rights:: Creative Commons - Attribution 4.0 International (CC BY 4.0), PUB, and http://creativecommons.org/licenses/by/4.0/
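
The stated class balance (400 positive, 400 negative) is easy to verify after download. The sketch below counts the 0/1 labels in CSV text; the column name `label` is an assumption, as the record does not give the CSV header.

```python
import csv
import io
from collections import Counter


def label_counts(csv_text: str, label_column: str = "label") -> Counter:
    """Count 0/1 sentiment labels in CSV text that has a header row.

    Assumption: the label column is named `label_column`.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(int(row[label_column]) for row in reader)


def is_balanced(counts: Counter) -> bool:
    """True when negative (0) and positive (1) labels are equally frequent."""
    return counts[0] == counts[1]
```

On the real file, `label_counts` would be expected to report 400 for each label if the header assumption holds.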

47. AlbNER Named Entity Recognition in Albanian

Creator:: Çano, Erion
Publisher:: University of Vienna
Type:: text and corpus
Subject:: named entity recognition, under-resourced languages, and albanian language
Language:: Albanian
Description:: AlbNER is a Named Entity Recognition corpus of Wikipedia sentences in Albanian, consisting of 900 records. The sentence tokens are manually labeled following the CoNLL-2003 shared task annotation scheme (explained at https://aclanthology.org/W03-0419.pdf), which uses the tags I-ORG, B-ORG, I-PER, B-PER, I-LOC, B-LOC, I-MISC, B-MISC and O. AlbNER data are released under the CC-BY license (https://creativecommons.org/licenses/by/4.0/). If using the AlbNER corpus, please cite the following paper: Çano Erion. AlbNER: A Corpus for Named Entity Recognition in Albanian. CoRR, abs/2309.08741, 2023. URL https://arxiv.org/abs/2309.08741.
Rights:: Creative Commons - Attribution 4.0 International (CC BY 4.0), http://creativecommons.org/licenses/by/4.0/, and PUB
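
To show how the I-/B-/O tags listed above group tokens into entities, here is a minimal sketch. It assumes tokens and tags arrive as two parallel lists; the record does not describe the file layout, and the sample in the usage note is made up. An `I-` tag continues the current entity when the type matches, and a `B-` tag or a change of type starts a new one.

```python
def extract_entities(tokens: list[str], tags: list[str]) -> list[tuple[str, str]]:
    """Group tagged tokens into (entity_type, entity_text) pairs."""
    entities: list[tuple[str, str]] = []
    current_type = None
    current_tokens: list[str] = []

    def flush() -> None:
        # Close the entity collected so far, if any.
        if current_type is not None:
            entities.append((current_type, " ".join(current_tokens)))

    for token, tag in zip(tokens, tags):
        if tag == "O":
            flush()
            current_type, current_tokens = None, []
            continue
        prefix, entity_type = tag.split("-", 1)
        if prefix == "I" and entity_type == current_type:
            current_tokens.append(token)  # continue the open entity
        else:
            flush()                       # B- tag or type change: new entity
            current_type, current_tokens = entity_type, [token]
    flush()
    return entities
```

For the tokens `Ana Kola visited Tirana Durrës` tagged `I-PER I-PER O I-LOC B-LOC`, the function returns a PER entity `Ana Kola` and two separate LOC entities, `Tirana` and `Durrës`.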

48. AlbNews Albanian Topic Modeling

Creator:: Çano, Erion
Publisher:: University of Vienna
Type:: text and corpus
Subject:: under-resourced language, albanian language, and topic modeling
Language:: Albanian
Description:: AlbNews is a topic modeling corpus of news headlines in Albanian, consisting of 600 labeled samples and 2600 unlabeled samples. Each labeled sample includes a headline text retrieved from Albanian online news portals. It also contains one of four labels: 'pol' for politics, 'cul' for culture, 'eco' for economy, and 'spo' for sport. Each of the unlabeled samples contains a headline text only. AlbTopic corpus is released under the CC-BY 4.0 license (https://creativecommons.org/licenses/by/4.0/). If using the data, please cite the following paper: Çano Erion, Lamaj Dario. AlbNews: A Corpus of Headlines for Topic Modeling in Albanian. CoRR, abs/2402.04028, 2024. URL: https://arxiv.org/abs/2402.04028.
Rights:: Creative Commons - Attribution 4.0 International (CC BY 4.0), http://creativecommons.org/licenses/by/4.0/, and PUB

49. Alena Poláková (pianist)

Creator:: Veselý, Bohumil
Publisher:: Národní filmový archiv
Type:: video and clip
Subject:: Galerie osobností, Places::Praha::Nové Město::Školská::pavlač domu, Places::Praha::Nové Město::Národní divadlo::zadní vchod, and People::Poláková Alena (1927-)
Language:: No linguistic content
Description:: Pianist Alena Poláková on Bohumil Veselý's balcony.
Rights:: http://creativecommons.org/licenses/by-nc-nd/4.0/, PUB, and Creative Commons - Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)

50. Alex Adolf Jelínek (painter, writer)

Creator:: Veselý, Bohumil
Publisher:: Národní filmový archiv
Type:: video and clip
Subject:: Galerie osobností, Places::Praha::Břevnov::klášter, Places::Praha::Břevnov::Říčanova 876/14::terasa domu, Places::Praha::Břevnov::Říčanova, and People::Alex Jelínek Rudolf (1890-1957)
Language:: No linguistic content
Description:: Painter and writer Adolf Alex Jelínek in front of a house in Prague-Břevnov. Jelínek and his wife on the terrace.
Rights:: http://creativecommons.org/licenses/by-nc-nd/4.0/, PUB, and Creative Commons - Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)

43. Albert Pilát (botanist)

44. Albert Pražák (historian)

45. Alberto Vojtěch Frič (explorer)

