Creator: Hladká, Barbora - LINDAT/CLARIAH-CZ Catalog Search Results

1. Czech Court Decisions Dataset

Creator:: Kríž, Vincent and Hladká, Barbora
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: named entities, annotation, and corpus
Language:: Czech
Description:: We present the Czech Court Decisions Dataset (CCDD) -- a dataset of 300 manually annotated court decisions published by The Supreme Court of the Czech Republic and the Constitutional Court of the Czech Republic.
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB

2. Czech Legal Text Treebank

Creator:: Kríž, Vincent, Hladká, Barbora, and Urešová, Zdeňka
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: treebank, corpus, Czech, legal texts, and legal domain
Language:: Czech
Description:: The Czech Legal Text Treebank (CLTT) is a collection of 1133 manually annotated dependency trees. CLTT consists of two legal documents: The Accounting Act (563/1991 Coll., as amended) and Decree on Double-entry Accounting for undertakers (500/2002 Coll., as amended).
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB

3. Czech Legal Text Treebank 2.0

Creator:: Kríž, Vincent and Hladká, Barbora
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: treebank, Prague dependencies, named entities, and semantic relations
Language:: Czech
Description:: The Czech Legal Text Treebank 2.0 (CLTT 2.0) annotates the same texts as the CLTT 1.0. These texts come from the legal domain and they are manually syntactically annotated. The CLTT 2.0 annotation on the syntactic layer is more elaborate than in the CLTT 1.0 from various aspects. In addition, new annotation layers were added to the data: (i) the layer of accounting entities, and (ii) the layer of semantic entity relations.
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB

4. Czesl - Universal Dependencies Release 0.5

Creator:: Hana, Jiří and Hladká, Barbora
Publisher:: Charles University
Type:: text and corpus
Subject:: learner corpus, syntactic annotation, and universal dependencies
Language:: Czech
Description:: Syntactic annotation of 1600 sentences from the Czesl-MAN corpus using the framework of Universal Dependencies 2.3
Rights:: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0), http://creativecommons.org/licenses/by-sa/4.0/, and PUB

5. Information extraction from EIA documents

Creator:: Lukšová, Ivana and Hladká, Barbora
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: tool and toolService
Subject:: information extraction and rule-based extraction
Language:: Czech
Description:: Environmental impact assessment (EIA) is the formal process used to predict the environmental consequences of a plan. We present a rule-based extraction system to mine Czech EIA documents. The extraction rules work with a set of documents enriched with morphological information and manually created vocabularies of terms supposed to be extracted from the documents, e.g. basic information about the project (address, ID company, ...), data on the impacts and outcomes (waste substances, endangered species, ...), a final opinion. The documents Notice of Intent contains the section BI2 with the information on the scope (capacity) of the plan.
Rights:: GNU General Public Licence, version 3, http://opensource.org/licenses/GPL-3.0, and PUB

6. KUK 0.0

Creator:: Hladká, Barbora, Cinková, Silvie, Kuk, Michal, Mírovský, Jiří, Novotná, Tereza, and Zahálková, Kristýna Nguyen
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: legal texts and court decisions
Language:: Czech
Description:: KUK 0.0 is a pilot version of a corpus of Czech legal and administrative texts designated as data for manual and automatic assessment of accessibility (comprehensibility or clarity) of Czech legal texts.
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB

7. Lingvistika na Matematicko-fyzikální fakultě?

Creator:: Hajič, Jan, Hladká, Barbora, and Panevová, Jarmila
Type:: article, model:article, and TEXT
Language:: Czech
Rights:: http://creativecommons.org/licenses/by-nc-sa/4.0/ and policy:public

8. Migrant Stories

Creator:: Hájek, Martin, Mírovský, Jiří, and Hladká, Barbora
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: corpus
Language:: English
Description:: Migrant Stories is a corpus of 1017 short biographic narratives of migrants supplemented with meta information about countries of origin/destination, the migrant gender, GDP per capita of the respective countries, etc. The corpus has been compiled as a teaching material for data analysis.
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB

9. ParCzech 3.0

Creator:: Kopp, Matyáš, Stankov, Vladislav, Bojar, Ondřej, Hladká, Barbora, and Straňák, Pavel
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: audio and corpus
Subject:: Parliament of the Czech Republic, Chamber of Deputies, stenographic protocols, TEI encoding, and speech corpus
Language:: Czech
Description:: The ParCzech 3.0 corpus is the third version of ParCzech consisting of stenographic protocols that record the Chamber of Deputies’ meetings held in the 7th term (2013-2017) and the current 8th term (2017-Mar 2021). The protocols are provided in their original HTML format, Parla-CLARIN TEI format, and the format suitable for Automatic Speech Recognition. The corpus is automatically enriched with the morphological, syntactic, and named-entity annotations using the procedures UDPipe 2 and NameTag 2. The audio files are aligned with the texts in the annotated TEI files.
Rights:: Public Domain Dedication (CC Zero), http://creativecommons.org/publicdomain/zero/1.0/, and PUB

10. ParCzech PS7 1.0

Creator:: Hladká, Barbora, Kopp, Matyáš, and Straňák, Pavel
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: Parliament of the Czech Republic, Chamber of Deputies, stenographic protocols, TEI encoding, and TEITOK
Language:: Czech
Description:: The ParCzech PS7 1.0 corpus is the very first member of the corpus family of data coming from the Parliament of the Czech Republic. ParCzech PS7 1.0 consists of stenographic protocols that record the Chamber of Deputies' meetings held in the 7th term between 2013-2017. The audio recordings are available as well. Transcripts are provided in the original HTML as harvested, and also converted into TEI-derived XML format for use in TEITOK corpus manager. The corpus is automatically enriched with the morphological and named-entity annotations using the procedures MorphoDita and NameTag.
Rights:: Public Domain Dedication (CC Zero), http://creativecommons.org/publicdomain/zero/1.0/, and PUB

1. Czech Court Decisions Dataset

2. Czech Legal Text Treebank

3. Czech Legal Text Treebank 2.0

4. Czesl - Universal Dependencies Release 0.5

5. Information extraction from EIA documents

6. KUK 0.0

7. Lingvistika na Matematicko-fyzikální fakultě?

8. Migrant Stories

9. ParCzech 3.0

10. ParCzech PS7 1.0

Limit your search

Show values starting with

Show values starting with

Show values starting with

Show values starting with

Show values starting with

Search

Search Constraints

Search Results

Limit your search

Contributor

Show values starting with

Creator

Show values starting with

Language

Show values starting with

Publisher

Rights

Show values starting with

Subject

Show values starting with

Type

Date

Original context has metadata only

Harvested from