« Previous |
1 - 10 of 14
|
Next »
Number of results to display per page
Search Results
2. CoDipA UNSC 1.0
- Creator:
- Anisimova, Mariia and Zikánová, Šárka
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- Appraisal theory, diplomatic discourse, Corpus Linguisitics, and CoDipA UNSC
- Language:
- English
- Description:
- CoDipA UNSC 1.0, or a Corpus of Diplomatic Attitudes of the United Nations Security Council is a language resource manually annotated with the attitude-part of Appraisal theory. The speeches were selected according to topic-related and temporal criteria, and are representative of 5 major international military conflicts that have occurred between 1995 and 2020. The texts were annotated according to the predefined annotation scenario, which is based on the original Appraisal theory and later available commentaries on specificity of its implementation. The annotated texts are available in JSON Lines format. The corpus also contains double annotations of the 8 selected speeches.
- Rights:
- Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
3. Czech RST Discourse Treebank 1.0
- Creator:
- Poláková, Lucie, Zikánová, Šárka, Mírovský, Jiří, and Hajičová, Eva
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- discourse, discourse annotation, and annotated corpus
- Language:
- Czech
- Description:
- The Czech RST Discourse Treebank 1.0 (CzRST-DT 1.0) is a dataset of 54 Czech journalistic texts manually annotated using the Rhetorical Structure Theory (RST). Each text document in the treebank is represented as a single tree-like structure, the nodes (discourse units) are interconnected through hierarchical rhetorical relations. The dataset also contains concurrent annotations of five double-annotated documents. The original texts are a part of the data annotated in the Prague Dependency Treebank, although the two projects are independent.
- Rights:
- Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
4. Enriched Discourse Annotation of PDiT Subset 1.0 (PDiT-EDA 1.0)
- Creator:
- Zikánová, Šárka, Synková, Pavlína, and Mírovský, Jiří
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- discourse annotation and implicit discourse relations
- Language:
- Czech
- Description:
- Enriched discourse annotation of a subset of the Prague Discourse Treebank, adding implicit relations, entity based relations, question-answer relations and other discourse structuring phenomena.
- Rights:
- Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
5. Pojetí národnostní v Havlíčkových "Obrazech z Rus" /
- Creator:
- Zikánová, Šárka
- Subject:
- Havlíček Borovský, Karel,, pojetí národnostní, literatura česká, vztahy polsko-ruské, světové dějiny 1789-1918, Rusko, SSSR, národnosti, vztahy mezi národnostmi a národní hnutí, české země 1792-1847, and literatura, spisovatelé
- Language:
- Czech
- Description:
- Conception of nationalities in Havlíček's "Pictures from Russia".
- Rights:
- unknown
6. Postavení slovesného přísudku ve starší češtině (1500-1620) /
- Creator:
- Zikánová, Šárka
- Publisher:
- Karolinum,
- Subject:
- lingvistika historická, slovosled, jazyk český, české země 1471-1526, české země 1526-1620, and jazyk, písmo
- Language:
- Czech
- Rights:
- unknown
7. Prague Dependency Treebank - Consolidated 1.0 (PDT-C 1.0)
- Creator:
- Hajič, Jan, Bejček, Eduard, Bémová, Alevtina, Buráňová, Eva, Fučíková, Eva, Hajičová, Eva, Havelka, Jiří, Hlaváčová, Jaroslava, Homola, Petr, Ircing, Pavel, Kárník, Jiří, Kettnerová, Václava, Klyueva, Natalia, Kolářová, Veronika, Kučová, Lucie, Lopatková, Markéta, Mareček, David, Mikulová, Marie, Mírovský, Jiří, Nedoluzhko, Anna, Novák, Michal, Pajas, Petr, Panevová, Jarmila, Peterek, Nino, Poláková, Lucie, Popel, Martin, Popelka, Jan, Romportl, Jan, Rysová, Magdaléna, Semecký, Jiří, Sgall, Petr, Spoustová, Johanka, Straka, Milan, Straňák, Pavel, Synková, Pavlína, Ševčíková, Magda, Šindlerová, Jana, Štěpánek, Jan, Štěpánková, Barbora, Toman, Josef, Urešová, Zdeňka, Vidová Hladká, Barbora, Zeman, Daniel, Zikánová, Šárka, and Žabokrtský, Zdeněk
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- treebank, dependency, tectogrammatics, topic-focus articulation, multiword expressions, coreference, bridging relations, discourse, morphology, syntax, tokenization, lemmatization, semantic relations, lexical semantics, lexicon, valency, speech reconstruction, clauses, speech recognition, and spoken corpus
- Language:
- Czech
- Description:
- A richly annotated and genre-diversified language resource, The Prague Dependency Treebank – Consolidated 1.0 (PDT-C 1.0, or PDT-C in short in the sequel) is a consolidated release of the existing PDT-corpora of Czech data, uniformly annotated using the standard PDT scheme. PDT-corpora included in PDT-C: Prague Dependency Treebank (the original PDT contents, written newspaper and journal texts from three genres); Czech part of Prague Czech-English Dependency Treebank (translated financial texts, from English), Prague Dependency Treebank of Spoken Czech (spoken data, including audio and transcripts and multiple speech reconstruction annotation); PDT-Faust (user-generated texts). The difference from the separately published original treebanks can be briefly described as follows: it is published in one package, to allow easier data handling for all the datasets; the data is enhanced with a manual linguistic annotation at the morphological layer and new version of morphological dictionary is enclosed; a common valency lexicon for all four original parts is enclosed. Documentation provides two browsing and editing desktop tools (TrEd and MEd) and the corpus is also available online for searching using PML-TQ.
- Rights:
- Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
8. Prague Dependency Treebank 3.0
- Creator:
- Bejček, Eduard, Hajičová, Eva, Hajič, Jan, Jínová, Pavlína, Kettnerová, Václava, Kolářová, Veronika, Mikulová, Marie, Mírovský, Jiří, Nedoluzhko, Anna, Panevová, Jarmila, Poláková, Lucie, Ševčíková, Magda, Štěpánek, Jan, and Zikánová, Šárka
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- treebank, dependency, tectogrammatics, topic-focus articulation, multiword expressions, coreference, bridging relations, discourse, and PDT
- Language:
- Czech
- Description:
- PDT 3.0 is a new version of Prague Dependency Treebank. It contains a large amount of Czech texts with complex and interlinked morphological (2 million words), syntactic (1.5 MW) and semantic annotation (0.8 MW); in addition, certain properties of sentence information structure, multiword expressions, coreference, bridging relations and discourse relations are annotated at the semantic level. and the Grant Agency of the Czech Republic: grants P406/12/0658 "Coreference, discourse relations and information structure in a contrastive perspective", P406/2010/0875 "Computational Linguistics: Explicit description of language and annotated data focused on Czech", 405/09/0729 "From the structure of a sentence to textual relationships", and GPP406/12/P175 (Selected derivational relations for automatic processing of Czech); the Ministry of Education, Youth and Sports of the Czech Republic: the KONTAKT project ME10018 "Towards a computational analysis of text structure" and the LINDAT-Clarin project LM2010013; the Grant Agency of Charles University in Prague: GAUK 103609 "Textual (Inter-sentential) Relations and their Representation in a Language Corpus" and GAUK 4383/2009 "Methods of coreference resolution".
- Rights:
- Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0), http://creativecommons.org/licenses/by-nc-sa/3.0/, and PUB
9. Prague Dependency Treebank 3.5
- Creator:
- Hajič, Jan, Bejček, Eduard, Bémová, Alevtina, Buráňová, Eva, Hajičová, Eva, Havelka, Jiří, Homola, Petr, Kárník, Jiří, Kettnerová, Václava, Klyueva, Natalia, Kolářová, Veronika, Kučová, Lucie, Lopatková, Markéta, Mikulová, Marie, Mírovský, Jiří, Nedoluzhko, Anna, Pajas, Petr, Panevová, Jarmila, Poláková, Lucie, Rysová, Magdaléna, Sgall, Petr, Spoustová, Johanka, Straňák, Pavel, Synková, Pavlína, Ševčíková, Magda, Štěpánek, Jan, Urešová, Zdeňka, Vidová Hladká, Barbora, Zeman, Daniel, Zikánová, Šárka, and Žabokrtský, Zdeněk
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- treebank, dependency, tectogrammatics, topic-focus articulation, multiword expressions, coreference, bridging relations, discourse, morphology, syntax, tokenization, lemmatization, clauses, semantics, semantic relations, lexical semantics, and lexicon
- Language:
- Czech
- Description:
- The Prague Dependency Treebank 3.5 is the 2018 edition of the core Prague Dependency Treebank (PDT). It contains all PDT annotation made at the Institute of Formal and Applied Linguistics under various projects between 1996 and 2018 on the original texts, i.e., all annotation from PDT 1.0, PDT 2.0, PDT 2.5, PDT 3.0, PDiT 1.0 and PDiT 2.0, plus corrections, new structure of basic documentation and new list of authors covering all previous editions. The Prague Dependency Treebank 3.5 (PDT 3.5) contains the same texts as the previous versions since 2.0; there are 49,431 annotated sentences (832,823 words) on all layers, from tectogrammatical annotation to syntax to morphology. There are additional annotated sentences for syntax and morphology; the totals for the lower layers of annotation are: 87,913 sentences with 1,502,976 words at the analytical layer (surface dependency syntax) and 115,844 sentences with 1,956,693 words at the morphological layer of annotation (these totals include the annotation with the higher layers annotated as well). Closely linked to the tectogrammatical layer is the annotation of sentence information structure, multiword expressions, coreference, bridging relations and discourse relations.
- Rights:
- Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
10. Prague Discourse Treebank 1.0
- Creator:
- Poláková, Lucie, Jínová, Pavlína, Zikánová, Šárka, Hajičová, Eva, Mírovský, Jiří, Nedoluzhko, Anna, Rysová, Magdaléna, Pavlíková, Veronika, Zdeňková, Jana, Pergler, Jiří, and Ocelák, Radek
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- discourse, treebank, and annotation
- Language:
- Czech
- Description:
- Annotation of discourse relations is a project related to the Prague Dependency Treebank 2.5. It represents a new manually annotated layer of language description, above the existing layers of the PDT, and it portrays linguistic phenomena from the perspective of discourse structure and coherence. and GACR P406/12/0658, GACR P406/2010/0875, GACR 405/09/0729, Ministry of Education ME10018, Ministry of Education LM2010013
- Rights:
- Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0), http://creativecommons.org/licenses/by-nc-sa/3.0/, and PUB