Creator: Spousta, Miroslav - LINDAT/CLARIAH-CZ Catalog Search Results

Creator:: Spoustová, Johanka and Spousta, Miroslav
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: corpus, Czech, and web
Language:: Czech
Description:: Web corpus of Czech, created in 2011. Contains newspapers and magazines, discussions, and blogs. See http://www.lrec-conf.org/proceedings/lrec2012/summaries/120.html for details. and GA405/09/0278
Rights:: Creative Commons - Attribution 3.0 Unported (CC BY 3.0), http://creativecommons.org/licenses/by/3.0/, and PUB

Creator:: Hajič, Jan, Pajas, Petr, Ircing, Pavel, Romportl, Jan, Peterek, Nino, Spousta, Miroslav, Mikulová, Marie, Grůber, Martin, and Legát, Milan
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) and University of West Bohemia
Type:: audio and corpus
Subject:: spoken corpus, speech recognition, and speech reconstruction
Language:: Czech
Description:: PDTSC 1.0 is a multi-purpose corpus of spoken language. 768,888 tokens, 73,374 sentences and 7,324 minutes of spontaneous dialog speech have been recorded, transcribed and edited in several interlinked layers: audio recordings, automatic and manual transcription and manually reconstructed text. PDTSC 1.0 is a delayed release of data annotated in 2012. It is an update of Prague Dependency Treebank of Spoken Language (PDTSL) 0.5 (published in 2009). In 2017, Prague Dependency Treebank of Spoken Czech (PDTSC) 2.0 was published as an update of PDTSC 1.0.
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB

Creator:: Spousta, Miroslav
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: toolService
Subject:: web page processing
Description:: Victoria is an online HTML web page annotation tool for selecting text on web pages. It can be used to mark important or interesting parts of web pages for further processing.
Rights:: GNU General Public License, version 2, http://www.gnu.org/licenses/gpl-2.0.html, and PUB
