Search Results
2. A Human-Annotated Dataset for Language Modeling and Named Entity Recognition in Medieval Documents (2023-01-05)
- Creator:
- Novotný, Vít, Luger, Kristýna, Štefánik, Michal, Vrabcová, Tereza, and Horák, Aleš
- Publisher:
- Masaryk University, Brno
- Type:
- text and corpus
- Subject:
- NER, named entity recognition, and Medieval
- Language:
- Czech, English, German, and Latin
- Description:
- This is an open dataset of sentences from 19th and 20th century letterpress reprints of documents from the Hussite era. The dataset contains a corpus for language modeling and human annotations for named entity recognition (NER).
- Rights:
- Public Domain Dedication (CC Zero), http://creativecommons.org/publicdomain/zero/1.0/, and PUB
3. A Human-Annotated Dataset of Scanned Images and OCR Texts from Medieval Documents
- Creator:
- Novotný, Vít, Seidlová, Kristýna, Vrabcová, Tereza, and Horák, Aleš
- Publisher:
- Masaryk University, Brno
- Type:
- image and corpus
- Subject:
- ocr, optical character recognition, language identification, image super-resolution, sr, and Medieval
- Language:
- German, Czech, Latin, and English
- Description:
- This is an open dataset of scanned images and OCR texts from 19th and 20th century letterpress reprints of documents from the Hussite era. The dataset contains human annotations for layout analysis, OCR evaluation, and language identification.
- Rights:
- Public Domain Dedication (CC Zero), http://creativecommons.org/publicdomain/zero/1.0/, and PUB
4. A Human-Annotated Dataset of Scanned Images and OCR Texts from Medieval Documents: Supplementary Materials
- Creator:
- Novotný, Vít and Horák, Aleš
- Publisher:
- Masaryk University, Brno
- Type:
- text and corpus
- Subject:
- ocr, optical character recognition, language identification, image super-resolution, sr, and Medieval
- Language:
- Czech, English, German, and Latin
- Description:
- These are supplementary materials for an open dataset of scanned images and OCR texts from 19th and 20th century letterpress reprints of documents from the Hussite era. The dataset contains human annotations for layout analysis, OCR evaluation, and language identification and is available at http://hdl.handle.net/11234/1-4615. These supplementary materials contain OCR texts from different OCR engines for book pages for which we have both high-resolution scanned images and annotations for OCR evaluation.
- Rights:
- Public Domain Dedication (CC Zero), http://creativecommons.org/publicdomain/zero/1.0/, and PUB
5. Hybridization between three crested newt species (Triturus cristatus superspecies) in the Czech Republic and Slovakia: comparison of nuclear markers and mitochondrial DNA
- Creator:
- Mikulíček, Peter, Horák, Aleš, Zavadil, Vít, Kautman, Ján, and Piálek, Jaroslav
- Type:
- article, model:article, and TEXT
- Subject:
- hybrid zone, introgression, mtDNA, microsatellites, RAPD, and Salamandridae
- Language:
- English
- Description:
- Crested newts (Triturus cristatus superspecies) are a group of closely related species with parapatric distributions that are likely to interbreed where their ranges meet. Coexistence of three species of the complex (Triturus cristatus, T. dobrogicus and T. carnifex) has been recently confirmed in central Europe. In this study we aim to elucidate the distribution of crested newts in contact zones in the Czech Republic and Slovakia, and determine the extent of hybridization and introgression using nuclear (microsatellites and Randomly Amplified Polymorphic DNA, RAPD) and mitochondrial DNA (mtDNA) markers. Nuclear markers reveal hybrid zones between T. cristatus and T. dobrogicus at the foothills of the Carpathians in southern Slovakia, and between T. cristatus and T. carnifex in the southern parts of the Czech Republic. Analysis of mitochondrial cytochrome b sequences reveals T. cristatus and T. dobrogicus-specific haplotypes in contact zones in southern Slovakia. Surprisingly, most T. carnifex and individuals with mixed ancestry between T. carnifex and T. cristatus possess haplotypes specific for T. dobrogicus, most likely as a result of historical mtDNA introgression. Only one T. carnifex-specific haplotype carried by a single specimen is found in the Czech Republic. Our study shows that genetic structure of central European populations of crested newts is complex and influenced by historical and contemporary hybridization.
- Rights:
- http://creativecommons.org/licenses/by-nc-sa/4.0/ and policy:public
6. Phylogenetic relationships of some spirurine nematodes (Nematoda: Chromadorea: Rhabditida: Spirurina) parasitic in fishes inferred from SSU rRNA gene sequences
- Creator:
- Černotíková, Eva, Horák, Aleš, and Moravec, František
- Format:
- unmediated and volume
- Type:
- model:article and TEXT
- Subject:
- Nematoda, Spirurina, SSU rRNA, phylogeny, and taxonomy
- Language:
- English
- Description:
- Small subunit rRNA sequences were obtained from 38 representatives mainly of the nematode orders Spirurida (Camallanidae, Cystidicolidae, Daniconematidae, Philometridae, Physalopteridae, Rhabdochonidae, Skrjabillanidae) and, in part, Ascaridida (Anisakidae, Cucullanidae, Quimperiidae). The examined nematodes are predominantly parasites of fishes. Their analyses provided well-supported trees allowing the study of phylogenetic relationships among some spirurine nematodes. The present results support the placement of Cucullanidae at the base of the suborder Spirurina and, based on the position of the genus Philonema (subfamily Philoneminae) forming a sister group to Skrjabillanidae (thus Philoneminae should be elevated to Philonemidae), the paraphyly of the Philometridae. Comparison of a large number of sequences of representatives of the latter family supports the paraphyly of the genera Philometra, Philometroides and Dentiphilometra. The validity of the newly included genera Afrophilometra and Caranginema is not supported. These results indicate geographical isolation has not been the cause of speciation in this parasite group and no coevolution with fish hosts is apparent. On the contrary, the group of South-American species of Alinema, Nilonema and Rumai is placed in an independent branch, thus markedly separated from other family members. Molecular data indicate that the skrjabillanid subfamily Esocineminae (represented by Esocinema bohemicum) should be either elevated to the rank of an independent family or Daniconematidae (Mexiconema africanum) should be decreased to Daniconematinae and transferred to the family Skrjabillanidae. Camallanid genera Camallanus and Procamallanus, as well as the subgenera Procamallanus and Spirocamallanus are confirmed to be paraphyletic. Paraphyly has also been found within Filarioidea, Habronematoidea and Thelazioidea and in Cystidicolidae, Physalopteridae and Thelaziidae. 
The results of the analyses also show that Neoascarophis, Spinitectus and Rhabdochona are monophyletic, in contrast to the paraphyletic genus Ascarophis. They further confirm the independence of two subgenera, Rhabdochona and Globochona, in the genus Rhabdochona. The necessity of further studies of fish-parasitizing representatives of additional nematode families not yet studied by molecular methods, such as Guyanemidae, Lucionematidae or Tetanonematidae, is underscored.
- Rights:
- http://creativecommons.org/licenses/by-nc-sa/4.0/ and policy:public
7. Research on cooling efficiencies of water, emulsions and oil
- Creator:
- Horák, Aleš, Raudenský, Miroslav, Pohanka, Michal, Bellerová, Hana, and Reichardt, Tilo
- Format:
- unmediated and volume
- Type:
- model:article and TEXT
- Subject:
- spray cooling, water cooling, emulsion cooling, oil cooling, and experimental
- Language:
- English
- Description:
- There are many areas in the steel and metallurgy industry where pure water cannot be used as a coolant. Lubrication and corrosion are the two main reasons why spray cooling has to use different cooling liquids. A typical example is cold rolling of steel, where emulsions are used, or rolling of some non-ferrous metals, where pure oils are used. Other metallurgical processes use water polluted by oil or containing mineral salts. The spray cooling efficiency of these coolants is different from the cooling efficiency of pure water. This paper describes research comparing spray cooling by pure water to cooling using water-based oil emulsions of different concentrations, cooling using oil, and cooling using polluted water. The comparison was made by measuring the cooling efficiency, characterised by the heat transfer coefficient, at identical pressure. Includes a list of references.
- Rights:
- http://creativecommons.org/licenses/by-nc-sa/4.0/ and policy:public
8. SQAD
- Creator:
- Medveď, Marek and Horák, Aleš
- Publisher:
- Masaryk University, NLP Centre
- Type:
- text and corpus
- Subject:
- question answering, Simple Question Answering Database, and SQAD
- Language:
- Czech
- Description:
- The SQAD database consists of 3301 records obtained from Czech Wikipedia articles. Each record has the following structure: the original sentence(s) from Wikipedia; a question that is directly answered in the text; the expected answer to the question as it appears in the original text; the URL of the Wikipedia web page from which the original text was extracted; and the name of the author of this SQAD record.
- Rights:
- GNU General Public Licence, version 3, http://opensource.org/licenses/GPL-3.0, and PUB
9. sqad 2.1
- Creator:
- Medveď, Marek, Horák, Aleš, and Kušniráková, Dáša
- Publisher:
- Natural Language Processing Centre, Faculty of Informatics, Masaryk University
- Type:
- text and corpus
- Subject:
- Czech, Simple Question Answering Database, and question answering
- Language:
- Czech
- Description:
- Simple question answering database version 2.1 (SQAD_v2.1), created from Czech Wikipedia. Each SQAD record consists of four files (in vertical form, with lemmatization and POS tagging) and two metadata files.
- Rights:
- GNU Library or "Lesser" General Public License 3.0 (LGPL-3.0), http://opensource.org/licenses/LGPL-3.0, and PUB
10. sqad 3.0
- Creator:
- Medveď, Marek and Horák, Aleš
- Publisher:
- Masaryk University, NLP Centre
- Type:
- text and corpus
- Subject:
- Simple Question Answering Database, Czech, and question answering
- Language:
- Czech
- Description:
- Simple question answering database version 3 (SQAD v3), created from Czech Wikipedia. The new version consists of 13477 records. Each SQAD record consists of multiple files: question, answer extraction, answer selection, URL, question metadata and, in some cases, answer context.
- Rights:
- GNU Library or "Lesser" General Public License 3.0 (LGPL-3.0), http://opensource.org/licenses/LGPL-3.0, and PUB