CoNLL 2017 and 2018 shared tasks:
Multilingual Parsing from Raw Text to Universal Dependencies
This package contains the test data in the form in which they were presented
to the participating systems: raw text files and files preprocessed by UDPipe.
The metadata.json files contain lists of files to process and to output;
README files in the respective folders describe the syntax of metadata.json.
For full training, development and gold standard test data, see
Universal Dependencies 2.0 (CoNLL 2017)
Universal Dependencies 2.2 (CoNLL 2018)
See the download links at http://universaldependencies.org/.
For more information on the shared tasks, see
http://universaldependencies.org/conll17/
http://universaldependencies.org/conll18/
Contents:
conll17-ud-test-2017-05-09 ... CoNLL 2017 test data
conll18-ud-test-2018-05-06 ... CoNLL 2018 test data
conll18-ud-test-2018-05-06-for-conll17 ... CoNLL 2018 test data with metadata
and filenames modified so that it is digestible by the 2017 systems.
Baseline UDPipe models for CoNLL 2017 Shared Task in UD Parsing, and supplementary material.
The models require UDPipe version 1.1 or later and are evaluated using the official evaluation script.
The models are trained on a slightly different split of the official UD 2.0 CoNLL 2017 training data, the so-called baselinemodel split, in order to allow comparison of models even during the shared task. This baselinemodel split of UD 2.0 CoNLL 2017 training data is available for download.
Furthermore, we also provide UD 2.0 CoNLL 2017 training data with automatically predicted morphology. We utilize the baseline models on development data and perform 10-fold jack-knifing (each fold is predicted with a model trained on the rest of the folds) on the training data.
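The jack-knifing procedure described above can be sketched as follows. This is only an illustration, not the script used to produce the released data: the function name jackknife, the train and predict callbacks, and the interleaved assignment of sentences to folds are all assumptions made for the example.

```python
from typing import Callable, List, Sequence, TypeVar

S = TypeVar("S")  # an input sentence, in any representation
M = TypeVar("M")  # a trained model
P = TypeVar("P")  # a sentence with predicted annotation


def jackknife(
    sentences: Sequence[S],
    train: Callable[[List[S]], M],
    predict: Callable[[M, S], P],
    folds: int = 10,
) -> List[P]:
    """Annotate every sentence with a model that never saw it in training.

    `train` and `predict` are hypothetical callbacks supplied by the caller.
    Sentences are assigned to folds in an interleaved way (an assumption;
    contiguous blocks would work equally well).
    """
    predicted: List[P] = [None] * len(sentences)  # type: ignore[list-item]
    for k in range(folds):
        held_out = set(range(k, len(sentences), folds))
        # Train on all folds except fold k.
        rest = [s for i, s in enumerate(sentences) if i not in held_out]
        model = train(rest)
        # Predict fold k with the model trained on the other folds.
        for i in sorted(held_out):
            predicted[i] = predict(model, sentences[i])
    return predicted
```

The result keeps the original sentence order, so the predicted annotation can be written back alongside the training data.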
Finally, we supply all required data and hyperparameter values needed to replicate the baseline models.
Baseline UDPipe models for CoNLL 2018 Shared Task in UD Parsing, and supplementary material.
The models require UDPipe version 1.2 or later and are evaluated using the official evaluation script. The models were trained using a custom data split for treebanks where no development data is provided. Also, we trained an additional "Mixed" model, which uses 200 sentences from every training set. All information needed to replicate the model training (hyperparameters, modified train-dev split, and pre-computed word embeddings for the parser) is included in the archive.
Additionally, we provide UD 2.2 CoNLL 2018 training data with automatically predicted morphology. We utilize the baseline models on development data and perform 10-fold jack-knifing (each fold is predicted with a model trained on the rest of the folds) on the training data.
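As a rough illustration of how training data for such a "Mixed" model could be assembled, the sketch below takes a fixed number of sentences from each treebank's CoNLL-U training file. The description does not say which 200 sentences were selected, so taking the first ones is an assumption, and the helper names read_conllu_sentences and build_mixed are hypothetical.

```python
from typing import Dict, List


def read_conllu_sentences(text: str) -> List[str]:
    """Split CoNLL-U text into sentence blocks separated by blank lines."""
    blocks: List[str] = []
    current: List[str] = []
    for line in text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            blocks.append("\n".join(current))
            current = []
    if current:  # the last sentence may lack a trailing blank line
        blocks.append("\n".join(current))
    return blocks


def build_mixed(treebanks: Dict[str, str], per_treebank: int = 200) -> str:
    """Concatenate the first `per_treebank` sentences of every treebank.

    `treebanks` maps a treebank name to the contents of its training file.
    Taking the first sentences is an assumption made for this sketch.
    """
    mixed: List[str] = []
    for name in sorted(treebanks):
        mixed.extend(read_conllu_sentences(treebanks[name])[:per_treebank])
    # CoNLL-U requires a blank line after every sentence.
    return "".join(block + "\n\n" for block in mixed)
```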
Conventional methods to preserve adult nematodes for taxonomic purposes involve the use of fixative or clearing solutions (alcohol, formaldehyde, AFA and lactophenol), which cause morphological alterations and are toxic. The aim of this study is to propose an alternative method based on glycerol-cryopreservation of nematodes for their subsequent identification. Adults of trichostrongylid nematodes from the abomasum of roe deer (Capreolus capreolus Linnaeus) were glycerol-cryopreserved and compared with those fixed in formaldehyde, fresh, and frozen without cryoprotectants. Morphology, transparency and elasticity of the anterior and posterior portion of male nematodes were compared, especially the caudal cuticular bursa and genital accessories. The method presented is quick and easy to use, and the quality of nematode specimens is better than that of nematodes fixed by previously used fixatives. Moreover, glycerol-cryopreserved nematodes can be stored for a long time at -20°C in perfect condition and they could be suitable for further analyses, such as histological or ultrastructural examinations.
Czech OOV Inflection Dataset is a Czech inflection dataset of nouns, focused on evaluation in out-of-vocabulary (OOV) conditions. It consists of two parts: a standard lemma-disjoint train-dev-test split of a subset of noun paradigms from the existing morphological dictionary Czech MorfFlex 2.0 (files train, dev and test-MorfFlex); and a small set of neologisms from Čeština 2.0, annotated for inflected forms (file test-neologisms).
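A lemma-disjoint split means that all inflected forms of a given lemma land in the same part, so no test lemma is ever seen in training. The sketch below shows one way to build such a split; it is not the procedure used to create the dataset. The (lemma, form, tag) entry layout, the split proportions, and the function name lemma_disjoint_split are assumptions made for the example.

```python
import random
from typing import Dict, List, Tuple

Entry = Tuple[str, str, str]  # (lemma, inflected form, morphological tag)


def lemma_disjoint_split(
    entries: List[Entry],
    dev_frac: float = 0.1,
    test_frac: float = 0.1,
    seed: int = 0,
) -> Dict[str, List[Entry]]:
    """Split entries so that each lemma occurs in exactly one part."""
    # Shuffle the lemmas (not the entries) so whole paradigms stay together.
    lemmas = sorted({lemma for lemma, _form, _tag in entries})
    random.Random(seed).shuffle(lemmas)

    n_dev = int(len(lemmas) * dev_frac)
    n_test = int(len(lemmas) * test_frac)
    dev_lemmas = set(lemmas[:n_dev])
    test_lemmas = set(lemmas[n_dev:n_dev + n_test])

    split: Dict[str, List[Entry]] = {"train": [], "dev": [], "test": []}
    for entry in entries:
        lemma = entry[0]
        if lemma in dev_lemmas:
            split["dev"].append(entry)
        elif lemma in test_lemmas:
            split["test"].append(entry)
        else:
            split["train"].append(entry)
    return split
```

Splitting by lemma rather than by individual form is what makes the test part measure out-of-vocabulary behaviour: a system cannot succeed by memorising other forms of the same paradigm.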
Morphology of the nematode Viguiera dicrurusi Gupta, 1960 harboured by Dicrurus macrocercus albirictus (Hodgson) (Passeriformes: Dicruridae) from Baruipara in 24-Pargonas (South) district, West Bengal, India was studied by light and scanning electron microscopy (SEM). This represents the first study of V. dicrurusi using SEM. Scanning electron micrographs provided detailed information about the nature of pseudolabial plates, number and shape of teeth, dentate nature of striae, and the relative position of vulva, anus and phasmid opening in the female. A detailed morphometrical comparison of this species with Viguiera viduae Chabaud, 1960 described from Dicrurus forficatus from Madagascar indicates that V. viduae is a junior synonym of V. dicrurusi. Two other species, Viguiera bhujangai Jehan, 1972 and Viguiera adsimilisai Sood et Kalia, 1978 are considered species inquirendae.
Demodex neomydis sp. n. from the Mediterranean water shrew, Neomys anomalus, is described as a new species in all developmental stages. This demodecid is classified as a member of the genus Demodex Owen, 1843, but shows several morphological characters that were described in Soricidex dimorphus Bukva, 1982 and are absent or very infrequent in other known Demodex species, viz., in the adult stage, a pair of shelf-like lamellae on the dorsum of the podosoma, dorso-lateral extension of the podosoma over the basal part of the gnathosoma, multiple opisthosomal organ in the male, and podosomal position of the vulva in the female. Immature stages of D. neomydis have an unusually inflated idiosoma and a dorsad deflected gnathosoma. All developmental stages of D. neomydis were found in the lumen of the hair follicles on the host’s muzzle, causing no gross pathological response. At the histological level, the main pathological change was distension of infested hair follicles by accumulations of up to a dozen mites, which appear to feed on the epithelial cells of the hair follicle walls.
Parasitological examination of faeces of 26 snakes kept in Bio-Ken Snake Farm, Watamu, Kenya revealed a new species of Eimeria Schneider, 1875 in Telescopus semiannulatus Smith, 1849. Oocysts of Eimeria arabukosokokensis sp. n. are cylindrical, 26.8 (25-29) × 15.1 (14-16) µm, with a smooth, bilayered oocyst wall and a single polar granule. The broadly ellipsoidal sporocysts average 9.3 (8.5-10) × 7.1 (6.5-7.5) µm and possess a single-layered wall composed of two plates joined by a longitudinal suture. Caryospora cf. regentensis Daszak et Ball, 2001 is reported from Dendroaspis angusticeps (Smith, 1849) and two additional forms of Caryospora Léger, 1904 are reported and morphologically characterised from a single specimen of Psammophis orientalis Broadley, 1977. The systematic status of Caryospora spp. in sub-Saharan Psammophis Boie, 1827 is discussed, and it is suggested that all species reported by various authors to date be treated as species inquirendae until more detailed data on these parasites and their hosts are available.
First and third instar larvae of Aepopsis robini (Laboulbène, 1849) are studied, redescribed, and illustrated. The larvae are characterised by three unique and likely autapomorphic character states within known members of the supertribe Trechitae: (1) apex of antennomere 4 has only one conical sensillum; (2) setae FR10 and FR11 on frontale are removed basally on dorsal surface from the apical margin; (3) terga of meso- and metathorax lack pore MEa, and abdominal terga 1-8 lack pore TEa.