Please use the following text to cite this item:
Picek, Lukáš; Bolon, Isabelle; Durso, Andrew M. and Castañeda, Rafael Ruiz de, 2021, SnakeCLEF 2021, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), http://hdl.handle.net/20.500.12800/1-4773.
dc.contributor.author: Picek, Lukáš
dc.contributor.author: Bolon, Isabelle
dc.contributor.author: Durso, Andrew M.
dc.contributor.author: Castañeda, Rafael Ruiz de
dc.date.accessioned: 2022-06-21T12:11:15Z
dc.date.available: 2022-06-21T12:11:15Z
dc.date.issued: 2021-01-01
dc.description: The dataset contains 409,679 images of 772 snake species from 188 countries and all continents (386,006 labeled images for development and 23,673 unlabeled images for testing). In addition, we provide a simple train/val (90% / 10%) split for validating preliminary results while preserving the species distribution. Furthermore, we prepared a compact subset (70,208 images) for fast prototyping. The test set consists of 23,673 images submitted to the iNaturalist platform within the first four months of 2021. All data were gathered from online biodiversity platforms (i.e., iNaturalist, HerpMapper) and further extended by data scraped from Flickr. The dataset has a heavily long-tailed class distribution: the most frequent species (Thamnophis sirtalis) is represented by 22,163 images and the least frequent (Achalinus formosanus) by just 10.
dc.identifier.uri: http://hdl.handle.net/20.500.12800/1-4773
dc.language.iso: zxx
dc.publisher: CEUR Workshop Proceedings (CEUR-WS.org)
dc.relation.isreferencedby: https://dspace5.zcu.cz/bitstream/11025/47274/1/paper-125.pdf
dc.rights: BSD 3-Clause "New" or "Revised" license
dc.rights.label: PUB
dc.rights.uri: http://opensource.org/licenses/BSD-3-Clause
dc.subject: LifeCLEF
dc.subject: SnakeCLEF
dc.subject: global health
dc.subject: epidemiology
dc.subject: snake bite
dc.subject: snake
dc.subject: reptile
dc.subject: benchmark
dc.subject: biodiversity
dc.subject: machine learning
dc.subject: computer vision
dc.subject: Classification
dc.title: SnakeCLEF 2021
dc.type: corpus
edm.type: IMAGE
local.branding: LINDAT / CLARIAH-CZ
local.contact.person: Lukáš Picek lukaspicek@gmail.com University of West Bohemia, Department of Cybernetics
local.dataProvider: iNaturalist, HerpMapper, Flickr
local.files.count: 6
local.files.size: 65398371213
local.has.files: yes
local.language.name: No linguistic content
local.sponsor: nationalFunds LM2018101 Ministry of Education, Youth and Sports of the Czech Republic LINDAT/CLARIAH-CZ: Digital Research Infrastructure for Language Technologies, Arts and Humanities
local.sponsor: Other University of West Bohemia SGS-2019-02 Student Grant Competition
local.sponsor: nationalFunds QS04-20 Universitaires de Genève Fondation privée des Hôpitaux Universitaires de Genève
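
The description mentions a 90% / 10% train/val split that keeps the same species distribution in both parts. The following is a minimal sketch of one way to build such a stratified split, not necessarily the procedure used to produce the provided split. The function name `stratified_split` and its parameters are hypothetical, and the labels are passed in as a plain list because the column names of `train_metadata.csv` are not stated here.

```python
import random
from collections import defaultdict


def stratified_split(labels, val_fraction=0.1, seed=0):
    """Split sample indices into train/val while keeping per-class proportions.

    `labels` is a sequence with one class label per sample (hypothetical input;
    in practice it would be read from the metadata file).
    Returns two sorted lists of indices: (train_indices, val_indices).
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, val = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Keep at least one validation sample for every class with 2+ samples,
        # so rare species in the long tail are still represented.
        if len(indices) > 1:
            n_val = max(1, round(len(indices) * val_fraction))
        else:
            n_val = 0
        val.extend(indices[:n_val])
        train.extend(indices[n_val:])

    return sorted(train), sorted(val)
```

With a class of 10 images (the size of the rarest species in this dataset), this sketch places one image in the validation part and nine in the training part.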
This item is publicly available and licensed under: BSD 3-Clause "New" or "Revised" license
Files in this item

Name: ViT_PreProcessing-ops11-preprocessing-int-dynam_graph.onnx
Size: 1.13 GB
Format: application/octet-stream
Description: Unknown
MD5: 838ae0f8ed19ffeee89aba5fa8b50956

Name: training_data.tar.gz
Size: 59.69 GB
Format: application/x-gzip
Description: gzip Archive
MD5: a39f0a73e6f35b4222e1aa1878ef20ca

Name: min-train_metadata.csv
Size: 13.1 MB
Format: application/octet-stream
Description: Unknown
MD5: 52ec43c8595a3835da6aa16364aff75e

Name: train_metadata.csv
Size: 73.24 MB
Format: application/octet-stream
Description: Unknown
MD5: f981dd4289cac2ea05f472b584668beb

Name: BaseLine-EfficientNet-B0-224.ipynb
Size: 29.32 KB
Format: application/octet-stream
Description: Unknown
MD5: 1fff5bc5ef4c896d6b0ab12cb2c9a37f

Name: species_to_country_mapping.csv
Size: 1.48 MB
Format: application/octet-stream
Description: Unknown
MD5: 7f45509480abef9df3f69e6685dc24ed
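
The MD5 values listed for each file can be used to check that a download completed without corruption, which matters most for the 59.69 GB archive. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `verify_downloads` are hypothetical, and the sketch assumes the files were saved under their listed names in a single local directory.

```python
import hashlib
from pathlib import Path

# File names and MD5 checksums copied from the file listing.
EXPECTED_MD5 = {
    "ViT_PreProcessing-ops11-preprocessing-int-dynam_graph.onnx": "838ae0f8ed19ffeee89aba5fa8b50956",
    "training_data.tar.gz": "a39f0a73e6f35b4222e1aa1878ef20ca",
    "min-train_metadata.csv": "52ec43c8595a3835da6aa16364aff75e",
    "train_metadata.csv": "f981dd4289cac2ea05f472b584668beb",
    "BaseLine-EfficientNet-B0-224.ipynb": "1fff5bc5ef4c896d6b0ab12cb2c9a37f",
    "species_to_country_mapping.csv": "7f45509480abef9df3f69e6685dc24ed",
}


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks
    so that very large archives do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory):
    """Compare each expected file in `directory` (assumed download location)
    against its listed checksum. Returns {file name: "ok" | "mismatch" | "missing"}."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        elif md5_of_file(path) == expected:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results
```

Calling `verify_downloads(".")` from the download directory returns one status per listed file; any `"mismatch"` entry suggests the file should be downloaded again.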