dc.contributor.author Barančíková, Petra
dc.contributor.author Bojar, Ondřej
dc.date.accessioned 2020-06-19T09:13:14Z
dc.date.available 2020-06-19T09:13:14Z
dc.date.issued 2020-06-15
dc.identifier.uri http://hdl.handle.net/11234/1-3248
dc.description Costra 1.1 is a new dataset for testing geometric properties of sentence embedding spaces. In particular, it concentrates on examining how well sentence embeddings capture complex phenomena such as paraphrases, tense, or generalization. The dataset is a direct expansion of Costra 1.0, which was extended with more sentences and sentence comparisons.
dc.language.iso ces
dc.publisher Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
dc.relation info:eu-repo/grantAgreement/EC/H2020/825303
dc.rights Creative Commons - Attribution 4.0 International (CC BY 4.0)
dc.rights.uri http://creativecommons.org/licenses/by/4.0/
dc.subject paraphrases
dc.subject sentence embeddings
dc.subject evaluation
dc.subject sentence
dc.title COSTRA 1.1: A Dataset of Complex Sentence Transformations and Comparisons
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
dc.rights.label PUB
has.files yes
branding LINDAT / CLARIAH-CZ
contact.person Petra Barančíková barancikova@ufal.mff.cuni.cz Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
sponsor European Union EC/H2020/825303 Bergamot - Browser-based Multilingual Translation euFunds info:eu-repo/grantAgreement/EC/H2020/825303
sponsor Czech Science Foundation 19-26934X Neural Representations in Multi-modal and Multi-lingual Modelling nationalFunds
size.info 6968 sentences
files.size 819686
files.count 2


Files in this item

Download all files in this item (800.47 KB)
Licence category: Publicly Available

Licence: Creative Commons - Attribution 4.0 International (CC BY 4.0)
Name: README
Size: 3.94 KB
Format: Unknown
Description: README
MD5: ec1d7ad7c25a11b40f9496433a632a3f

Name: data.tsv
Size: 796.54 KB
Format: Unknown
Description: data
MD5: e30cd60188074f3006eb5f976eddb993
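
The MD5 values listed for README and data.tsv can be used to check that a download is intact. The following is a minimal sketch, assuming both files have been saved to a local directory under the names shown in this record; the function names `md5_of_file` and `verify_downloads` are hypothetical helpers introduced only for illustration and are not part of the dataset.

```python
import hashlib
from pathlib import Path

# MD5 checksums as listed in this record.
EXPECTED_MD5 = {
    "README": "ec1d7ad7c25a11b40f9496433a632a3f",
    "data.tsv": "e30cd60188074f3006eb5f976eddb993",
}


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each listed file name to True (match), False (mismatch), or None (missing).

    Assumption: the files sit directly in `directory` under their listed names.
    """
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = (md5_of_file(path) == expected) if path.is_file() else None
    return results


if __name__ == "__main__":
    labels = {True: "OK", False: "MISMATCH", None: "not found"}
    for name, outcome in verify_downloads().items():
        print(f"{name}: {labels[outcome]}")
```

Run from the directory that holds the downloaded files, the script prints one status line per file. A mismatch usually points to an incomplete or altered download, in which case fetching the file again is the simplest remedy.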
