dc.contributor.author Harašta, Jakub
dc.contributor.author Šavelka, Jaromír
dc.contributor.author Kasl, František
dc.contributor.author Míšek, Jakub
dc.date.accessioned 2019-06-25T15:28:01Z
dc.date.available 2019-06-25T15:28:01Z
dc.date.issued 2019-05-23
dc.identifier.uri http://hdl.handle.net/11372/LRT-2901
dc.description Annotated corpus of 350 decisions of Czech top-tier courts (Supreme Court, Supreme Administrative Court, Constitutional Court). 280 decisions were annotated by one trained annotator and then manually adjudicated by one trained curator. 70 decisions were annotated by two trained annotators and then manually adjudicated by one trained curator. Adjudication was conducted destructively, so the dataset contains only the correct annotations and does not contain all original annotations. The corpus was developed as training and testing material for text segmentation tasks. The dataset contains decisions segmented into Header, Procedural History, Submission/Rejoinder, Court Argumentation, Footer, Footnotes, and Dissenting Opinion. Segmentation makes it possible to treat different parts of a text differently even if they contain similar linguistic or other features.
dc.language.iso ces
dc.publisher Masaryk University, Brno
dc.relation.isreferencedby http://jusletter-it.weblaw.ch/issues/2019/23-Mai-2019/automatic-segmentati_f1ab10b8b5.html
dc.rights Creative Commons - Attribution 4.0 International (CC BY 4.0)
dc.rights.uri http://creativecommons.org/licenses/by/4.0/
dc.source.uri https://www.muni.cz/vyzkum/projekty/36467
dc.subject document segmentation
dc.subject legal texts
dc.title Annotated Corpus of Czech Case Law for Segmentation Tasks
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
dc.rights.label PUB
has.files yes
branding LRT + Open Submissions
contact.person Jakub Harašta jakub.harasta@law.muni.cz Masaryk University, Brno
sponsor Grantová agentura ČR GA17-20645S Exaktní hodnocení aplikační relevance judikatury nationalFunds
size.info 350 articles
files.size 15226360
files.count 2


Files in this item

Download all files in this item (14.52 MB)
License category:
Publicly Available

License: Creative Commons - Attribution 4.0 International (CC BY 4.0)
Distributed under Creative Commons Attribution Required
Name: corpus.json
Size: 13.83 MB
Format: Unknown
Description: Corpus (gold)
MD5: 5de341ef2545591f10b5283ef322386e
Name: ReadMe.pdf
Size: 712.6 KB
Format: PDF
Description: ReadMe
MD5: ea46ea2b87576df5daedaecfcaf5e9e7
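
The MD5 values listed for the two files can be used to check that a download arrived intact. The following is a minimal sketch, assuming the files were saved under their listed names in a single local directory; the helper names `md5_of_file` and `verify_downloads` are illustrative and are not part of the corpus or the repository. MD5 here guards only against accidental corruption, not against deliberate tampering.

```python
import hashlib
from pathlib import Path

# MD5 values copied from the file listing in this record.
EXPECTED_MD5 = {
    "corpus.json": "5de341ef2545591f10b5283ef322386e",
    "ReadMe.pdf": "ea46ea2b87576df5daedaecfcaf5e9e7",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each listed file name to True (match), False (mismatch) or None (missing).

    The directory layout is an assumption: both files are expected to sit
    directly inside `directory` under the names shown in the record.
    """
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        if path.is_file():
            results[name] = (md5_of_file(path) == expected)
        else:
            results[name] = None
    return results


if __name__ == "__main__":
    labels = {True: "OK", False: "MISMATCH", None: "not found"}
    for name, outcome in verify_downloads().items():
        print(f"{name}: {labels[outcome]}")
```

Run from the download directory, the script prints one status line per file. A mismatch usually means the download was truncated or altered and should be repeated.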
