dc.contributor.author | Harašta, Jakub |
dc.contributor.author | Šavelka, Jaromír |
dc.contributor.author | Kasl, František |
dc.contributor.author | Míšek, Jakub |
dc.date.accessioned | 2019-06-25T15:28:01Z |
dc.date.available | 2019-06-25T15:28:01Z |
dc.date.issued | 2019-05-23 |
dc.identifier.uri | http://hdl.handle.net/11372/LRT-2901 |
dc.description | Annotated corpus of 350 decisions of Czech top-tier courts (Supreme Court, Supreme Administrative Court, Constitutional Court). 280 decisions were annotated by one trained annotator and then manually adjudicated by one trained curator. 70 decisions were annotated by two trained annotators and then manually adjudicated by one trained curator. Adjudication was conducted destructively, so the dataset contains only the correct annotations and does not contain all original annotations. The corpus was developed as training and testing material for text segmentation tasks. The dataset contains decisions segmented into Header, Procedural History, Submission/Rejoinder, Court Argumentation, Footer, Footnotes, and Dissenting Opinion. Segmentation makes it possible to treat different parts of a text differently even if they contain similar linguistic or other features. |
dc.language.iso | ces |
dc.publisher | Masaryk University, Brno |
dc.relation.isreferencedby | http://jusletter-it.weblaw.ch/issues/2019/23-Mai-2019/automatic-segmentati_f1ab10b8b5.html |
dc.rights | Creative Commons - Attribution 4.0 International (CC BY 4.0) |
dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ |
dc.source.uri | https://www.muni.cz/vyzkum/projekty/36467 |
dc.subject | document segmentation |
dc.subject | legal texts |
dc.title | Annotated Corpus of Czech Case Law for Segmentation Tasks |
dc.type | corpus |
metashare.ResourceInfo#ContentInfo.mediaType | text |
dc.rights.label | PUB |
has.files | yes |
branding | LRT + Open Submissions |
contact.person | Jakub Harašta jakub.harasta@law.muni.cz Masaryk University, Brno |
sponsor | Grantová agentura ČR GA17-20645S Exaktní hodnocení aplikační relevance judikatury nationalFunds |
size.info | 350 articles |
files.size | 15226360 |
files.count | 2 |
Files in this item
Name | Size | Format | Description | MD5 |
corpus.json | 13.83 MB | Unknown | Corpus (gold) | 5de341ef2545591f10b5283ef322386e |
ReadMe.pdf | 712.6 KB | | ReadMe | ea46ea2b87576df5daedaecfcaf5e9e7 |
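The MD5 values listed for the two files can be used to check that a download is intact. The following is a minimal sketch, assuming both files were saved under their listed names in one local directory. The function names `md5_of_file` and `verify_downloads` are illustrative and not part of the repository or the corpus.

```python
import hashlib
from pathlib import Path

# MD5 digests exactly as listed in this record.
EXPECTED_MD5 = {
    "corpus.json": "5de341ef2545591f10b5283ef322386e",
    "ReadMe.pdf": "ea46ea2b87576df5daedaecfcaf5e9e7",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each listed file name to True (match), False (mismatch),
    or None (file not present in the directory)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        if path.is_file():
            results[name] = (md5_of_file(path) == expected)
        else:
            results[name] = None
    return results


if __name__ == "__main__":
    labels = {True: "OK", False: "MISMATCH", None: "not found"}
    for name, outcome in verify_downloads().items():
        print(f"{name}: {labels[outcome]}")
```

Run from the download directory, the script prints one status line per file. A mismatch usually indicates a truncated or corrupted download, in which case the file should be fetched again.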