SLäNDa 2.0

Name: SLäNDa 2.0
License: http://creativecommons.org/licenses/by-nc-sa/4.0/

Stymne, Sara; Östman, Carin

dc.contributor.author	Stymne, Sara
dc.contributor.author	Östman, Carin
dc.date.accessioned	2022-05-09T12:47:41Z
dc.date.available	2022-05-09T12:47:41Z
dc.date.issued	2022-05-09
dc.identifier.uri	http://hdl.handle.net/11372/LRT-4739
dc.description	SLäNDa, the Swedish literature corpus of narrative and dialogue, is a corpus made up of eight Swedish literary novels from the 19th and early 20th centuries, manually annotated mainly for different aspects of dialogue. The full annotation also covers other cited materials, such as thoughts, signs and letters. These categories are included mainly so that the main narrative can be identified as all remaining unannotated text. SLäNDa version 2.0 extends version 1.0 mainly by adding more data, but also through additional quality control and a slight modification of the annotation scheme. In addition, the data is organized into test sets with different types of speech marking: quotation marks, dashes, and no marking.
dc.language.iso	swe
dc.publisher	Uppsala University
dc.relation.replaces	http://hdl.handle.net/11372/LRT-3169
dc.rights	Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
dc.rights.uri	http://creativecommons.org/licenses/by-nc-sa/4.0/
dc.subject	literature
dc.subject	literary fiction
dc.subject	dialogue
dc.subject	narrative
dc.subject	cited materials
dc.title	SLäNDa 2.0
dc.type	corpus
metashare.ResourceInfo#ContentInfo.mediaType	text
dc.rights.label	PUB
has.files	yes
branding	LRT + Open Submissions
contact.person	Sara Stymne sara.stymne@lingfil.uu.se Uppsala University
contact.person	Carin Östman carin.ostman@nordiska.uu.se Uppsala University
sponsor	Vetenskapsrådet (Swedish Research Council) 2020-02617 Fictional prose and language change. The role of colloquialization in the history of Swedish 1830–1930 nationalFunds
size.info	255325 tokens
size.info	3287 utterances
files.size	9542776
files.count	1

Files in this item

License category:

Publicly Available

License: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)

Name: slanda-2.0.zip
Size: 9.1 MB
Format: application/zip
Description: SläNDa.2.0
MD5: 51509d53feadb901513160876c2307b3
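
The MD5 value above can be used to check that a downloaded copy of the archive is intact. The following is a minimal sketch, not part of the record itself: the function names `md5_of_file` and `matches_expected` are hypothetical, and the local path `slanda-2.0.zip` assumes the file was saved under its listed name in the current directory.

```python
import hashlib
from pathlib import Path

# MD5 checksum as listed in this record for slanda-2.0.zip.
EXPECTED_MD5 = "51509d53feadb901513160876c2307b3"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Compare a file's MD5 digest with the expected value, ignoring case."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed download location; adjust to wherever the archive was saved.
    archive = Path("slanda-2.0.zip")
    if archive.exists():
        print("checksum OK" if matches_expected(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found")
```

A mismatch usually means the download was truncated or corrupted and should be repeated.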

File preview

slanda-2.0
- LREC2022
  - test2-none.orig.conll (233 kB)
  - test1-none.orig.conll (50 kB)
  - test1-quote.stripped.conll (87 kB)
  - test2-dash.orig.conll (356 kB)
  - train-dev.stripped.conll (143 kB)
  - test1-dash.orig.conll (127 kB)
  - test2-none.stripped.conll (232 kB)
  - train-train.orig.conll (1018 kB)
  - test1-none.stripped.conll (50 kB)
  - train-dev.orig.conll (146 kB)
  - test2-dash.stripped.conll (349 kB)
  - train-train.stripped.conll (993 kB)
  - test1-quote.orig.conll (90 kB)
  - test1-dash.stripped.conll (126 kB)
- README (10 kB)
- TSV
  - per-author
    - train.ber5.tsv (96 kB)
    - train.lev8.tsv (99 kB)
    - test2d.roo7.tsv (341 kB)
    - test.sod11.tsv (14 kB)
    - train.sod29.tsv (54 kB)
    - test.ber11.tsv (260 kB)
    - train.hei20.tsv (122 kB)
    - train.ber29.tsv (77 kB)
    - test.ben11.tsv (320 kB)
    - train.ryd11.tsv (240 kB)
    - train.ryd5.tsv (209 kB)
    - train.ber17.tsv (120 kB)
    - test2n.nor32.tsv (509 kB)
    - train.sod5.tsv (38 kB)
    - train.ben8.tsv (238 kB)
    - test2d.fly20.tsv (252 kB)
    - train.mal18.tsv (272 kB)
    - train.ber8.tsv (193 kB)
    - test2d.kru10.tsv (110 kB)
    - train.hei11.tsv (163 kB)
    - train.san5.tsv (541 kB)
    - test2d.kru29.tsv (82 kB)
    - train.sod8.tsv (36 kB)
    - test2n.alm5.tsv (147 kB)
    - test.lev11.tsv (81 kB)
    - test2n.nor11.tsv (74 kB)
    - train.lag12.tsv (82 kB)
    - test2d.kru13.tsv (110 kB)
    - train.boy15.tsv (142 kB)
    - train.ben12.tsv (472 kB)
    - test.str11.tsv (174 kB)
    - train.ben3.tsv (188 kB)
    - train.sod20.tsv (104 kB)
    - train.ber20.tsv (33 kB)
    - test2d.pal19.tsv (252 kB)
    - test2n.ced2.tsv (60 kB)
    - train.str13.tsv (204 kB)
    - train.boy18.tsv (202 kB)
    - test.san2.tsv (257 kB)
    - train.ben15.tsv (326 kB)
    - train.lev2.tsv (130 kB)
    - train.ben6.tsv (741 kB)
    - test2d.roo11.tsv (453 kB)
    - train.sod23.tsv (78 kB)
    - test2n.pal16.tsv (150 kB)
    - train.ber23.tsv (87 kB)
    - test2d.kru32.tsv (71 kB)
    - train.boy14.tsv (113 kB)
    - test2n.ced5.tsv (288 kB)
    - train.ben2.tsv (175 kB)
    - train.hei5.tsv (135 kB)
    - train.lag6.tsv (409 kB)
    - train.ber2.tsv (125 kB)
    - test.boy11.tsv (213 kB)
    - train.ben9.tsv (293 kB)
    - train.lev5.tsv (150 kB)
    - train.sod26.tsv (39 kB)
    - train.ryd20.tsv (259 kB)
    - test2n.pal19.tsv (154 kB)
    - train.ber26.tsv (130 kB)
    - train.sod14.tsv (15 kB)
    - test2d.elk11.tsv (346 kB)
    - train.lag2.tsv (142 kB)
    - train.ber14.tsv (175 kB)
    - train.ben14.tsv (296 kB)
    - test2d.kru23.tsv (85 kB)
    - train.sod2.tsv (68 kB)
    - train.ben5.tsv (430 kB)
    - test.lag11.tsv (185 kB)
  - per-dataset
    - test1-dash.tsv (726 kB)
    - test2-none.tsv (1 MB)
    - test2-dash.tsv (2 MB)
    - test1-none.tsv (260 kB)
    - test1-quote.tsv (521 kB)
    - train-all.tsv (8 MB)
- IOB
  - train-all.iob (2 MB)
  - test1-dash.iob (245 kB)
  - test2-none.iob (491 kB)
  - test2-dash.iob (722 kB)
  - test1-none.iob (84 kB)
  - test1-quote.iob (177 kB)