Show simple item record

 
dc.contributor.author Aires, João Paulo
dc.date.accessioned 2022-04-20T11:35:13Z
dc.date.available 2022-04-20T11:35:13Z
dc.date.issued 2022-04-11
dc.identifier.uri http://hdl.handle.net/11234/1-4703
dc.description Document-level testsuite for evaluating gender translation consistency. Our document-level test set consists of selected English documents from the WMT21 newstest, annotated with gender information. Unannotated Czech references are also included for convenience. We semi-automatically annotated person names and pronouns to identify the gender of these elements as well as their coreferences. Our proposed annotation consists of three elements: (1) an ID, (2) an element class, and (3) gender. The ID identifies a person's name and its occurrences (name and pronouns). The element class identifies whether the tag refers to a name or a pronoun. Finally, the gender information defines whether the element is masculine or feminine. We applied a series of NLP techniques to automatically identify person names and coreferences. This initial process resulted in a set of 45 documents to be manually annotated. We then began manually annotating these documents to make sure they are correctly tagged. See README.md for more details.
dc.language.iso eng
dc.language.iso ces
dc.publisher Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
dc.relation info:eu-repo/grantAgreement/EC/H2020/825303
dc.rights Creative Commons - Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
dc.rights.uri http://creativecommons.org/licenses/by-nc/4.0/
dc.subject machine translation
dc.subject testsuite
dc.subject evaluation
dc.subject gender
dc.title Machine Translation Testsuite for Gender-Consistent Translation
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
dc.rights.label PUB
has.files yes
branding LINDAT / CLARIAH-CZ
contact.person João Paulo Aires aires@ufal.mff.cuni.cz Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
sponsor European Union EC/H2020/825303 Bergamot - Browser-based Multilingual Translation euFunds info:eu-repo/grantAgreement/EC/H2020/825303
size.info 172 articles
size.info 2914 sentences
size.info 73018 words
files.size 704177
files.count 3
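
The three-element annotation described in dc.description (an ID, an element class, and a gender) can be pictured as a small record type. The sketch below is illustrative only: the `Mention` class, its field names, the `inconsistent_entities` helper, and the sample mentions are all assumptions made up for this example. The actual tag syntax used in the `.annotated` files is defined in README.md and is not reproduced here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """One annotated element (hypothetical representation, not the file format)."""
    entity_id: int      # (1) ID shared by a person's name and its pronouns
    element_class: str  # (2) "name" or "pronoun"
    gender: str         # (3) "masculine" or "feminine"
    text: str           # surface form; an extra field assumed for illustration


def inconsistent_entities(mentions):
    """Return the sorted IDs whose mentions do not all carry the same gender."""
    genders = defaultdict(set)
    for mention in mentions:
        genders[mention.entity_id].add(mention.gender)
    return sorted(eid for eid, seen in genders.items() if len(seen) > 1)


# Invented sample data: entity 2 is referred to with conflicting genders.
mentions = [
    Mention(1, "name", "feminine", "Maria"),
    Mention(1, "pronoun", "feminine", "she"),
    Mention(2, "name", "masculine", "John"),
    Mention(2, "pronoun", "feminine", "her"),
]
print(inconsistent_entities(mentions))
```

Grouping by ID is what makes a document-level consistency check possible: every occurrence of the same person can be compared, wherever it appears in the document.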


 Files in this item

 Download all files in item (687.67 KB)
This item is Publicly Available and licensed under: Creative Commons - Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
Name: wmt-newstest-2020.zip
Size: 429.94 KB
Format: application/zip
Description: Newstest2020
MD5: 96a5bbc0fc24ce134051ea3816cdbadd
File Preview
  • wmt-newstest-2020
    • english
      • independent.281139.en (1 kB)
      • sciencedaily.com.75569.en (2 kB)
      • nytimes.232003.en (2 kB)
      • guardian.260695.en (4 kB)
      • euronews-en.185744.en (1 kB)
      • en.ndtv.com.12997.en (3 kB)
      • guardian.260697.en (1 kB)
      • seattle_times.7019.en (801 B)
      • foxnews.100085.en (769 B)
      • seattle_times.7657.en (1 kB)
      • foxnews.100086.en (3 kB)
      • cbsnews.302129.en (2 kB)
      • allafrica.15342.en (2 kB)
      • independent.281332.en (3 kB)
      • novinite.com.35568.en (1 kB)
      • rt.com.113876.en (2 kB)
      • en.ndtv.com.13004.en (912 B)
      • stv.tv.21636.en (2 kB)
      • huffingtonpost.com.19368.en (839 B)
      • rt.com.113877.en (933 B)
      • thesun.co.uk.27377.en (3 kB)
      • rt.com.113860.en (1 kB)
      • cnn.385803.en (581 B)
      • guardian.260748.en (3 kB)
      • seattle_times.7755.en (808 B)
      • dailymail.co.uk.365293.en (3 kB)
      • metro.co.uk.12158.en (2 kB)
      • foxnews.100073.en (1 kB)
      • standard.co.uk.14683.en (1014 B)
      • reuters.276541.en (1 kB)
      • foxnews.100074.en (1 kB)
      • seattle_times.7803.en (2 kB)
      • reuters.276559.en (1 kB)
      • reuters.276717.en (2 kB)
      • upi.205735.en (1 kB)
      • novinite.com.35557.en (1 kB)
      • en.ndtv.com.13152.en (2 kB)
      • en.ndtv.com.13041.en (1 kB)
      • seattle_times.7059.en (3 kB)
      • reuters.276702.en (2 kB)
      • euronews-en.185674.en (2 kB)
      • seattle_times.7141.en (842 B)
      • en.ndtv.com.13090.en (2 kB)
      • standard.co.uk.14688.en (1 kB)
      • seattle_times.7043.en (2 kB)
      • upi.205721.en (2 kB)
      • huffingtonpost.com.19390.en (2 kB)
      • cnbc.com.33999.en (1 kB)
      • aj-english.8642.en (549 B)
      • cnbc.com.33889.en (2 kB)
      • upi.205660.en (1 kB)
      • standard.co.uk.14562.en (2 kB)
      • upi.205724.en (2 kB)
      • standard.co.uk.14516.en (1 kB)
      • reuters.276692.en (1 kB)
      • standard.co.uk.14676.en (2 kB)
      • huffingtonpost.com.19347.en (2 kB)
      • reuters.276709.en (2 kB)
      • cnn.385672.en (704 B)
      • en.ndtv.com.13143.en (2 kB)
      • newrepublic.com.11841.en (3 kB)
      • telegraph.421297.en (1 kB)
      • nytimes.231903.en (1 kB)
      • cnn.385674.en (1 kB)
      • telegraph.421299.en (1 kB)
      • rte.en.ie.5343.en (1 kB)
      • upi.205650.en (915 B)
      • gateway-zh.135.en (2 kB)
      • huffingtonpost.com.19334.en (1 kB)
      • rt.com.113906.en (1 kB)
      • metro.co.uk.12122.en (934 B)
      • sciencedaily.com.75572.en (4 kB)
      • seattle_times.7674.en (801 B)
      • seattle_times.7452.en (897 B)
      • express.co.uk.10983.en (2 kB)
      • abcnews.363994.en (873 B)
      • cbsnews.302258.en (1 kB)
      • rt.com.113909.en (2 kB)
      • huffingtonpost.com.19385.en (1 kB)
      • heraldscotland.com.7318.en (3 kB)
      • abcnews.364001.en (567 B)
      • allafrica.15314.en (2 kB)
      • express.co.uk.11102.en (2 kB)
      • kcal.279.en (2 kB)
      • latimes.431856.en (2 kB)
      • telegraph.421272.en (1 kB)
      • huffingtonpost.com.19371.en (1 kB)
      • scotsman.155294.en (1 kB)
      • rte.en.ie.5333.en (1 kB)
      • foxnews.100091.en (1 kB)
      • en.ndtv.com.13073.en (2 kB)
      • huffingtonpost.com.19389.en (911 B)
      • latimes.431858.en (1 kB)
      • chicago_defender.80.en (1 kB)
      • novinite.com.35573.en (1 kB)
      • rt.com.113881.en (3 kB)
      • cnn.385761.en (3 kB)
      • metro.co.uk.12069.en (2 kB)
      • sky.com.20667.en (1 kB)
      • huffingtonpost.com.19376.en (3 kB)
      • euronews-en.185708.en (2 kB)
      • cnn.385555.en (1 kB)
      • dailymail.co.uk.365204.en (2 kB)
      • guardian.260755.en (1 kB)
      • sky.com.20669.en (1 kB)
      • cnn.385556.en (4 kB)
      • rt.com.113870.en (2 kB)
      • upi.205695.en (1 kB)
      • cbsnews.302172.en (2 kB)
    • czech
      • allafrica.15314.cs (3 kB)
      • abcnews.364001.cs (685 B)
      • heraldscotland.com.7318.cs (3 kB)
      • express.co.uk.11102.cs (2 kB)
      • kcal.279.cs (2 kB)
      • latimes.431856.cs (2 kB)
      • telegraph.421272.cs (1 kB)
      • huffingtonpost.com.19371.cs (2 kB)
      • scotsman.155294.cs (2 kB)
      • rte.en.ie.5333.cs (1 kB)
      • foxnews.100091.cs (1 kB)
      • en.ndtv.com.13073.cs (3 kB)
      • huffingtonpost.com.19389.cs (1018 B)
      • chicago_defender.80.cs (1 kB)
      • latimes.431858.cs (1 kB)
      • novinite.com.35573.cs (2 kB)
      • rt.com.113881.cs (3 kB)
      • cnn.385761.cs (3 kB)
      • metro.co.uk.12069.cs (2 kB)
      • sky.com.20667.cs (1 kB)
      • huffingtonpost.com.19376.cs (4 kB)
      • euronews-en.185708.cs (2 kB)
      • cnn.385555.cs (2 kB)
      • dailymail.co.uk.365204.cs (2 kB)
      • guardian.260755.cs (1 kB)
      • cnn.385556.cs (5 kB)
      • sky.com.20669.cs (1 kB)
      • rt.com.113870.cs (3 kB)
      • upi.205695.cs (1 kB)
      • cbsnews.302172.cs (3 kB)
      • independent.281139.cs (1 kB)
      • sciencedaily.com.75569.cs (2 kB)
      • nytimes.232003.cs (2 kB)
      • guardian.260695.cs (4 kB)
      • euronews-en.185744.cs (1 kB)
      • en.ndtv.com.12997.cs (4 kB)
      • seattle_times.7019.cs (890 B)
      • guardian.260697.cs (1 kB)
      • foxnews.100085.cs (942 B)
      • seattle_times.7657.cs (2 kB)
      • foxnews.100086.cs (3 kB)
      • cbsnews.302129.cs (2 kB)
      • allafrica.15342.cs (3 kB)
      • independent.281332.cs (4 kB)
      • novinite.com.35568.cs (1 kB)
      • rt.com.113876.cs (3 kB)
      • en.ndtv.com.13004.cs (1 kB)
      • stv.tv.21636.cs (3 kB)
      • huffingtonpost.com.19368.cs (897 B)
      • rt.com.113877.cs (960 B)
      • thesun.co.uk.27377.cs (3 kB)
      • rt.com.113860.cs (1 kB)
      • cnn.385803.cs (755 B)
      • guardian.260748.cs (3 kB)
      • seattle_times.7755.cs (945 B)
      • dailymail.co.uk.365293.cs (4 kB)
      • metro.co.uk.12158.cs (2 kB)
      • foxnews.100073.cs (1 kB)
      • standard.co.uk.14683.cs (1 kB)
      • reuters.276541.cs (1 kB)
      • foxnews.100074.cs (1 kB)
      • seattle_times.7803.cs (3 kB)
      • reuters.276559.cs (2 kB)
      • upi.205735.cs (2 kB)
      • reuters.276717.cs (3 kB)
      • novinite.com.35557.cs (1 kB)
      • en.ndtv.com.13152.cs (2 kB)
      • en.ndtv.com.13041.cs (2 kB)
      • seattle_times.7059.cs (3 kB)
      • reuters.276702.cs (2 kB)
      • euronews-en.185674.cs (3 kB)
      • seattle_times.7141.cs (950 B)
      • en.ndtv.com.13090.cs (2 kB)
      • upi.205721.cs (2 kB)
      • seattle_times.7043.cs (3 kB)
      • standard.co.uk.14688.cs (1 kB)
      • huffingtonpost.com.19390.cs (2 kB)
      • cnbc.com.33999.cs (1 kB)
      • aj-english.8642.cs (687 B)
      • cnbc.com.33889.cs (2 kB)
      • standard.co.uk.14562.cs (3 kB)
      • upi.205660.cs (1 kB)
      • upi.205724.cs (3 kB)
      • standard.co.uk.14516.cs (1 kB)
      • reuters.276692.cs (1 kB)
      • standard.co.uk.14676.cs (2 kB)
      • huffingtonpost.com.19347.cs (2 kB)
      • cnn.385672.cs (752 B)
      • reuters.276709.cs (3 kB)
      • telegraph.421297.cs (1 kB)
      • en.ndtv.com.13143.cs (2 kB)
      • newrepublic.com.11841.cs (4 kB)
      • nytimes.231903.cs (1 kB)
      • cnn.385674.cs (1 kB)
      • telegraph.421299.cs (1 kB)
      • rte.en.ie.5343.cs (1 kB)
      • upi.205650.cs (1 kB)
      • rt.com.113906.cs (1 kB)
      • huffingtonpost.com.19334.cs (1 kB)
      • gateway-zh.135.cs (3 kB)
      • metro.co.uk.12122.cs (1 kB)
      • sciencedaily.com.75572.cs (5 kB)
      • seattle_times.7674.cs (838 B)
      • seattle_times.7452.cs (1 kB)
      • express.co.uk.10983.cs (2 kB)
      • abcnews.363994.cs (925 B)
      • cbsnews.302258.cs (1 kB)
      • rt.com.113909.cs (2 kB)
      • huffingtonpost.com.19385.cs (2 kB)
    • manually_reviewed
      • cnbc.com.33999.annotated (1 kB)
      • cbsnews.302172.annotated (3 kB)
      • cnn.385674.annotated (1 kB)
      • abcnews.364001.annotated (664 B)
      • cbsnews.302258.annotated (1 kB)
      • allafrica.15314.annotated (3 kB)
      • cnn.385556.annotated (5 kB)
      • chicago_defender.80.annotated (2 kB)
      • allafrica.15342.annotated (3 kB)
      • abcnews.363994.annotated (1 kB)
      • cnn.385555.annotated (2 kB)
      • cnbc.com.33889.annotated (2 kB)
      • cnn.385672.annotated (841 B)
      • cbsnews.302129.annotated (3 kB)
      • aj-english.8642.annotated (604 B)
    • not_manually_reviewed
      • gateway-zh.135.annotated (3 kB)
      • guardian.260695.annotated (4 kB)
      • en.ndtv.com.12997.annotated (3 kB)
      • huffingtonpost.com.19334.annotated (2 kB)
      • kcal.279.annotated (2 kB)
      • telegraph.421299.annotated (1 kB)
      • en.ndtv.com.13073.annotated (3 kB)
      • latimes.431856.annotated (2 kB)
      • seattle_times.7755.annotated (1 kB)
      • sky.com.20669.annotated (1 kB)
      • seattle_times.7043.annotated (2 kB)
      • euronews-en.185744.annotated (1 kB)
      • novinite.com.35573.annotated (1 kB)
      • rte.en.ie.5343.annotated (1 kB)
      • metro.co.uk.12122.annotated (1 kB)
      • dailymail.co.uk.365293.annotated (4 kB)
      • standard.co.uk.14562.annotated (3 kB)
      • rt.com.113881.annotated (4 kB)
      • huffingtonpost.com.19389.annotated (1 kB)
      • foxnews.100085.annotated (902 B)
      • novinite.com.35557.annotated (1 kB)
      • telegraph.421297.annotated (1 kB)
      • heraldscotland.com.7318.annotated (3 kB)
      • foxnews.100073.annotated (1 kB)
      • reuters.276541.annotated (1 kB)
      • reuters.276709.annotated (2 kB)
      • thesun.co.uk.27377.annotated (4 kB)
      • metro.co.uk.12158.annotated (2 kB)
      • huffingtonpost.com.19368.annotated (996 B)
      • reuters.276692.annotated (1 kB)
      • en.ndtv.com.13090.annotated (2 kB)
      • sky.com.20667.annotated (1 kB)
      • rt.com.113877.annotated (1 kB)
      • dailymail.co.uk.365204.annotated (2 kB)
      • guardian.260748.annotated (3 kB)
      • standard.co.uk.14676.annotated (2 kB)
      • sciencedaily.com.75569.annotated (2 kB)
      • upi.205695.annotated (2 kB)
      • huffingtonpost.com.19347.annotated (2 kB)
      • rt.com.113860.annotated (1 kB)
      • upi.205735.annotated (2 kB)
      • seattle_times.7657.annotated (1 kB)
      • upi.205650.annotated (1 kB)
      • standard.co.uk.14688.annotated (2 kB)
      • en.ndtv.com.13041.annotated (2 kB)
      • euronews-en.185674.annotated (3 kB)
      • standard.co.uk.14683.annotated (1 kB)
      • guardian.260755.annotated (1 kB)
      • newrepublic.com.11841.annotated (3 kB)
      • upi.205721.annotated (2 kB)
      • metro.co.uk.12069.annotated (2 kB)
      • rt.com.113870.annotated (2 kB)
      • en.ndtv.com.13152.annotated (2 kB)
      • foxnews.100074.annotated (1 kB)
      • seattle_times.7803.annotated (3 kB)
      • reuters.276702.annotated (2 kB)
      • scotsman.155294.annotated (2 kB)
      • foxnews.100086.annotated (3 kB)
      • standard.co.uk.14516.annotated (1 kB)
      • nytimes.231903.annotated (1 kB)
      • cnn.385803.annotated (596 B)
      • huffingtonpost.com.19385.annotated (2 kB)
      • seattle_times.7452.annotated (910 B)
      • reuters.276559.annotated (1 kB)
      • upi.205660.annotated (1 kB)
      • rt.com.113906.annotated (1 kB)
      • en.ndtv.com.13143.annotated (2 kB)
      • huffingtonpost.com.19376.annotated (4 kB)
      • nytimes.232003.annotated (2 kB)
      • en.ndtv.com.13004.annotated (1 kB)
      • independent.281139.annotated (2 kB)
      • express.co.uk.10983.annotated (2 kB)
      • seattle_times.7059.annotated (3 kB)
      • independent.281332.annotated (4 kB)
      • telegraph.421272.annotated (1 kB)
      • huffingtonpost.com.19371.annotated (2 kB)
      • upi.205724.annotated (3 kB)
      • seattle_times.7019.annotated (1 kB)
      • seattle_times.7674.annotated (1 kB)
      • reuters.276717.annotated (2 kB)
      • guardian.260697.annotated (1 kB)
      • latimes.431858.annotated (1 kB)
      • euronews-en.185708.annotated (2 kB)
      • seattle_times.7141.annotated (953 B)
      • stv.tv.21636.annotated (2 kB)
      • huffingtonpost.com.19390.annotated (2 kB)
      • sciencedaily.com.75572.annotated (4 kB)
      • express.co.uk.11102.annotated (2 kB)
      • novinite.com.35568.annotated (1 kB)
      • rt.com.113876.annotated (2 kB)
      • foxnews.100091.annotated (1 kB)
      • cnn.385761.annotated (3 kB)
      • rte.en.ie.5333.annotated (1 kB)
      • rt.com.113909.annotated (2 kB)
Name: wmt-newstest-2021.zip
Size: 255.96 KB
Format: application/zip
Description: Newstest2021
MD5: 1ca301d5d5480a9b5a8a36ed1d92859e
File Preview
  • wmt-newstest-2021
    • english
      • en.ndtv.com.75109.en-en.ndtv.com.75143.doc.cs.aligned (497 B)
      • en.ndtv.com.75109.en-en.ndtv.com.75227.cs.aligned (732 B)
      • voa-en.100681.en (1 kB)
      • seattle_times.253486.en-seattle_times.253342.sent.cs.aligned (601 B)
      • sky.com.33889.en-sky.com.33889.doc.cs.aligned (1 kB)
      • allafrica.53572.en-allafrica.53568.cs.aligned (770 B)
      • seattle_times.253486.en (1 kB)
      • en.ndtv.com.75227.en-en.ndtv.com.75227.cs.aligned (745 B)
      • bbc.500856.en-bbc.500856.cs.aligned (1 kB)
      • allafrica.53572.en-allafrica.53572.cs.aligned (1 kB)
      • express.co.uk.44069.en-express.co.uk.44069.cs.aligned (872 B)
      • rte.en.ie.30771.en-rte.en.ie.30771.doc.cs.aligned-rte.en.ie.30771.sent.cs.aligned (906 B)
      • allafrica.53572.en (2 kB)
      • seattle_times.253342.en-seattle_times.253342.cs.aligned (1 kB)
      • bbc.500882.en (1 kB)
      • foxnews.106543.en-foxnews.106543.cs.aligned (1 kB)
      • express.co.uk.44121.en (1 kB)
      • rt.com.131269.en-rt.com.131269.cs.aligned (1 kB)
      • standard.co.uk.57768.en-standard.co.uk.57768.cs.aligned (1 kB)
      • nytimes.268754.en-nytimes.268754.cs.aligned-nytimes.268754.cs.aligned (1 kB)
      • thesun.co.uk.114364.en (1 kB)
      • bbc.500855.en-bbc.500855.cs.aligned (1 kB)
      • standard.co.uk.57834.en-standard.co.uk.57834.sent.cs.aligned (2 kB)
      • abcnews.420117.en (1 kB)
      • allafrica.53572.en-allafrica.53572.cs.aligned-allafrica.53568.sent.cs.aligned (680 B)
      • express.co.uk.44141.en-express.co.uk.44141.cs.aligned (344 B)
      • rte.en.ie.30771.en-rte.en.ie.30771.cs.aligned (1 kB)
      • voa-en.100681.en-voa-en.100681.cs.aligned-voa-en.100681.doc.cs.aligned (1 kB)
      • en.ndtv.com.75106.en-en.ndtv.com.75106.cs.aligned (1 kB)
      • aj-english.22076.en-aj-english.22076.cs.aligned-aj-english.22076.cs.aligned-aj-english.22076.doc.cs.aligned (1 kB)
      • brisbanetimes.com.au.275198.en-brisbanetimes.com.au.275198.sent.cs.aligned (2 kB)
      • standard.co.uk.57834.en-standard.co.uk.57768.doc.cs.aligned (1 kB)
      • bbc.500856.en-bbc.500856.cs.aligned-bbc.500855.sent.cs.aligned (1 kB)
      • independent.597197.en-independent.597197.doc.cs.aligned (1 kB)
      • abcnews.420122.en-abcnews.420122.cs.aligned (565 B)
      • standard.co.uk.57768.en (1 kB)
      • dailymail.co.uk.432355.en-dailymail.co.uk.432355.cs.aligned (1 kB)
      • abcnews.420140.en-abcnews.420140.cs.aligned (2 kB)
      • egyptindependent.com.6424.en-egyptindependent.com.6424.sent.cs.aligned (1017 B)
      • abcnews.420117.en-abcnews.420117.cs.aligned (978 B)
      • independent.597197.en-independent.597197.doc.cs.aligned-independent.597197.sent.cs.aligned (1 kB)
      • abcnews.420140.en-abcnews.420140.sent.cs.aligned (2 kB)
      • bbc.500823.en (5 kB)
      • foxnews.106543.en-foxnews.106543.cs.aligned-foxnews.106537.doc.cs.aligned (1 kB)
      • aj-english.22076.en-aj-english.22076.cs.aligned-aj-english.22076.cs.aligned-aj-english.22076.sent.cs.aligned (1 kB)
      • voa-en.100681.en-voa-en.100681.cs.aligned-voa-en.100681.sent.cs.aligned (973 B)
      • rt.com.131224.en (3 kB)
      • metro.co.uk.47971.en-metro.co.uk.47971.sent.cs.aligned (1 kB)
      • seattle_times.253356.en-seattle_times.253356.cs.aligned (1 kB)
      • aj-english.22076.en-aj-english.22076.cs.aligned-aj-english.22076.cs.aligned (1 kB)
      • metro.co.uk.47971.en (1 kB)
      • thesun.co.uk.114364.en-thesun.co.uk.114364.doc.cs.aligned (1 kB)
      • abcnews.420184.en-abcnews.420184.cs.aligned (1 kB)
      • dailymail.co.uk.432355.en-dailymail.co.uk.432355.doc.cs.aligned (1 kB)
      • seattle_times.253486.en-seattle_times.253486.cs.aligned (1014 B)
      • cbsnews.377408.en-cbsnews.377408.cs.aligned (1 kB)
      • rte.en.ie.30771.en (1 kB)
      • cbsnews.377408.en (2 kB)
      • standard.co.uk.57834.en-standard.co.uk.57834.cs.aligned (2 kB)
      • express.co.uk.44090.en-express.co.uk.44090.cs.aligned-express.co.uk.44141.sent.cs.aligned (326 B)
      • cbsnews.377408.en-cbsnews.377538.doc.cs.aligned (2 kB)
      • bbc.500823.en-bbc.500823.cs.aligned (3 kB)
      • foxnews.106543.en-foxnews.106543.cs.aligned-foxnews.106537.sent.cs.aligned (1 kB)
      • abcnews.420140.en (4 kB)
      • seattle_times.253356.en (1 kB)
      • express.co.uk.44069.en (1 kB)
      • egyptindependent.com.6424.en-egyptindependent.com.6424.doc.cs.aligned (989 B)
      • rt.com.131229.en (1 kB)
      • allafrica.53568.en (1 kB)
      • en.ndtv.com.75114.en (2 kB)
      • nytimes.268754.en (3 kB)
      • rt.com.131229.en-rt.com.131229.cs.aligned (1 kB)
      • rt.com.131279.en (1 kB)
      • metro.co.uk.47971.en-metro.co.uk.47971.doc.cs.aligned (1 kB)
      • seattle_times.253342.en (1 kB)
      • dailymail.co.uk.432355.en-dailymail.co.uk.432355.sent.cs.aligned (1 kB)
      • foxnews.106537.en-foxnews.106537.cs.aligned (1 kB)
      • foxnews.106543.en (2 kB)
      • en.ndtv.com.75227.en (1 kB)
      • cbsnews.377408.en-cbsnews.377408.sent.cs.aligned (1 kB)
      • allafrica.53572.en-allafrica.53572.cs.aligned-allafrica.53572.doc.cs.aligned (1 kB)
      • independent.597197.en (1 kB)
      • nytimes.268754.en-nytimes.268754.cs.aligned-nytimes.268754.cs.aligned-nytimes.268790.doc.cs.aligned (2 kB)
      • rt.com.131269.en-rt.com.131269.cs.aligned-rt.com.131269.sent.cs.aligned (1 kB)
      • aj-english.22076.en (2 kB)
      • aj-english.22076.en-aj-english.22076.cs.aligned (1 kB)
      • allafrica.53568.en-allafrica.53568.cs.aligned (796 B)
      • rt.com.131269.en-rt.com.131269.cs.aligned-rt.com.131224.doc.cs.aligned (1 kB)
      • nytimes.268790.en (3 kB)
      • bbc.500882.en-bbc.500882.cs.aligned (882 B)
      • express.co.uk.44090.en (1 kB)
      • express.co.uk.44090.en-express.co.uk.44090.cs.aligned-express.co.uk.44090.cs.aligned (1 kB)
      • egyptindependent.com.6424.en (1 kB)
      • metro.co.uk.47971.en-metro.co.uk.47971.cs.aligned (1 kB)
      • dailymail.co.uk.432355.en (2 kB)
      • cbsnews.377538.en-cbsnews.377538.cs.aligned (2 kB)
      • rt.com.131269.en (1 kB)
      • brisbanetimes.com.au.275198.en-brisbanetimes.com.au.275198.doc.cs.aligned (2 kB)
      • express.co.uk.44121.en-express.co.uk.44121.cs.aligned (997 B)
      • voa-en.100681.en-voa-en.100681.cs.aligned (1 kB)
      • en.ndtv.com.75106.en (2 kB)
      • independent.597197.en-independent.597197.cs.aligned (1 kB)
      • express.co.uk.44141.en (432 B)
      • foxnews.106543.en-foxnews.106543.cs.aligned-foxnews.106543.cs.aligned (1 kB)
      • bbc.500855.en (1 kB)
      • en.ndtv.com.75109.en-en.ndtv.com.75109.cs.aligned (1 kB)
      • abcnews.420184.en (1 kB)
      • nytimes.268754.en-nytimes.268754.cs.aligned (1 kB)
      • en.ndtv.com.75108.en (2 kB)
      • seattle_times.253486.en-seattle_times.253486.doc.cs.aligned (970 B)
      • bbc.500856.en (1 kB)
      • bbc.500856.en-bbc.500856.cs.aligned-bbc.500856.doc.cs.aligned (941 B)
      • sky.com.33889.en (2 kB)
      • bbc.500856.en-bbc.500855.cs.aligned (1 kB)
      • abcnews.420140.en-abcnews.420122.doc.cs.aligned (541 B)
      • rt.com.131279.en-rt.com.131279.cs.aligned (1 kB)
      • nytimes.268790.en-nytimes.268790.cs.aligned (1 kB)
      • en.ndtv.com.75109.en (2 kB)
      • foxnews.106537.en (2 kB)
      • en.ndtv.com.75114.en-en.ndtv.com.75114.cs.aligned (1 kB)
      • thesun.co.uk.114364.en-thesun.co.uk.114364.cs.aligned (1 kB)
      • voa-en.100681.en-voa-en.100681.cs.aligned-voa-en.100681.cs.aligned (1 kB)
      • cbsnews.377538.en (3 kB)
      • abcnews.420122.en (836 B)
      • standard.co.uk.57834.en (3 kB)
      • en.ndtv.com.75109.en-en.ndtv.com.75106.sent.cs.aligned (1 kB)
      • rt.com.131224.en-rt.com.131224.cs.aligned (2 kB)
      • thesun.co.uk.114364.en-thesun.co.uk.114364.sent.cs.aligned (1 kB)
      • express.co.uk.44090.en-express.co.uk.44090.cs.aligned (1 kB)
      • en.ndtv.com.75143.en-en.ndtv.com.75143.cs.aligned (552 B)
      • en.ndtv.com.75143.en (795 B)
      • egyptindependent.com.6424.en-egyptindependent.com.6424.cs.aligned (949 B)
      • sky.com.33889.en-sky.com.33889.sent.cs.aligned (1 kB)
      • express.co.uk.44090.en-express.co.uk.44090.cs.aligned-express.co.uk.44121.doc.cs.aligned (958 B)
      • en.ndtv.com.75108.en-en.ndtv.com.75108.cs.aligned (1 kB)
      • nytimes.268754.en-nytimes.268754.cs.aligned-nytimes.268754.cs.aligned-nytimes.268754.sent.cs.aligned (1 kB)
      • sky.com.33889.en-sky.com.33889.cs.aligned (1 kB)
      • brisbanetimes.com.au.275198.en (2 kB)
      • brisbanetimes.com.au.275198.en-brisbanetimes.com.au.275198.cs.aligned (1 kB)
      • rte.en.ie.30771.en-rte.en.ie.30771.doc.cs.aligned (1 kB)
    • czech
      • rt.com.131224.cs (4 kB)
      • bbc.500855.cs (1 kB)
      • bbc.500823.cs (6 kB)
      • rt.com.131229.cs (2 kB)
      • en.ndtv.com.75143.cs (922 B)
      • aj-english.22076.cs (2 kB)
      • voa-en.100681.cs (1 kB)
      • nytimes.268754.cs (3 kB)
      • metro.co.uk.47971.cs (1 kB)
      • cbsnews.377538.cs (4 kB)
      • abcnews.420140.cs (4 kB)
      • en.ndtv.com.75108.cs (2 kB)
      • express.co.uk.44121.cs (1 kB)
      • rte.en.ie.30771.cs (1 kB)
      • foxnews.106537.cs (2 kB)
      • standard.co.uk.57834.cs (3 kB)
      • en.ndtv.com.75114.cs (2 kB)
      • foxnews.106543.cs (3 kB)
      • express.co.uk.44090.cs (2 kB)
      • en.ndtv.com.75106.cs (2 kB)
      • nytimes.268790.cs (4 kB)
      • bbc.500856.cs (1 kB)
      • cbsnews.377408.cs (2 kB)
      • brisbanetimes.com.au.275198.cs (3 kB)
      • dailymail.co.uk.432355.cs (3 kB)
      • seattle_times.253342.cs (1 kB)
      • abcnews.420122.cs (843 B)
      • allafrica.53568.cs (1 kB)
      • standard.co.uk.57768.cs (2 kB)
      • seattle_times.253486.cs (1 kB)
      • en.ndtv.com.75109.cs (2 kB)
      • express.co.uk.44141.cs (524 B)
      • rt.com.131279.cs (1 kB)
      • sky.com.33889.cs (3 kB)
      • abcnews.420184.cs (2 kB)
      • en.ndtv.com.75227.cs (1 kB)
      • allafrica.53572.cs (2 kB)
      • abcnews.420117.cs (1 kB)
      • seattle_times.253356.cs (1 kB)
      • express.co.uk.44069.cs (1 kB)
      • egyptindependent.com.6424.cs (1 kB)
      • independent.597197.cs (1 kB)
      • rt.com.131269.cs (2 kB)
      • bbc.500882.cs (1 kB)
      • thesun.co.uk.114364.cs (1 kB)
    • manually_reviewed
      • metro.co.uk.47971.annotated (2 kB)
      • foxnews.106537.annotated (2 kB)
      • rte.en.ie.30771.annotated (1 kB)
      • rt.com.131279.annotated (1 kB)
      • express.co.uk.44121.annotated (1 kB)
      • express.co.uk.44141.annotated (711 B)
      • seattle_times.253356.annotated (1 kB)
      • en.ndtv.com.75114.annotated (2 kB)
      • rt.com.131224.annotated (4 kB)
      • thesun.co.uk.114364.annotated (2 kB)
      • abcnews.420117.annotated (1 kB)
      • cbsnews.377538.annotated (4 kB)
      • bbc.500855.annotated (1 kB)
      • allafrica.53572.annotated (3 kB)
      • allafrica.53568.annotated (1 kB)
      • rt.com.131229.annotated (2 kB)
      • standard.co.uk.57834.annotated (3 kB)
      • en.ndtv.com.75143.annotated (1 kB)
      • en.ndtv.com.75108.annotated (2 kB)
      • brisbanetimes.com.au.275198.annotated (3 kB)
      • nytimes.268790.annotated (4 kB)
      • seattle_times.253486.annotated (1 kB)
      • en.ndtv.com.75106.annotated (3 kB)
      • bbc.500882.annotated (1 kB)
      • rt.com.131269.annotated (2 kB)
      • abcnews.420122.annotated (1018 B)
      • egyptindependent.com.6424.annotated (1 kB)
      • sky.com.33889.annotated (3 kB)
      • bbc.500856.annotated (1 kB)
      • dailymail.co.uk.432355.annotated (3 kB)
      • abcnews.420184.annotated (2 kB)
      • express.co.uk.44090.annotated (2 kB)
      • independent.597197.annotated (2 kB)
      • en.ndtv.com.75227.annotated (1 kB)
      • bbc.500823.annotated (5 kB)
      • abcnews.420140.annotated (4 kB)
      • en.ndtv.com.75109.annotated (2 kB)
      • voa-en.100681.annotated (1 kB)
      • foxnews.106543.annotated (3 kB)
      • seattle_times.253342.annotated (1 kB)
      • cbsnews.377408.annotated (2 kB)
      • nytimes.268754.annotated (3 kB)
      • aj-english.22076.annotated (2 kB)
      • standard.co.uk.57768.annotated (2 kB)
      • express.co.uk.44069.annotated (1 kB)
Name: README.md
Size: 1.77 KB
Format: Unknown
Description: Dataset Description
MD5: e8add2600a46c1dfc25acb8fe08fc91b
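
Each of the three files in this item is listed with an MD5 checksum, which can be used to confirm that a download is intact. The following is a minimal sketch, assuming the files were saved under their listed names in one local directory. The helper names `md5_of` and `verify` and the default directory are assumptions for illustration, not part of the dataset.

```python
import hashlib
from pathlib import Path

# File names and MD5 values as listed in this record.
EXPECTED_MD5 = {
    "wmt-newstest-2020.zip": "96a5bbc0fc24ce134051ea3816cdbadd",
    "wmt-newstest-2021.zip": "1ca301d5d5480a9b5a8a36ed1d92859e",
    "README.md": "e8add2600a46c1dfc25acb8fe08fc91b",
}


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory="."):
    """Map each expected file name to True if it exists and its MD5 matches."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = path.is_file() and md5_of(path) == expected
    return results


if __name__ == "__main__":
    for name, ok in verify().items():
        print(f"{name}: {'OK' if ok else 'missing or checksum mismatch'}")
```

MD5 is adequate here for detecting corrupted or truncated downloads; it is not a guarantee against deliberate tampering.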
