
 
dc.contributor.author Larasati, Septina Dian
dc.date.accessioned 2012-03-13T14:34:36Z
dc.date.available 2012-03-13T14:34:36Z
dc.date.issued 2012-03-13
dc.identifier.uri http://hdl.handle.net/11858/00-097C-0000-0005-BF85-F
dc.description IDENTIC is an Indonesian-English parallel corpus for research purposes. The corpus is bilingual, with Indonesian text paired with English. The aim of this work is to build a proper Indonesian-English textual data set, provide it to researchers, and promote research on this language pair. The corpus contains texts from different sources and of different genres.
dc.description.sponsorship The research leading to these results has received funding from the European Commission’s 7th Framework Program under grant agreement no. 238405 (CLARA) and from grant LC536 Centrum komputační lingvistiky of the Czech Ministry of Education.
dc.language.iso ind
dc.language.iso eng
dc.publisher Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
dc.relation info:eu-repo/grantAgreement/EC/FP7/238405
dc.rights Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0)
dc.rights.uri http://creativecommons.org/licenses/by-nc-sa/3.0/
dc.subject Indonesian-English parallel corpus
dc.subject parallel corpus
dc.title IDENTICv1.0
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
dc.rights.label PUB
has.files yes
branding LINDAT / CLARIAH-CZ
contact.person Septina Dian Larasati septina.larasati@gmail.com Charles University in Prague, UFAL
sponsor European Union FP7-238405 CLARA (Common Language Resources and their Applications) euFunds info:eu-repo/grantAgreement/EC/FP7/238405
sponsor Ministerstvo školství, mládeže a tělovýchovy České republiky LC536 Centrum komputační lingvistiky nationalFunds
files.size 16615187
files.count 1


 Files in this item

This item is Publicly Available and licensed under:
Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0)
Distributed under Creative Commons: Attribution Required, Noncommercial, Share Alike
Name: IDENTICv1.0.zip
Size: 15.85 MB
Format: application/zip
Description: Parallel Corpus
MD5: 1d4f2df374b1a04c4616f80b0e158bec
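
The MD5 value and the files.size field listed in this record can be used to check that a downloaded copy of the archive is intact. The following is a minimal sketch, not part of the original record: the function names `md5_of_file` and `verify_download` are hypothetical, and the local path `IDENTICv1.0.zip` assumes the archive was saved under its listed name in the current directory.

```python
import hashlib
from pathlib import Path

# Values copied from this record (MD5 and files.size).
EXPECTED_MD5 = "1d4f2df374b1a04c4616f80b0e158bec"
EXPECTED_SIZE = 16615187  # bytes, about 15.85 MB when divided by 2**20


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5=EXPECTED_MD5, expected_size=EXPECTED_SIZE):
    """Return True only if both the byte size and the MD5 digest match."""
    path = Path(path)
    if path.stat().st_size != expected_size:
        return False
    return md5_of_file(path) == expected_md5.lower()


if __name__ == "__main__":
    archive = Path("IDENTICv1.0.zip")  # assumed local download location
    if archive.exists():
        print("OK" if verify_download(archive) else "MISMATCH")
    else:
        print(f"{archive} not found")
```

The size check runs first because it is cheap and catches truncated downloads before the file is hashed. MD5 is used here only because it is the checksum this record publishes; it detects accidental corruption, not deliberate tampering.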
 File Preview  
  • IDENTICv1.0
    • en.npp.conll (23 MB)
    • identic.noclitic.npp.txt (7 MB)
    • id.npp.conll (34 MB)
    • identic.tokenized.npp.txt (7 MB)
    • identic.raw.npp.txt (7 MB)
