Larasati, Septina Dian 2012-03-13T14:34:36Z 2012-03-13T14:34:36Z 2012-03-13
dc.description IDENTIC is an Indonesian-English parallel corpus for research purposes. It is a bilingual corpus in which Indonesian text is paired with English. The aim of this work is to build a proper Indonesian-English textual data set, provide it to researchers, and promote research in this language pair. The corpus contains texts from different sources and in different genres.
dc.description.sponsorship The research leading to these results has received funding from the European Commission’s 7th Framework Program under grant agreement no. 238405 (CLARA) and from grant LC536 Centrum komputační lingvistiky of the Czech Ministry of Education.
dc.language.iso ind
dc.publisher Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
dc.relation info:eu-repo/grantAgreement/EC/FP7/238405
dc.rights Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0)
dc.subject Indonesian-English parallel corpus
dc.subject parallel corpus
dc.title IDENTICv1.0
dc.type corpus
dc.rights.label PUB
has.files yes
branding LINDAT / CLARIN
sponsor European Union FP7-238405 CLARA (Common Language Resources and their Applications) euFunds info:eu-repo/grantAgreement/EC/FP7/238405
sponsor Ministerstvo školství, mládeže a tělovýchovy České republiky LC536 Centrum komputační lingvistiky nationalFunds
files.size 16615187
files.count 1

 Files in this item

This item is publicly available and licensed under Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0).
15.85 MB
Parallel Corpus
  • IDENTICv1.0
    • en.npp.conll (23 MB)
    • identic.noclitic.npp.txt (7 MB)
    • id.npp.conll (34 MB)
    • identic.tokenized.npp.txt (7 MB)
    • identic.raw.npp.txt (7 MB)
