IDENTICv1.0

Please use the following text to cite this item:
Larasati, Septina Dian, 2012, IDENTICv1.0, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), http://hdl.handle.net/11858/00-097C-0000-0005-BF85-F.
Date issued
2012-03-13
Language(s)
Description
IDENTIC is an Indonesian-English parallel corpus for research purposes. The aim of this work is to build and provide researchers with a proper Indonesian-English textual data set and to promote research on this language pair. The corpus contains texts from different sources and of different genres.
Acknowledgement
This item is Publicly Available
and licensed under:
 Files in this item
Name
IDENTICv1.0.zip
Size
15.85 MB
Format
application/zip
Description
Parallel Corpus
MD5
1d4f2df374b1a04c4616f80b0e158bec
Preview
  File Preview
  • IDENTICv1.0
    • en.npp.conll (23 MB)
    • identic.noclitic.npp.txt (7 MB)
    • id.npp.conll (34 MB)
    • identic.tokenized.npp.txt (7 MB)
    • identic.raw.npp.txt (7 MB)
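
The MD5 value listed above can be used to check that the archive downloaded intact. The following is a minimal sketch in Python, assuming the file was saved as IDENTICv1.0.zip in the current directory; the function names md5_of_file and verify_archive are illustrative and not part of the repository.

```python
import hashlib
from pathlib import Path

# MD5 checksum as listed in the item record.
EXPECTED_MD5 = "1d4f2df374b1a04c4616f80b0e158bec"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local file name; adjust to wherever the archive was saved.
    archive = Path("IDENTICv1.0.zip")
    if archive.exists():
        print("OK" if verify_archive(archive) else "MD5 mismatch")
    else:
        print(f"{archive} not found")
```

A mismatch usually indicates an incomplete or corrupted download, in which case fetching the archive again is the simplest remedy.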