dc.contributor.author | Larasati, Septina Dian |
dc.date.accessioned | 2012-03-13T14:34:36Z |
dc.date.available | 2012-03-13T14:34:36Z |
dc.date.issued | 2012-03-13 |
dc.identifier.uri | http://hdl.handle.net/11858/00-097C-0000-0005-BF85-F |
dc.description | IDENTIC is an Indonesian-English parallel corpus for research purposes. The corpus is bilingual, pairing Indonesian text with English. The aim of this work is to build and provide researchers with a proper Indonesian-English textual data set and to promote research on this language pair. The corpus contains texts from different sources and in different genres. |
dc.description.sponsorship | The research leading to these results has received funding from the European Commission’s 7th Framework Program under grant agreement no. 238405 (CLARA) and from grant LC536 Centrum Komputacni Lingvistiky of the Czech Ministry of Education. |
dc.language.iso | ind |
dc.language.iso | eng |
dc.publisher | Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) |
dc.relation | info:eu-repo/grantAgreement/EC/FP7/238405 |
dc.rights | Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0) |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/3.0/ |
dc.subject | Indonesian-English parallel corpus |
dc.subject | parallel corpus |
dc.title | IDENTICv1.0 |
dc.type | corpus |
metashare.ResourceInfo#ContentInfo.mediaType | text |
dc.rights.label | PUB |
has.files | yes |
branding | LINDAT / CLARIAH-CZ |
contact.person | Septina Dian Larasati septina.larasati@gmail.com Charles University in Prague, UFAL |
sponsor | European Union FP7-238405 CLARA (Common Language Resources and their Applications) euFunds info:eu-repo/grantAgreement/EC/FP7/238405 |
sponsor | Ministerstvo školství, mládeže a tělovýchovy České republiky LC536 Centrum komputační lingvistiky nationalFunds |
files.size | 16615187 |
files.count | 1 |
Files in this item
License category:
License: Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0)
Publicly Available
- Name: IDENTICv1.0.zip
- Size: 15.85 MB
- Format: application/zip
- Description: Parallel Corpus
- MD5: 1d4f2df374b1a04c4616f80b0e158bec
- IDENTICv1.0
  - en.npp.conll (23 MB)
  - identic.noclitic.npp.txt (7 MB)
  - id.npp.conll (34 MB)
  - identic.tokenized.npp.txt (7 MB)
  - identic.raw.npp.txt (7 MB)
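
The record lists an MD5 digest and a byte count (files.size) for the archive, so a downloaded copy can be checked against them. The following is a minimal Python sketch of such a check, not part of the corpus distribution. The function names `md5_of_file` and `verify_download` and the local path `IDENTICv1.0.zip` are assumptions made for illustration; only the digest and the size come from this record.

```python
import hashlib
import os

# Values copied from this record (MD5 and files.size).
EXPECTED_MD5 = "1d4f2df374b1a04c4616f80b0e158bec"
EXPECTED_SIZE = 16615187  # bytes


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5=EXPECTED_MD5, expected_size=EXPECTED_SIZE):
    """Return True only if both the byte size and the MD5 digest match."""
    if os.path.getsize(path) != expected_size:
        return False
    return md5_of_file(path) == expected_md5.lower()


if __name__ == "__main__":
    archive = "IDENTICv1.0.zip"  # hypothetical local path to the download
    if os.path.exists(archive):
        print("OK" if verify_download(archive) else "MISMATCH")
    else:
        print(f"{archive} not found")
```

The size is compared first because it is cheap; the digest is computed only when the size already matches. MD5 is adequate for detecting a corrupted or truncated download, but it is not a defence against deliberate tampering.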