SELEXINI corpus
Please use the following text to cite this item:
Scholivet, Manon; Savary, Agata; Estève, Louis Clément; Candito, Marie and Ramisch, Carlos, 2024,
SELEXINI corpus, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11372/LRT-5822.
Authors
Scholivet, Manon; Savary, Agata; Estève, Louis Clément; Candito, Marie; Ramisch, Carlos
Item identifier
http://hdl.handle.net/11372/LRT-5822
Date issued
2024-12-12
Size
3000000000 tokens
Language(s)
French
Description
SELEXINI is a large, automatically annotated corpus of French. It is divided into two parts: the first is drawn from BigScience and the second from HPLT. The HPLT documents chosen for annotation were selected to optimise the lexical diversity of the final SELEXINI corpus.
Files in this item
- Name: bigscience_FTB-dep.tar.gz
- Size: 10.69 GB
- Format: application/x-gzip
- Description: gzip Archive
- MD5: 52e499d41a83ca820585efb888b5524a

- Name: bigscience_UD.tar.gz
- Size: 15.03 GB
- Format: application/x-gzip
- Description: gzip Archive
- MD5: 11dae92039406f4190045609a233aad6

- Name: hplt_FTB-dep.tar.gz
- Size: 11.68 GB
- Format: application/x-gzip
- Description: gzip Archive
- MD5: 547801aae560d9cfdbde76a692fb7543

- Name: hplt_UD.tar.gz
- Size: 16.36 GB
- Format: application/x-gzip
- Description: gzip Archive
- MD5: 1ba2fa0de3166f404b2bd0aa8a21b857
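
Because each archive is larger than 10 GB, it may be worth checking a download against the MD5 digests listed above before unpacking it. The following is a minimal sketch in Python, not an official tool for this item: the helper names `md5_of` and `verify_downloads` are hypothetical, and the script assumes the archives were saved under their listed names in a single directory.

```python
import hashlib
from pathlib import Path

# MD5 digests copied from the file listing for this item.
EXPECTED_MD5 = {
    "bigscience_FTB-dep.tar.gz": "52e499d41a83ca820585efb888b5524a",
    "bigscience_UD.tar.gz": "11dae92039406f4190045609a233aad6",
    "hplt_FTB-dep.tar.gz": "547801aae560d9cfdbde76a692fb7543",
    "hplt_UD.tar.gz": "1ba2fa0de3166f404b2bd0aa8a21b857",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each listed archive to True/False (digest match) or None (file not found).

    Assumption: the archives keep their listed file names inside `directory`.
    """
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = (md5_of(path) == expected) if path.is_file() else None
    return results


if __name__ == "__main__":
    for name, status in verify_downloads().items():
        label = {True: "OK", False: "MISMATCH", None: "not found"}[status]
        print(f"{name}: {label}")
```

Run from the download directory, the script prints one status line per archive. A mismatch usually points to an incomplete or corrupted transfer, in which case the file should be downloaded again.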


