PARSEME Corpora v. 1.3 - Licence Agreement

(2020/10/05)

License Terms

The PARSEME corpora annotated for verbal multiword expressions (version 1.3) are a collection of linguistic data and tools. Each corpus has its own license terms, and you (the “User”) are responsible for complying with the license terms applicable to the parts you use. If you do not agree with the license terms, you must stop using the corpora and destroy all copies of the data that you have obtained.


The license for every corpus included in the release is specified in the appropriate language directory. The licenses for VMWE annotations (column 11) and morphological/syntactic data (columns 1-10) can be different, which is also indicated in the table below. All files in the bin/ and trial/ folders are licensed under CC BY 4.0.
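Because the two groups of columns can fall under different licenses, it may help to separate them before redistributing part of a file. The sketch below is only an illustration: it assumes each token line holds exactly 11 tab-separated fields, and both the function name split_columns and the sample line are hypothetical, not part of the release.

```python
from typing import List, Tuple


def split_columns(line: str) -> Tuple[List[str], str]:
    """Split one token line into (columns 1-10, column 11).

    Assumption: fields are tab-separated and there are exactly 11 of them.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 11:
        raise ValueError(f"expected 11 columns, got {len(fields)}")
    # Columns 1-10: morphological/syntactic data; column 11: VMWE annotation.
    return fields[:10], fields[10]


# Hypothetical token line, made up for illustration only.
sample = "1\tHe\the\tPRON\t_\t_\t2\tnsubj\t_\t_\t*"
morphosyntax, vmwe = split_columns(sample)
print(len(morphosyntax), repr(vmwe))
```

Keeping the two parts in separate variables makes it harder to publish column 11 under its license while accidentally including columns 1-10, which may carry stricter terms.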


Overview of the corpora and their license terms

Language | Language code | VMWEs (column 11) | Morphosyntax (columns 1-10)
Arabic | AR | CC BY 4.0 | CC BY-NC-SA 3.0
Bulgarian | BG | CC BY 4.0 | CC BY 4.0
Czech | CS | CC BY-NC-SA 4.0 | CC BY-NC-SA 4.0
German | DE | CC BY 4.0 | CC BY-NC-SA 3.0 US
Greek | EL | CC BY 4.0 | CC BY-NC-SA 4.0 (UD GDT); GNU GPL 3.0 (remaining corpora)
English | EN | CC BY 4.0 | CC BY-SA 4.0 (UD Original, UD LinES); CC BY-SA 3.0 (UD PUD)
Spanish | ES | CC BY 4.0 | CC BY 4.0 (IXA); GNU GPL 3.0 (Ancora); CC BY-NC-SA 3.0 US (UD)
Basque | EU | CC BY-NC-SA 4.0 | CC BY-NC-SA 4.0
Farsi/Persian | FA | CC BY-NC-SA 4.0 (special license for MULTEXT-East, see README) | CC BY-NC-SA 4.0 (special license for MULTEXT-East, see README)
French | FR | CC BY 4.0 | CC BY-NC-SA 4.0 (UD); LGPL-LR (Sequoia)
Irish | GA | CC BY 3.0 | CC BY-SA 4.0
Hebrew | HE | CC BY-NC-SA 4.0 | CC BY-NC-SA 4.0
Hindi | HI | CC BY-NC-SA 4.0 | CC BY-NC-SA 4.0
Croatian | HR | CC BY-SA 4.0 | CC BY-SA 4.0
Hungarian | HU | CC BY 4.0 | GNU GPL 3.0
Italian | IT | CC BY-NC-SA 4.0 | CC BY-NC-SA 4.0
Lithuanian | LT | CC BY-NC-SA 4.0 | CC BY-NC-SA 4.0
Maltese | MT | CC BY 4.0 | CC BY 4.0
Polish | PL | CC BY 4.0 | GNU GPL 3.0 (columns 1-6 with source_sent_id containing NationalCorpusOfPolish); CC BY-NC 4.0 (columns 1-6 with source_sent_id containing PolishCoreferenceCorpus); CC BY-NC-SA 4.0 (columns 1-6 with source_sent_id containing pdb-ud, and columns 7-9 for all sentences)
Portuguese | PT | CC BY-NC-SA 4.0 | CC BY-NC-SA 4.0
Romanian | RO | CC BY 4.0 | CC BY 4.0
Slovenian | SL | CC BY-SA 4.0 | CC BY-SA 4.0
Serbian | SR | CC BY-NC-SA 4.0 | CC BY-NC-SA 4.0
Swedish | SV | CC BY 4.0 | CC BY-SA 4.0
Turkish | TR | CC BY-NC-SA 4.0 | CC BY-NC-SA 4.0
Chinese | ZH | CC BY-NC-SA 4.0 | CC BY-NC-SA 4.0
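
The Polish row is the only one whose license depends on both the column and the sentence's source_sent_id, so it can be read as a small lookup rule. The sketch below restates that rule under stated assumptions: the function name polish_license and the sample identifier are hypothetical, and the row does not say which license applies to column 10, so the function refuses to guess for that column.

```python
def polish_license(source_sent_id: str, column: int) -> str:
    """Return the license named in the Polish row for one column of one sentence.

    Hypothetical helper; it only restates the rule given in the overview table.
    """
    if column == 11:
        return "CC BY 4.0"  # VMWE annotations
    if 7 <= column <= 9:
        return "CC BY-NC-SA 4.0"  # columns 7-9, all sentences
    if 1 <= column <= 6:
        if "NationalCorpusOfPolish" in source_sent_id:
            return "GNU GPL 3.0"
        if "PolishCoreferenceCorpus" in source_sent_id:
            return "CC BY-NC 4.0"
        if "pdb-ud" in source_sent_id:
            return "CC BY-NC-SA 4.0"
        raise ValueError(f"unrecognised source_sent_id: {source_sent_id!r}")
    # Column 10 is not covered by the Polish row, so no license is inferred.
    raise ValueError(f"no license stated for column {column}")


# Hypothetical identifier, made up for illustration only.
print(polish_license("example/PolishCoreferenceCorpus/doc1 s5", 3))
```

Raising an error for anything the table does not state is deliberate: when the applicable license is unclear, the README in the language directory is the place to check.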

Licenses

License | URL
GNU GPL 3.0 | http://opensource.org/licenses/GPL-3.0
LGPL-LR | http://infolingu.univ-mlv.fr/DonneesLinguistiques/Lexiques-Grammaires/lgpllr.html
CC BY-SA 3.0 | http://creativecommons.org/licenses/by-sa/3.0/
CC BY-NC-SA 3.0 | http://creativecommons.org/licenses/by-nc-sa/3.0/
CC BY-NC-SA 3.0 US | http://creativecommons.org/licenses/by-nc-sa/3.0/us/
CC BY 4.0 | http://creativecommons.org/licenses/by/4.0/
CC BY-SA 4.0 | http://creativecommons.org/licenses/by-sa/4.0/
CC BY-NC 4.0 | http://creativecommons.org/licenses/by-nc/4.0/
CC BY-NC-SA 4.0 | http://creativecommons.org/licenses/by-nc-sa/4.0/