The electronic version of the book “Corpus PAAU 1992: Descriptive Studies, Texts and Vocabulary” includes the texts that have been object of analysis in this project as well as the vocabulary lists that make up the Corpus 92.
domain specific corpus (Law, Economy, Computing, Medicine and Environment as well as a contrastive corpus from the press); EN 3.3 M tokens, SP 33 M tokens, CAT 19 M tokens; EAGLEs pos tagset
"A scholarly edition of Aquinas's Opera omnia, with a lexical database, a dictionary, two collection of historical sources, and an extensive bibliography."
This SOAP service implements the IMS Open Corpus Workbench (CWB), a collection of open-source tools for managing and querying large text corpora (ranging from 10 million to 2 billion words) with linguistic annotations. Its central component is the flexible and efficient query processor CQP. The service makes it possible to index a new corpus and query it.
Software for corpus linguists and text/data mining enthusiasts. The CorpusExplorer combines over 45 interactive visualizations under a user-friendly interface. Routine tasks such as text acquisition, cleaning or tagging are completely automated. The simple interface supports the use in university teaching and leads users/students to fast and substantial results. The CorpusExplorer is open for many standards (XML, CSV, JSON, R, etc.) and also offers its own software development kit (SDK).
Source code available at https://github.com/notesjor/corpusexplorer2.0