repetitiveness checker

Please use the following text to cite this item:
Center for Sprogteknologi, University of Copenhagen, 2014, repetitiveness checker, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), http://hdl.handle.net/11372/LRT-292.
Date issued
2014-07-30
Description
1) Finds repeated sequences of words in documents (repetitiveness checker).
2) Finds common sequences of words in several documents (version comparison).

A sequence consists of at least two words. There is no upper limit on the number of words in a sequence, but sequences do not cross sentence boundaries. There are several weight functions to choose from, each defining "good" sequences in a different way, based on word frequency, sequence length and number of repetitions.
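
The behaviour described above can be illustrated with a minimal Python sketch. It is not the tool's actual implementation: the function names (split_sentences, repeated_sequences, weight), the naive punctuation-based sentence splitter, and the example weight function (sequence length times number of repetitions) are all assumptions made for illustration. The tool's own weight functions, which also take word frequency into account, are not reproduced here.

```python
import re
from collections import Counter


def split_sentences(text):
    # Assumption: sentences end at '.', '!' or '?'. The real tool may
    # use a different notion of sentence delimiter.
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def repeated_sequences(text, min_len=2):
    """Count every word sequence of at least min_len words that stays
    inside a single sentence, and keep those occurring at least twice."""
    counts = Counter()
    for sentence in split_sentences(text):
        words = re.findall(r"\w+", sentence.lower())
        n = len(words)
        # Enumerate all contiguous spans of min_len or more words.
        for start in range(n):
            for end in range(start + min_len, n + 1):
                counts[tuple(words[start:end])] += 1
    return {seq: c for seq, c in counts.items() if c >= 2}


def weight(seq, count):
    # Hypothetical weight function: longer and more frequently repeated
    # sequences score higher.
    return len(seq) * count


if __name__ == "__main__":
    text = "The cat sat on the mat. The cat sat on the sofa. A dog barked."
    reps = repeated_sequences(text)
    # List repeated sequences, best-scoring first.
    for seq in sorted(reps, key=lambda s: weight(s, reps[s]), reverse=True):
        print(weight(seq, reps[seq]), " ".join(seq))
```

For version comparison, the same counting could be run per document and the resulting sequence sets intersected. Enumerating all spans is quadratic in sentence length, which is acceptable for a sketch but may be too slow for very long sentences.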