repetitiveness checker
- Authors
- Jongejan, Bart
- Identifier
- http://hdl.handle.net/11372/LRT-292
- Project URL
- http://cst.dk/tools/
- Date issued
- 2014-07-30
- Type
- toolService
- Description
- The tool has two functions: 1) it finds repeated sequences of words within a document (repetitiveness checker); 2) it finds sequences of words shared by several documents (version comparison). A sequence consists of at least two words. There is no upper limit on the number of words in a sequence, but sequences do not cross sentence boundaries. Several weight functions are available, each defining "good" sequences differently, based on word frequency, sequence length, and number of repetitions.
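
The idea behind the repetitiveness check can be illustrated with a minimal Python sketch. This is not the tool's implementation: the function names, the naive sentence splitter, and the weight formula are all assumptions made for illustration. The sketch only shows sequences of two or more words being counted within sentence boundaries and then ranked by one possible weight.

```python
import re
from collections import Counter


def sentences(text):
    # Naive split on '.', '!' and '?'. The real tool's sentence
    # delimiter rules are not described in this record (assumption).
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def repeated_sequences(text, min_len=2):
    """Count every word sequence of at least min_len words that stays
    inside one sentence, and keep those occurring at least twice."""
    counts = Counter()
    for sent in sentences(text):
        words = re.findall(r"\w+", sent.lower())
        for i in range(len(words)):
            # No upper limit on length other than the end of the sentence.
            for j in range(i + min_len, len(words) + 1):
                counts[tuple(words[i:j])] += 1
    return {seq: n for seq, n in counts.items() if n >= 2}


def weight(seq, n):
    # Hypothetical weight: longer and more often repeated is "better".
    # The tool's actual weight functions (which also use word
    # frequency) are not specified here.
    return len(seq) * (n - 1)


if __name__ == "__main__":
    text = "The cat sat on the mat. The cat sat on the sofa. A dog barked."
    reps = repeated_sequences(text)
    for seq in sorted(reps, key=lambda s: weight(s, reps[s]), reverse=True)[:3]:
        print(" ".join(seq), reps[seq], weight(seq, reps[seq]))
```

Version comparison could be sketched along the same lines by counting sequences per document and keeping those that appear in more than one document. The enumeration above is quadratic in sentence length, which is acceptable for a sketch but not necessarily how the tool works.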