KER - Keyword Extractor

Please use the following text to cite this item:
Libovický, Jindřich, 2016, KER - Keyword Extractor, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), http://hdl.handle.net/11234/1-1650.
Date issued
2016-02-22
Language(s)
Description
KER is a keyword extractor designed for scanned texts in Czech and English. It is based on the standard tf-idf algorithm, with the idf tables trained on texts from Wikipedia. To deal with data sparsity, texts are preprocessed by MorphoDiTa, a morphological dictionary and tagger.
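The tf-idf scoring mentioned above can be illustrated with a minimal sketch. This is not the code shipped in keywords.py: the function names `build_idf_table` and `extract_keywords`, the toy corpus, and the plain log(N / df) weighting are all assumptions made for illustration. The tokens are assumed to be already lemmatized, which is the role MorphoDiTa plays in KER, and the in-memory dictionary stands in for the Wikipedia-trained idf tables, whose actual pickle layout is not described here.

```python
import math
from collections import Counter


def build_idf_table(documents):
    """Compute idf = log(N / df) for every term in a list of tokenized documents.

    Hypothetical helper: the released idf tables were trained on Wikipedia,
    and their exact weighting scheme is not stated in this record.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        # Count each term once per document, regardless of repetitions.
        doc_freq.update(set(tokens))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def extract_keywords(tokens, idf_table, top_n=3, default_idf=0.0):
    """Rank the terms of one document by tf * idf and return the top_n terms.

    Terms missing from the table receive default_idf (an assumption; a real
    system might instead treat unseen terms as rare and score them highly).
    """
    counts = Counter(tokens)
    total = len(tokens)
    scores = {
        term: (count / total) * idf_table.get(term, default_idf)
        for term, count in counts.items()
    }
    # Sort by descending score, breaking ties alphabetically for stable output.
    return sorted(scores, key=lambda term: (-scores[term], term))[:top_n]


# Toy corpus of already-lemmatized documents (illustrative data only).
corpus = [
    ["the", "cat", "sat"],
    ["the", "dog", "ran"],
    ["the", "cat", "ran"],
]
idf = build_idf_table(corpus)
keywords = extract_keywords(["the", "dog", "dog", "ran"], idf, top_n=2)
```

A term that occurs in every document, such as "the" in the toy corpus, gets an idf of zero and so never ranks as a keyword, while a term that is frequent in one document but rare in the corpus ranks highest.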
This item is Publicly Available
and licensed under:
 Files in this item
Name
ker-1.0.0.tar.gz
Size
10.51 KB
Format
application/x-gzip
Description
Archive with the release sources
MD5
113db6ff955a1c5cb43f33ac7e3d62bf
Preview
  • ker-1.0.0
    • README.md (49 B)
    • .gitignore (67 B)
    • .gitmodules (0 B)
    • prepare_venv.sh (137 B)
    • prepare_idf_table.py (2 kB)
    • keywords.py (5 kB)
    • server.py (8 kB)
    • LICENSE (7 kB)
    • web.html (8 kB)
Name
cs_idf_table.pickle
Size
22.16 MB
Format
application/octet-stream
Description
IDF model for Czech
MD5
07ada26258f3f5be28ef82b41c7324e0
Name
en_idf_table.pickle
Size
3.82 MB
Format
application/octet-stream
Description
IDF model for English
MD5
cfd4ba647032a22c35d1fda736046e00
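
The MD5 values listed for each file can be used to check a download for corruption. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `verify` and the `EXPECTED_MD5` dictionary are hypothetical, and the file names are assumed to be paths in the current directory.

```python
import hashlib

# Checksums copied from the file listing in this record.
EXPECTED_MD5 = {
    "ker-1.0.0.tar.gz": "113db6ff955a1c5cb43f33ac7e3d62bf",
    "cs_idf_table.pickle": "07ada26258f3f5be28ef82b41c7324e0",
    "en_idf_table.pickle": "cfd4ba647032a22c35d1fda736046e00",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_md5):
    """Return True if the file at path has the expected MD5 digest."""
    return md5_of_file(path) == expected_md5.lower()
```

For example, `verify("en_idf_table.pickle", EXPECTED_MD5["en_idf_table.pickle"])` should return True for an intact download. MD5 guards against accidental corruption only; it is not a defence against deliberate tampering, and pickle files should be loaded only from a trusted source.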