Subject: keyword extraction - LINDAT/CLARIAH-CZ Catalog Search Results

Creator:: Libovický, Jindřich
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: service and toolService
Subject:: keyword extraction
Language:: Czech and English
Description:: KER is a keyword extractor designed for scanned texts in Czech and English. It is based on the standard tf-idf algorithm, with the idf tables trained on texts from Wikipedia. To deal with data sparsity, texts are preprocessed by MorphoDiTa, a morphological dictionary and tagger.
Rights:: Apache License 2.0, http://opensource.org/licenses/Apache-2.0, and PUB
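The tf-idf scoring mentioned in the KER description can be illustrated with a minimal Python sketch. This is not KER's actual code: the function names `build_idf` and `top_keywords`, the toy corpus, and the plain `log(N / df)` idf formula are assumptions made for illustration, and the tokens are assumed to be already lemmatized (the step KER delegates to MorphoDiTa).

```python
import math
from collections import Counter


def build_idf(corpus):
    """Build an idf table (term -> log(N / document frequency)).

    `corpus` is a list of documents, each a list of (assumed lemmatized) tokens.
    """
    n_docs = len(corpus)
    doc_freq = Counter()
    for doc in corpus:
        # Count each term once per document.
        doc_freq.update(set(doc))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def top_keywords(tokens, idf, k=3, default_idf=0.0):
    """Return the k terms of one document with the highest tf-idf score.

    Terms missing from the idf table get `default_idf` (a hypothetical choice).
    """
    tf = Counter(tokens)
    total = len(tokens)
    scores = {t: (c / total) * idf.get(t, default_idf) for t, c in tf.items()}
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


if __name__ == "__main__":
    corpus = [["the", "cat", "sat"], ["the", "dog", "ran"], ["the", "cat", "ran"]]
    idf = build_idf(corpus)
    print(top_keywords(["the", "dog", "dog", "ran"], idf, k=2))
```

In this sketch a term that occurs in every document (here "the") gets an idf of zero, so frequent function words drop out of the ranking without a stop-word list.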

Creator:: Çano, Erion
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: keyword extraction and supervised keyword generation
Language:: English
Description:: OAGK is a keyword extraction/generation dataset consisting of 2.2 million abstracts, titles and keyword strings from scientific articles. Texts were lowercased and tokenized with the Stanford CoreNLP tokenizer. No other preprocessing steps were applied in this release version. Dataset records (samples) are stored as JSON lines in each text file. This data is derived from the OAG data collection (https://aminer.org/open-academic-graph), which was released under the ODC-BY licence. This data (OAGK Keyword Generation Dataset) is released under the CC-BY licence (https://creativecommons.org/licenses/by/4.0/). If using it, please cite the following paper: Çano, Erion and Bojar, Ondřej, 2019, Keyphrase Generation: A Text Summarization Struggle, 2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics, June 2019, Minneapolis, USA.
Rights:: Creative Commons - Attribution 4.0 International (CC BY 4.0), http://creativecommons.org/licenses/by/4.0/, and PUB

Creator:: Çano, Erion
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: keyword extraction, supervised keyword generation, and abstractive keyphrasing
Language:: English
Description:: OAGKX is a keyword extraction/generation dataset consisting of 22674436 abstracts, titles and keyword strings from scientific articles. The texts were lowercased and tokenized with the Stanford CoreNLP tokenizer. No other preprocessing steps were applied in this release version. Dataset records (samples) are stored as JSON lines in each text file. The data is derived from the OAG data collection (https://aminer.org/open-academic-graph), which was released under the ODC-BY license. This data (OAGKX Keyword Generation Dataset) is released under the CC-BY license (https://creativecommons.org/licenses/by/4.0/). If using it, please cite the following paper: Çano Erion, Bojar Ondřej. Keyphrase Generation: A Multi-Aspect Survey. FRUCT 2019, Proceedings of the 25th Conference of the Open Innovations Association FRUCT, Helsinki, Finland, Nov. 2019. To reproduce the experiments in the above paper, you can use the first 100000 lines of the part_0_0.txt file.
Rights:: Creative Commons - Attribution 4.0 International (CC BY 4.0), http://creativecommons.org/licenses/by/4.0/, and PUB
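Because the OAGK and OAGKX records are stored as JSON lines, the first 100000 lines of a file such as part_0_0.txt can be read with the Python standard library alone. The sketch below is illustrative: the helper name `read_records` is hypothetical, and since the field names inside each record are not stated in the catalog entry, the code does not assume any.

```python
import json
from itertools import islice


def read_records(path, limit=100000):
    """Yield parsed JSON records from the first `limit` lines of a JSON-lines file."""
    with open(path, encoding="utf-8") as handle:
        # islice stops after `limit` lines without loading the whole file.
        for line in islice(handle, limit):
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


if __name__ == "__main__":
    # "part_0_0.txt" is the file named in the dataset description;
    # it must be downloaded separately for this to run.
    for record in islice(read_records("part_0_0.txt"), 3):
        # Print the available keys rather than assuming field names.
        print(sorted(record.keys()))
```

Reading lazily with a generator keeps memory use flat, which matters for a collection of more than 22 million records.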

Creator:: Chen, Chien-Hsing and Hsu, Chung-Chian
Format:: unmediated and volume
Type:: model:article and TEXT
Subject:: Clinical diagnosis record, ICD-9 code, keyword extraction, ViSOM, and SOM
Language:: English
Description:: Hospitals must index each case of inpatient medical care with codes from the International Classification of Diseases, 9th Revision (ICD-9), under regulations from the Bureau of National Health Insurance. This paper aims to investigate the analysis of free-textual clinical medical diagnosis documents with ICD-9 codes using state-of-the-art techniques from text and visual mining fields. In this paper, ViSOM and SOM approaches inspire several analyses of clinical diagnosis records with ICD-9 codes. ViSOM and SOM are also used to obtain interesting patterns that have not been discovered with traditional, nonvisual approaches. Furthermore, we addressed three principles that can be used to help clinical doctors analyze diagnosis records effectively using the ViSOM and SOM approaches. The experiments were conducted using real diagnosis records and show that ViSOM and SOM are helpful for organizational decision-making activities.
Rights:: http://creativecommons.org/publicdomain/mark/1.0/ and policy:public
