Czech SubLex 1.0

Please use the following text to cite this item:
Veselovská, Kateřina and Bojar, Ondřej, 2013, Czech SubLex 1.0, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), http://hdl.handle.net/11858/00-097C-0000-0022-FF60-B.
Date issued
2013-12-02
Size
207 KB
Language(s)
Description
A Czech subjectivity lexicon, i.e. a list of subjectivity clues for sentiment analysis in Czech. The list contains 4626 evaluative items (1672 positive and 2954 negative) together with their part-of-speech tags, polarity orientation and source information. The core of the lexicon was obtained by automatically translating a freely available English subjectivity lexicon downloaded from http://www.cs.pitt.edu/mpqa/subj_lexicon.html. For the translation into Czech, we used the parallel corpus CzEng 1.0, which contains 15 million parallel sentences (233 million English and 206 million Czech tokens) from seven different types of sources, automatically annotated at the surface and deep layers of syntactic representation. The lexicon was then manually refined by an experienced annotator.
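The description above says each entry carries a polarity orientation, so a quick sanity check after download is to tally entries per polarity label. The following is only a minimal sketch: this page does not document the file layout, so the tab delimiter, the column position and the label strings POS and NEG are assumptions, and the sample rows are made up for illustration and are not taken from the lexicon. The README shipped with the item should be consulted for the real layout.

```python
import csv
import io
from collections import Counter


def tally_polarity(lines, polarity_col, delimiter="\t"):
    """Count entries per polarity label.

    `lines` is any iterable of text lines (an open file works too).
    `polarity_col` and `delimiter` are assumptions about the layout;
    check the README of the item before relying on them.
    """
    reader = csv.reader(lines, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    counts = Counter()
    for row in reader:
        # Skip blank lines, comment lines and rows that are too short.
        if not row or row[0].startswith("#") or len(row) <= polarity_col:
            continue
        counts[row[polarity_col].strip()] += 1
    return counts


# Hypothetical rows (lemma, part of speech, polarity, source),
# invented for this sketch only.
sample = io.StringIO(
    "dobrý\tA\tPOS\tmanual\n"
    "špatný\tA\tNEG\tmanual\n"
    "hrozný\tA\tNEG\tmanual\n"
)
sample_counts = tally_polarity(sample, polarity_col=2)
print(sample_counts)
```

If the assumed layout matched the real file, running the same function on sublex_1_0.csv (opened with encoding="utf-8") should give totals in line with the 1672 positive and 2954 negative items stated in the description.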
Acknowledgement
This item is Publicly Available
and licensed under:
 Files in this item
Name
Czech subjectivity lexicon.pdf
Size
165.89 KB
Format
application/pdf
Description
Czech SubLex article
MD5
946d3ea2a7bbcc0c5c2261c7286f2031
Name
sublex_1_0.csv
Size
206.26 KB
Format
application/octet-stream
Description
Czech SubLex 1.0
MD5
d2655e8b46d2a192f094d492c8d4ffdb
Name
README
Size
745 B
Format
application/octet-stream
Description
CSV README
MD5
ad9f39db98ec16c0bc6e3ad31040d657
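The MD5 values listed for the three files can be used to check that a download arrived intact. Below is a minimal sketch using Python's standard hashlib module; the file names and digests are copied from this page, while the function names and the assumption that the files sit in one local directory are illustrative only.

```python
import hashlib
from pathlib import Path

# MD5 digests as listed on this page.
EXPECTED_MD5 = {
    "Czech subjectivity lexicon.pdf": "946d3ea2a7bbcc0c5c2261c7286f2031",
    "sublex_1_0.csv": "d2655e8b46d2a192f094d492c8d4ffdb",
    "README": "ad9f39db98ec16c0bc6e3ad31040d657",
}


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each file name to True (match), False (mismatch) or None (absent)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = (md5_of_file(path) == expected) if path.is_file() else None
    return results


if __name__ == "__main__":
    labels = {True: "OK", False: "MISMATCH", None: "not found"}
    for name, ok in verify_downloads().items():
        print(f"{name}: {labels[ok]}")
```

MD5 is adequate here for detecting corrupted or truncated downloads; it is not a defence against deliberate tampering.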