Files in this item

This item is publicly available and licensed under the BSD 3-Clause "New" or "Revised" license.
BSD Attribution Required
Name
chared-1.2.1.tar.gz
Size
23.04 MB
Format
application/x-gzip
Description
chared is a tool for detecting the character encoding of a text in a known language. The language of the text must be specified as an input parameter so that the corresponding language model can be used. The package contains models for a wide range of languages. Because it exploits the known language, it should generally be more accurate than character encoding detection algorithms that have no language constraints.
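The idea of language-constrained detection can be illustrated with a small sketch. This is not chared's implementation or API: the function names, the sample sentence, and the scoring rule (share of byte trigrams also seen in a per-encoding reference) are all assumptions made for illustration, using only the Python standard library.

```python
from collections import Counter


def byte_trigrams(data):
    """Count every run of three consecutive bytes in data."""
    return Counter(data[i:i + 3] for i in range(len(data) - 2))


def build_models(sample_text, encodings):
    """Encode a sample of the known language in each candidate encoding
    and remember which byte trigrams occur (hypothetical toy 'model')."""
    models = {}
    for encoding in encodings:
        try:
            encoded = sample_text.encode(encoding)
        except UnicodeEncodeError:
            continue  # this encoding cannot represent the sample text
        models[encoding] = set(byte_trigrams(encoded))
    return models


def guess_encoding(data, models):
    """Return the candidate whose model covers the largest share of the
    trigrams in data, or None if there is nothing to compare."""
    trigrams = byte_trigrams(data)
    total = sum(trigrams.values())
    if total == 0 or not models:
        return None

    def coverage(encoding):
        known = models[encoding]
        return sum(n for t, n in trigrams.items() if t in known) / total

    return max(models, key=coverage)


# Assumed example: a Czech pangram as the language sample.
CZECH_SAMPLE = "Příliš žluťoučký kůň úpěl ďábelské ódy."
CANDIDATES = ["utf_8", "iso8859_2", "cp1250"]

if __name__ == "__main__":
    models = build_models(CZECH_SAMPLE, CANDIDATES)
    print(guess_encoding("žluťoučký kůň".encode("cp1250"), models))
```

A real language model would be trained on far more text than one sentence; the sketch only shows why knowing the language narrows the choice between encodings that differ in a few byte values.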
MD5
12d042d631b0b30c9540ff9e1a5b7a13
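The MD5 value above can be used to check a downloaded copy of the archive. The following is a minimal sketch using Python's standard hashlib module; the function names and the local file path are assumptions, and the expected digest is the one listed on this page.

```python
import hashlib

# Digest listed on this page for chared-1.2.1.tar.gz.
EXPECTED_MD5 = "12d042d631b0b30c9540ff9e1a5b7a13"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file at path matches the expected digest."""
    return md5_of_file(path) == expected.lower()
```

For example, `verify_download("chared-1.2.1.tar.gz")` would return True for an intact download saved under that name in the current directory.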
 File Preview  
  • chared-1.2.1
    • CHANGES (572 B)
    • chared
      • util
        • encoding.py (1 kB)
        • html2txt.py (2 kB)
        • __init__.py (0 B)
        • trigrams.py (6 kB)
      • detector.py (7 kB)
      • __init__.py (378 B)
      • models
        • czech.edm (883 kB)
        • japanese.edm (4 MB)
        • malayalam.edm (133 kB)
        • catalan.edm (499 kB)
        • arabic.edm (864 kB)
        • korean.edm (19 B)
        • serbian.edm (1 MB)
        • slovak.edm (1 MB)
        • thai.edm (1 MB)
        • hindi.edm (203 kB)
        • persian.edm (794 kB)
        • romanian.edm (691 kB)
        • dutch.edm (323 kB)
        • norwegian_bokmal.edm (362 kB)
        • lithuanian.edm (383 kB)
        • chinese_traditional.edm (12 MB)
        • bulgarian.edm (964 kB)
        • icelandic.edm (521 kB)
        • welsh.edm (256 kB)
        • croatian.edm (591 kB)
        • maltese.edm (235 kB)
        • chinese_simplified.edm (2 MB)
        • hungarian.edm (976 kB)
        • irish.edm (399 kB)
        • estonian.edm (778 kB)
        • portuguese.edm (499 kB)
        • greek.edm (1 MB)
        • tamil.edm (144 kB)
        • bengali.edm (191 kB)
        • italian.edm (243 kB)
        • swedish.edm (426 kB)
        • finnish.edm (355 kB)
        • telugu.edm (194 kB)
        • urdu.edm (1 MB)
        • turkish.edm (845 kB)
        • indonesian.edm (154 kB)
        • ukrainian.edm (1 MB)
        • hebrew.edm (749 kB)
        • polish.edm (521 kB)
        • latvian.edm (472 kB)
        • gujarati.edm (232 kB)
        • malay.edm (146 kB)
        • russian.edm (1 MB)
        • vietnamese.edm (531 kB)
        • english.edm (105 kB)
        • armenian.edm (448 kB)
        • slovene.edm (473 kB)
        • french.edm (353 kB)
        • german.edm (424 kB)
        • spanish.edm (363 kB)
    • MANIFEST.in (98 B)
    • bin
      • chared (3 kB)
      • chared-learn (13 kB)
    • setup.py (1 kB)
    • README (277 B)
    • COPYING (1 kB)
    • PKG-INFO (720 B)
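
The language models in the listing above can also be enumerated directly from a downloaded archive without unpacking it. This is a small sketch using Python's standard tarfile module; the function name is hypothetical, and it assumes only what the listing shows, namely that models are files with the .edm extension inside a directory called models.

```python
import posixpath
import tarfile


def list_models(archive_path):
    """Return the sorted language names of all .edm files that sit in a
    'models' directory inside a gzip-compressed tar archive."""
    languages = []
    with tarfile.open(archive_path, "r:gz") as archive:
        for name in archive.getnames():
            parent = posixpath.basename(posixpath.dirname(name))
            stem, extension = posixpath.splitext(posixpath.basename(name))
            if parent == "models" and extension == ".edm":
                languages.append(stem)
    return sorted(languages)
```

Called as `list_models("chared-1.2.1.tar.gz")`, it would return names such as "czech" or "english", which correspond to the language that has to be specified when running detection.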