dc.contributor.author Németh, László
dc.contributor.author Halácsy, Péter
dc.contributor.author Kornai, András
dc.contributor.other László, Németh
dc.date.accessioned 2014-07-30T21:34:47Z
dc.date.available 2014-07-30T21:34:47Z
dc.date.issued 2014-07-30
dc.identifier.uri http://hdl.handle.net/11372/LRT-1338
dc.description HunToken is a rule-based tokenizer and sentence boundary detector for Hungarian (and English) texts.
dc.publisher Budapest Technical University Media Research Centre
dc.rights GNU Library or "Lesser" General Public License 3.0 (LGPL-3.0)
dc.rights.uri http://opensource.org/licenses/LGPL-3.0
dc.subject tokenizer
dc.title huntoken - tokenizer and sentence splitter
dc.type toolService
dc.rights.label PUB
has.files yes
additional.metadata
  Documentation language(s) (field_tool_documentation_langua): Hungarian
  Language(s) of input data (field_tool_input_language): English, Hungarian
  Implementation language(s) (field_tool_implementation_langu): GNU Flex, C
  Short name (field_tool_short_name): huntoken
  Readily Available (field_tool_available): Readily available
  Availability (field_tool_availibility): Freely available under LGPL licence
  Nid: 2328
  Platform(s) (field_tool_platform): UNIX
  Character encoding of output data (field_tool_char_encoding_output): Latin 1 (ISO 8859-1), Latin 2 (ISO 8859-2)
  Documentation link (field_tool_document_link): http://mokk.bme.hu/resources/huntoken/huntoken.pdf
  Open source code (field_tool_open_source_code): yes
  Language(s) of output data (field_tool_output_language): English, Hungarian
  Character encoding of input data (field_tool_char_encoding): Latin 1 (ISO 8859-1), Latin 2 (ISO 8859-2)
branding LRT + Open Submissions
dc.coverage.placeName Hungary
files.size 419787
files.count 1


 Files in this item

This item is Publicly Available and licensed under: GNU Library or "Lesser" General Public License 3.0 (LGPL-3.0)
Name: huntoken-1.6.tgz
Size: 409.95 KB
Format: application/x-gzip
Description: Huntoken
MD5: f2e24178f2ed18bba994c0ec5e2c7fe4
 File Preview  
  • huntoken-1.6
    • hun_sentence.flex (5 kB)
    • hun_token.flex (68 kB)
    • LEIRAS (1 kB)
    • hun_sentclean.flex (490 B)
    • hun_clean.flex (10 kB)
    • LICENC (17 kB)
    • Makefile (4 kB)
    • CVS
      • Repository (18 B)
      • Entries (571 B)
      • Root (11 B)
    • bin
      • hun_abbrev_en (27 kB)
      • huntoken (1 kB)
      • hun_test (546 B)
      • CVS
        • Repository (22 B)
        • Entries (210 B)
        • Root (11 B)
      • hun_szeged (1 kB)
      • hun_macro (967 B)
      • hun_head (1 kB)
    • hun_latin1.flex (6 kB)
    • EREDMENY (2 kB)
    • token.flex++ (69 kB)
    • test
      • CVS
        • Repository (23 B)
        • Entries (2 B)
        • Root (11 B)
    • hun_abbrev_en.flex.m4 (4 kB)
    • example
      • 1984.xml (779 kB)
      • HOLTLELKEK.sbd (36 kB)
      • HOLTLELKEK.txt (35 kB)
      • CVS
        • Repository (26 B)
        • Entries (143 B)
        • Root (11 B)
      • HOLTLELKEK.xml (119 kB)
    • hun_sentbreak.flex (712 B)
    • data
      • abbrevations.txt (1 kB)
      • abbrev_en.txt (656 B)
      • CVS
        • Repository (23 B)
        • Entries (51 B)
        • Root (11 B)
    • man
      • huntoken.1 (1 kB)
      • CVS
        • Repository (22 B)
        • Entries (45 B)
        • Root (11 B)
    • hun_abbrev.flex.m4 (4 kB)
    • doc
      • huntoken.doc (103 kB)
      • huntoken.sxw (18 kB)
      • CVS
        • Repository (22 B)
        • Entries (92 B)
        • Root (11 B)