Feature-based tagger
Please use the following text to cite this item:
Hajič, Jan, 2009,
Feature-based tagger, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11858/00-097C-0000-0001-4904-2.
Authors
Hajič, Jan
Item identifier
http://hdl.handle.net/11858/00-097C-0000-0001-4904-2
Date issued
2009-11-02T09:22:59Z
Description
The Feature-based (exponential model) Tagger is a fast implementation of the Czech tagger developed at UFAL and described in the PDT 1.0 documentation (Czech Language Tagging page). For best results, the tagger requires its input to be preprocessed by a Czech morphological module with very high coverage; this module covers a superset of the Czech "FM" morphology. Both the morphological module and the tagger are supplied as binary executables, together with all necessary precompiled Czech data. Input must be encoded in ISO Latin 2 (ISO-8859-2) and conform to the csts.dtd definition; output is produced in the same encoding and follows the same DTD. As with many of the tools provided with PDT 1.0, both executables also accept, and then produce, a "simplified SGML": this is not real, valid SGML, but a format that contains at least the tags for words, punctuation, and sentence breaks, one item per line.
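Because the executables expect ISO-8859-2 input, text held in another encoding (for example UTF-8) has to be converted first. The following is a minimal Python sketch of that conversion step only; the helper name `to_latin2` and the sample sentence are illustrative assumptions and are not part of the distributed package, and the sketch does not produce csts.dtd markup.

```python
def to_latin2(text: str) -> bytes:
    """Encode text as ISO-8859-2 (ISO Latin 2).

    Hypothetical helper for illustration: raises ValueError naming the
    first character that has no ISO-8859-2 representation.
    """
    try:
        return text.encode("iso-8859-2")
    except UnicodeEncodeError as exc:
        bad = text[exc.start:exc.end]
        raise ValueError(
            f"character {bad!r} at position {exc.start} is not in ISO-8859-2"
        ) from exc


# Sample Czech sentence (an assumption, chosen to exercise diacritics).
sample = "Příliš žluťoučký kůň úpěl ďábelské ódy."
encoded = to_latin2(sample)
```

ISO-8859-2 is a single-byte encoding, so `encoded` has exactly one byte per character of `sample`. Characters outside the Latin 2 repertoire, such as the euro sign, make the helper fail loudly rather than silently writing corrupted input.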
Files in this item
- Name: CZ010619x.tgz
- Size: 2.04 MB
- Format: application/x-gzip
- Description: PDT 1.0 version - Linux
- MD5: 0c0b1bafc4080ea9edc15873dd7791db

- Name: CZ130120ax.tgz
- Size: 2.43 MB
- Format: application/x-gzip
- Description: Newer version - Linux
- MD5: 94f0873a87129e64be41c14356cd0948

- Name: CZ010619xs.tgz
- Size: 2.06 MB
- Format: application/x-gzip
- Description: PDT 1.0 version - Solaris
- MD5: c2b38cc100a6981ea08bfd830c90e319
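
The MD5 values listed for the archives can be used to check a download for corruption. The following Python sketch shows one way to do that with the standard `hashlib` module; the function names `md5_of_file` and `verify`, and the assumption that the archive has been saved locally under some path, are illustrative and not part of the item itself.

```python
import hashlib

# Checksums copied from the file listing for this item.
EXPECTED_MD5 = {
    "CZ010619x.tgz": "0c0b1bafc4080ea9edc15873dd7791db",
    "CZ130120ax.tgz": "94f0873a87129e64be41c14356cd0948",
    "CZ010619xs.tgz": "c2b38cc100a6981ea08bfd830c90e319",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, name):
    """Return True if the file at `path` matches the listed MD5 for `name`."""
    return md5_of_file(path) == EXPECTED_MD5[name]
```

For example, `verify("CZ130120ax.tgz", "CZ130120ax.tgz")` would return `True` for an intact copy of the newer Linux archive saved in the current directory. MD5 detects accidental corruption only; it is not a safeguard against deliberate tampering.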


