Feature-based tagger

Hajič, Jan

dc.contributor.author	Hajič, Jan
dc.date.accessioned	2011-06-28T09:42:24Z
dc.date.available	2009-11-02T09:22:59Z
dc.date.issued	2009-11-02T09:22:59Z
dc.identifier.uri	http://hdl.handle.net/11858/00-097C-0000-0001-4904-2
dc.description	The Feature-based (exponential model) Tagger is a fast implementation of the Czech tagger developed at UFAL and described in the PDT 1.0 documentation (Czech Language Tagging page). For best results, the tagger requires its input to be preprocessed by a Czech morphological module with very high coverage; this module covers a superset of the Czech "FM" morphology. Both the morphological module and the tagger are supplied as binary executables, together with all necessary precompiled Czech data. Input must be encoded in ISO Latin 2 (ISO-8859-2) and conform to the csts.dtd definition; output uses the same encoding and DTD. As with many of the tools provided with PDT 1.0, both executables also accept, and then produce, a "simplified SGML": not real, valid SGML, but a format that contains at least the tags for words, punctuation, and sentence breaks, one item per line.
dc.publisher	Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
dc.rights	PDT 2.0 License
dc.rights.uri	https://lindat.mff.cuni.cz/repository/xmlui/page/license-pdt2
dc.source.uri	http://ufal.mff.cuni.cz/pdt2.0/doc/tools/machine-annotation/index.html#a-ma-tagging
dc.subject	morphology
dc.subject	tagger
dc.title	Feature-based tagger
dc.type	toolService
dc.rights.label	ACA
has.files	yes
branding	LINDAT / CLARIAH-CZ
demo.uri	http://lindat.mff.cuni.cz/services/morph/
files.size	6843519
files.count	3
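
The description states that input must be in ISO Latin 2 (ISO-8859-2). A minimal sketch of preparing text in that encoding follows. It assumes the source text is available as a Python string (for example, decoded from UTF-8). The helper names `to_latin2` and `unencodable_chars` are hypothetical and are not part of the distributed tools. The sketch covers only the character encoding, not the csts.dtd markup.

```python
def unencodable_chars(text: str) -> list[str]:
    """Return the distinct characters that ISO-8859-2 cannot represent."""
    bad = set()
    for ch in text:
        try:
            ch.encode("iso-8859-2")
        except UnicodeEncodeError:
            bad.add(ch)
    return sorted(bad)


def to_latin2(text: str) -> bytes:
    """Encode text as ISO-8859-2, failing loudly instead of dropping characters."""
    bad = unencodable_chars(text)
    if bad:
        raise ValueError(f"characters not representable in ISO-8859-2: {bad}")
    return text.encode("iso-8859-2")


if __name__ == "__main__":
    sample = "Příliš žluťoučký kůň úpěl ďábelské ódy."
    data = to_latin2(sample)
    # ISO-8859-2 is a single-byte encoding, so one byte per character.
    print(len(sample), len(data))
```

Failing on unrepresentable characters, rather than silently replacing them, is a design choice made here so that the text passed on to the tools is not altered without notice.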