dc.date.accessioned |
2014-07-30T21:35:35Z |
dc.date.available |
2014-07-30T21:35:35Z |
dc.date.issued |
2014-07-30 |
dc.identifier.uri |
http://hdl.handle.net/11372/LRT-1413 |
dc.description |
Text preprocessing service. The input text must be a plain-text file (.txt) encoded in UTF-8.
The service performs three tasks: (i) segmentation of the text into smaller structural units (titles, paragraphs, sentences, etc.); (ii) detection of entities not found in dictionaries (numbers, abbreviations, URLs, emails, proper nouns, etc.); and (iii) grouping of sequences of two or more words into a single block (dates, phrases, proper nouns, etc.). |
dc.publisher |
Institut Universitari de Lingüística Aplicada, Universitat Pompeu Fabra |
dc.title |
iula_preprocess |
dc.type |
toolService |
has.files |
no |
additional.metadata |
Language(s) of input data (field_tool_input_language):Catalan||Spanish
Readily Available (field_tool_available):Readily available
Webservice link (field_tool_webservice_link):http://kurwenal.upf.edu:8080/soaplab2-axis/
Availability (field_tool_availibility):Readily available
Nid:3337
Character encoding of input data (field_tool_char_encoding):Unicode (UTF-8) |
branding |
LRT + Open Submissions |
dc.coverage.placeName |
Spain |
files.size |
0 |
files.count |
0 |
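The three steps listed under dc.description can be illustrated with a small sketch. This is not the iula_preprocess implementation, whose rules are not documented in this record; the regular expressions, the toy multiword list, and every function name below are assumptions made only for illustration.

```python
import re

# Hypothetical patterns; the real service's detection rules are not given in the record.
URL_RE = re.compile(r"https?://\S+")
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
NUMBER_RE = re.compile(r"\d+(?:[.,]\d+)*")

# Toy list of multiword expressions (Spanish), purely illustrative.
MULTIWORDS = [("sin", "embargo"), ("a", "pesar", "de")]


def decode_input(raw: bytes) -> str:
    """Decode the input as UTF-8; raises UnicodeDecodeError otherwise."""
    return raw.decode("utf-8")


def split_paragraphs(text: str) -> list[str]:
    """Step (i), part 1: paragraphs are separated by blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def split_sentences(paragraph: str) -> list[str]:
    """Step (i), part 2: naive split after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph) if s]


def tag_token(token: str) -> str:
    """Step (ii): label tokens that a dictionary lookup would not cover."""
    if URL_RE.fullmatch(token):
        return "URL"
    if EMAIL_RE.fullmatch(token):
        return "EMAIL"
    if NUMBER_RE.fullmatch(token):
        return "NUMBER"
    return "WORD"


def group_multiwords(tokens: list[str]) -> list[str]:
    """Step (iii): join known word sequences into a single block."""
    out, i = [], 0
    while i < len(tokens):
        for mw in MULTIWORDS:
            n = len(mw)
            if tuple(t.lower() for t in tokens[i:i + n]) == mw:
                out.append("_".join(tokens[i:i + n]))
                i += n
                break
        else:
            # No multiword starts here; keep the token as it is.
            out.append(tokens[i])
            i += 1
    return out


def preprocess(raw: bytes) -> list[list[tuple[str, str]]]:
    """Return one list of (token, tag) pairs per sentence."""
    text = decode_input(raw)
    result = []
    for paragraph in split_paragraphs(text):
        for sentence in split_sentences(paragraph):
            tokens = group_multiwords(sentence.rstrip(".!?").split())
            result.append([(t, tag_token(t)) for t in tokens])
    return result
```

Calling `preprocess` on the UTF-8 bytes of a short Spanish text returns one list of tagged tokens per sentence, with a sequence such as "Sin embargo" kept as the single block `Sin_embargo`. The sentence splitter is deliberately naive and would break on abbreviations, which is one reason a real preprocessor needs dedicated abbreviation handling.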