dc.date.accessioned | 2014-07-30T21:35:35Z |
dc.date.available | 2014-07-30T21:35:35Z |
dc.date.issued | 2014-07-30 |
dc.identifier.uri | http://hdl.handle.net/11372/LRT-1413 |
dc.description | Text preprocessing service. The input must be plain text (.txt file) encoded in UTF-8. The service performs: (i) segmentation of the text into smaller structural units (titles, paragraphs, sentences, etc.); (ii) detection of entities not found in dictionaries (numbers, abbreviations, URLs, emails, proper nouns, etc.); and (iii) grouping of sequences of two or more words into a single block (dates, phrases, proper nouns, etc.). |
dc.publisher | Institut Universitari de Lingüística Aplicada, Universitat Pompeu Fabra |
dc.title | iula_preprocess |
dc.type | toolService |
has.files | no |
additional.metadata | Language(s) of input data (field_tool_input_language): Catalan||Spanish; Readily Available (field_tool_available): Readily available; Webservice link (field_tool_webservice_link): http://kurwenal.upf.edu:8080/soaplab2-axis/; Availability (field_tool_availibility): Readily available; Nid: 3337; Character encoding of input data (field_tool_char_encoding): Unicode (UTF-8) |
branding | LRT + Open Submissions |
dc.coverage.placeName | Spain |
files.size | 0 |
files.count | 0 |
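
The three steps named in dc.description can be illustrated with a minimal Python sketch. This is not the iula_preprocess implementation and does not call the web service: the function names, the regular expressions, and the Spanish date pattern are all assumptions chosen for illustration, and the record does not document the service's actual rules.

```python
import re

# Hypothetical patterns; the real service's rules are not given in this record.
DATE_RE = r"\d{1,2} de [a-zç]+ de \d{4}"       # e.g. "30 de julio de 2014"
URL_RE = r"https?://\S+"
EMAIL_RE = r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"
NUMBER_RE = r"\d+(?:[.,]\d+)*"

# Order matters: multiword dates are tried before plain numbers and words.
TOKEN_RE = re.compile(
    "|".join(
        f"(?P<{name}>{rx})"
        for name, rx in [
            ("date", DATE_RE),
            ("url", URL_RE),
            ("email", EMAIL_RE),
            ("number", NUMBER_RE),
            ("word", r"\w+"),
            ("punct", r"[^\w\s]"),
        ]
    ),
    re.IGNORECASE,
)


def split_paragraphs(text):
    """Step (i), coarse level: treat blank lines as paragraph boundaries."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def split_sentences(paragraph):
    """Step (i), fine level: naive split after '.', '!' or '?' plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph) if s]


def tokenize(sentence):
    """Steps (ii) and (iii): label URLs, emails and numbers, and keep a
    multiword date as a single block. Returns (label, text) pairs."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(sentence)]


if __name__ == "__main__":
    sample = "Visita http://www.upf.edu el 30 de julio de 2014."
    for label, token in tokenize(sample):
        print(label, token, sep="\t")
```

The sentence splitter is deliberately naive: it would break after an abbreviation such as "Sr.", which is one reason a real preprocessor detects abbreviations as part of step (ii).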