The MorphoDiTa web service is available at http(s)://lindat.mff.cuni.cz/services/morphodita/api/.
The web service is freely available for testing. Respect the CC BY-NC-SA licence of the models – explicit written permission of the authors is required for any commercial exploitation of the system. If you use the service, you agree that data obtained by us during such use can be used for further improvements of the systems at UFAL. All comments and reactions are welcome.
The MorphoDiTa REST API can be accessed directly or through any web programming tool that supports standard HTTP request methods and can handle JSON output.
Service Request | Description | HTTP Method |
---|---|---|
models | return list of models and supported methods | GET/POST |
tag | tag supplied text | GET/POST |
analyze | perform morphological analysis of supplied text | GET/POST |
generate | perform morphological generation | GET/POST |
tokenize | tokenize supplied text | GET/POST |
Return the list of models available in the MorphoDiTa REST API and, for each model, enumerate the methods it supports. The default model (used when the user supplies no model to a method call) is also returned; this is guaranteed to be the latest Czech model.
http://lindat.mff.cuni.cz/services/morphodita/api/models
{ "models": { "czech-160310": [ "tag" ,"analyze" ,"generate" ,"tokenize" ] ,"czech-160310-morpho_only": [ "analyze" ,"generate" ,"tokenize" ] } ,"default_model": "czech-160310" }
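As a rough illustration of how a client might consume this response, the Python sketch below parses the sample JSON shown above and lists the models that support a given method. The helper name `models_supporting` is hypothetical and not part of the service; the sample text is copied from the example rather than fetched live.

```python
import json

# Sample response copied from the models example above (not fetched live).
SAMPLE = """{ "models": { "czech-160310": [ "tag", "analyze", "generate", "tokenize" ],
  "czech-160310-morpho_only": [ "analyze", "generate", "tokenize" ] },
  "default_model": "czech-160310" }"""


def models_supporting(response_text, method):
    """Return the sorted names of models whose method list contains `method`.

    Hypothetical helper for illustration only.
    """
    response = json.loads(response_text)
    return sorted(
        name
        for name, methods in response["models"].items()
        if method in methods
    )


taggers = models_supporting(SAMPLE, "tag")
default_model = json.loads(SAMPLE)["default_model"]
```

In this sample only `czech-160310` lists `tag`, which suggests checking the method list before sending a tagging request to a morphology-only model.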
Tag given text as described in the User's Manual. The response format is described later.
Parameter | Mandatory | Data type | Description |
---|---|---|---|
data | yes | string | Input text in UTF-8. |
model | no | string | Model to use; see model selection for model matching rules. |
guesser | no | string (yes / no ) | Use morphological guesser for unknown words; default yes . |
input | no | string (untokenized / vertical ) | Input format to use; default is untokenized . |
convert_tagset | no | string (pdt_to_conll2009 / strip_lemma_comment / strip_lemma_id ) | Apply specified tag set converter. |
derivation | no | string (none / root / path / tree ) | Apply specified morphological derivation to lemmas; default none . |
output | no | string (json / xml / vertical ) | Output format; default is xml . |

Example requests:

http://lindat.mff.cuni.cz/services/morphodita/api/tag?data=Děti pojedou k babičce. Už se těší.

http://lindat.mff.cuni.cz/services/morphodita/api/tag?data=Děti pojedou k babičce. Už se těší.&output=json
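The example URLs above contain raw spaces and non-ASCII characters, which a browser encodes automatically. A script usually has to do that itself. The following sketch shows one way to build such a URL with Python's standard library; `build_url` and `API_ROOT` are names chosen for this illustration. The same helper would serve the other methods, since they take their parameters in the same way.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

API_ROOT = "http://lindat.mff.cuni.cz/services/morphodita/api/"


def build_url(method, **params):
    """Return a GET URL for `method` with percent-encoded query parameters.

    Hypothetical helper; urlencode encodes text as UTF-8 and spaces as '+'.
    """
    return API_ROOT + method + "?" + urlencode(params)


url = build_url("tag", data="Děti pojedou k babičce. Už se těší.", output="json")

# Decoding the query string recovers the original text unchanged.
decoded = parse_qs(urlsplit(url).query)["data"][0]
```

For longer inputs a POST request is probably the safer choice, because URLs have practical length limits.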
Perform morphological analysis of supplied text as described in the User's Manual. The response format is described later.
Parameter | Mandatory | Data type | Description |
---|---|---|---|
data | yes | string | Input text in UTF-8. |
model | no | string | Model to use; see model selection for model matching rules. |
guesser | no | string (yes / no ) | Use morphological guesser for unknown words; default yes . |
input | no | string (untokenized / vertical ) | Input format to use; default is untokenized . |
convert_tagset | no | string (pdt_to_conll2009 / strip_lemma_comment / strip_lemma_id ) | Apply specified tag set converter. |
derivation | no | string (none / root / path / tree ) | Apply specified morphological derivation to lemmas; default none . |
output | no | string (json / xml / vertical ) | Output format; default is xml . |

Example requests:

http://lindat.mff.cuni.cz/services/morphodita/api/analyze?data=Děti pojedou k babičce. Už se těší.

http://lindat.mff.cuni.cz/services/morphodita/api/analyze?data=Děti pojedou k babičce. Už se těší.&convert_tagset=pdt_to_conll2009&output=json
Perform morphological generation as described in the User's Manual. The response format is described later.
Parameter | Mandatory | Data type | Description |
---|---|---|---|
data | yes | string | Input text in UTF-8. |
model | no | string | Model to use; see model selection for model matching rules. |
guesser | no | string (yes / no ) | Use morphological guesser for unknown words; default yes . |
convert_tagset | no | string (pdt_to_conll2009 / strip_lemma_comment / strip_lemma_id ) | Apply specified tag set converter. |
output | no | string (json / vertical ) | Output format; default is vertical . |

Example requests:

http://lindat.mff.cuni.cz/services/morphodita/api/generate?data=dítě%0Ajet%0Ak-1%0Ababička

http://lindat.mff.cuni.cz/services/morphodita/api/generate?data=dítě%0Ajet%0Ak-1%0Ababička&convert_tagset=pdt_to_conll2009&output=json
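In the generate examples above, the lemmas are separated by `%0A`, which is a percent-encoded newline. The sketch below shows how such a `data` value could be assembled from a list of lemmas. It assumes one lemma per line, as the examples suggest; `encode_lemmas` is a hypothetical helper name.

```python
from urllib.parse import quote, unquote


def encode_lemmas(lemmas):
    """Join lemmas with newlines and percent-encode the result as UTF-8.

    Assumes one lemma per line, following the example URLs above.
    """
    return quote("\n".join(lemmas), safe="")


encoded = encode_lemmas(["dítě", "jet", "k-1", "babička"])
url = "http://lindat.mff.cuni.cz/services/morphodita/api/generate?data=" + encoded

# Round trip: decoding yields the newline-separated input again.
original = unquote(encoded)
```

Unlike the browser examples, this also encodes the non-ASCII letters, so the resulting URL is plain ASCII.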
Tokenize the supplied text as described in the User's Manual. The response format is described later.
Parameter | Mandatory | Data type | Description |
---|---|---|---|
data | yes | string | Input text in UTF-8. |
model | no | string | Model to use; see model selection for model matching rules. |
output | no | string (json / xml / vertical ) | Output format; default is xml . |

Example requests:

http://lindat.mff.cuni.cz/services/morphodita/api/tokenize?data=Děti pojedou k babičce. Už se těší.

http://lindat.mff.cuni.cz/services/morphodita/api/tokenize?data=Děti pojedou k babičce. Už se těší.&output=json
The response format of all methods is JSON. Except for the models method, the output JSON has the following structure (with result_object usually being a string or an array):
{ "model": "Model used" ,"acknowledgements": ["URL with acknowledgements", ...] ,"result": result_object }
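A client will typically want to check that a response has this shape before using it. The sketch below is one possible way to unwrap the structure described above; `unwrap` is a hypothetical helper, and the placeholder response is hand-written for illustration, not real service output.

```python
import json

EXPECTED_KEYS = {"model", "acknowledgements", "result"}


def unwrap(response_text):
    """Parse a response and return (model, acknowledgements, result).

    Raises ValueError if any of the three documented members is missing.
    """
    response = json.loads(response_text)
    missing = EXPECTED_KEYS - set(response)
    if missing:
        raise ValueError("missing response members: %s" % sorted(missing))
    return response["model"], response["acknowledgements"], response["result"]


# Hand-written placeholder with the documented shape, NOT real service output.
PLACEHOLDER = json.dumps({
    "model": "czech-160310",
    "acknowledgements": ["http://example.org/acknowledgements"],
    "result": "placeholder result",
})

model, acknowledgements, result = unwrap(PLACEHOLDER)
```

Since `result` may be a string or an array, callers should inspect its type before processing it further.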
There are several ways to select the required model using the model option:
- If the model option is not specified, the default model (returned by the models method) is used; this is guaranteed to be the latest Czech model.
- The model option can specify one of the models returned by the models method.
- The version in -YYMMDD format can be left out when supplying the model option; the latest available model will then be used.
- The model option may consist of only the first several words of a model name. In this case, the latest most suitable model is used.

Note that the last possibility allows using czech or english as models.
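The matching is done by the server, and its exact rules are not spelled out here. Purely to make the four cases concrete, the sketch below shows one possible client-side reading of them. It treats any six-digit word as the version and breaks ties between equally recent candidates by preferring the shorter name. Both choices are assumptions of this sketch, not documented behaviour, and `resolve_model` is a hypothetical name.

```python
import re

VERSION = re.compile(r"\d{6}")  # assumption: a YYMMDD word marks the version


def split_version(name):
    """Split a model name into (name without version word, version or '')."""
    words = name.split("-")
    version = next((w for w in words if VERSION.fullmatch(w)), "")
    base = "-".join(w for w in words if not VERSION.fullmatch(w))
    return base, version


def resolve_model(requested, available, default):
    """Pick a model name for `requested`; a guess at the rules listed above."""
    if not requested:                 # case 1: no model given
        return default
    if requested in available:        # case 2: full model name
        return requested
    words = requested.split("-")
    exact, prefix = [], []
    for name in available:
        base, version = split_version(name)
        if base == requested:         # case 3: version left out
            exact.append((version, name))
        elif base.split("-")[:len(words)] == words:  # case 4: leading words
            prefix.append((version, -len(name), name))
    if exact:
        return max(exact)[1]          # latest version wins
    if prefix:
        return max(prefix)[2]         # latest, then shortest name (assumed)
    return None


AVAILABLE = ["czech-160310", "czech-160310-morpho_only"]
chosen = resolve_model("czech", AVAILABLE, "czech-160310")
```

In practice it is simpler to pass the short name to the service and read the `model` member of the response to see which model was actually used.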
The REST API can be used conveniently from the command line with curl. Several examples follow:
curl --data-urlencode 'data=Děti jedou k babičce. Už se těší.' http://lindat.mff.cuni.cz/services/morphodita/api/tag
curl -F 'data=@input_file' http://lindat.mff.cuni.cz/services/morphodita/api/tag
curl -F 'data=@input_file' -F 'output=vertical' -F 'convert_tagset=strip_lemma_id' http://lindat.mff.cuni.cz/services/morphodita/api/tag
curl -F 'data=@input_file' http://lindat.mff.cuni.cz/services/morphodita/api/tag | PYTHONIOENCODING=utf-8 python -c "import sys,json; sys.stdout.write(json.load(sys.stdin)['result'])"
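For scripts that would rather avoid shelling out, the first curl command can be approximated with Python's standard library. The sketch below sends the parameters as a URL-encoded form body, mirroring `--data-urlencode` rather than the multipart upload done by `-F`. The function names are hypothetical, and no request is sent until `tag` is called.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

TAG_URL = "http://lindat.mff.cuni.cz/services/morphodita/api/tag"


def build_tag_request(text, **options):
    """Build (but do not send) a form-encoded POST request for the tag method."""
    fields = dict(options, data=text)
    body = urlencode(fields).encode("ascii")
    # Supplying a body makes urllib issue a POST instead of a GET.
    return Request(TAG_URL, data=body)


def tag(text, **options):
    """Send the request and return the `result` member of the JSON response."""
    with urlopen(build_tag_request(text, **options)) as response:
        return json.load(response)["result"]


# Building the request performs no network access.
request = build_tag_request("Děti jedou k babičce. Už se těší.", output="vertical")
```

Calling `tag("Děti jedou k babičce. Už se těší.")` would then perform the actual request and return the tagged text, subject to the service being reachable.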