NameTag

The NameTag web service is available at http(s)://lindat.mff.cuni.cz/services/nametag/api/.

The web service is freely available. Respect the CC BY-NC-SA licence of the models – explicit written permission of the authors is required for any commercial exploitation of the system. If you use the service, you agree that data obtained by us during such use can be used for further improvements of the systems at UFAL. All comments and reactions are welcome.

API Reference

The NameTag REST API can be accessed directly or via any web programming tool that supports standard HTTP request methods and can handle JSON output.

Service Request  Description                                  HTTP Method
models           return list of models and supported methods  GET/POST
recognize        recognize named entities                     GET/POST
tokenize         tokenize supplied text                       GET/POST

Method models

Return the list of models available in the NameTag REST API and, for each model, enumerate the methods it supports. The default model (used when the user supplies no model to a method call) is also returned; this is guaranteed to be the latest Czech model.

Browser Example

http://lindat.mff.cuni.cz/services/nametag/api/models

Example JSON Response

{
 "models": {
  "czech-140205-cnec2.0": [
    "recognize"
   ,"tokenize"
  ]
 ,"czech-140205-cnec2.0-no_tokenizer": [
    "recognize"
  ]
 }
,"default_model": "czech-140205-cnec2.0"
}
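
As a minimal sketch of consuming this response, the following Python snippet parses a JSON document of the shape shown above and lists the models that support a given method. The helper name models_supporting is hypothetical, and the sample string only mirrors the example response; it is not fetched from the service.

```python
import json

# Sample response mirroring the example above (not fetched live).
SAMPLE_RESPONSE = """
{
 "models": {
  "czech-140205-cnec2.0": ["recognize", "tokenize"],
  "czech-140205-cnec2.0-no_tokenizer": ["recognize"]
 },
 "default_model": "czech-140205-cnec2.0"
}
"""


def models_supporting(response_text, method):
    """Return the sorted names of models whose method list contains `method`."""
    parsed = json.loads(response_text)
    return sorted(
        name for name, methods in parsed["models"].items() if method in methods
    )


if __name__ == "__main__":
    print(models_supporting(SAMPLE_RESPONSE, "tokenize"))
    print(json.loads(SAMPLE_RESPONSE)["default_model"])
```

Checking the method list before calling tokenize avoids sending text to a model such as the no_tokenizer variant, which the example lists as supporting recognize only.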

Method recognize

Recognize named entities as described in the User's Manual. The output format is described later.

Parameter  Mandatory  Data type                      Description
data       yes        string                         Input text in UTF-8.
model      no         string                         Model to use; see model selection for model matching rules.
input      no         string (untokenized/vertical)  Input format to use; default is untokenized.
output     no         string (xml/vertical)          Output format to use; default is xml.

Browser Examples

http://lindat.mff.cuni.cz/services/nametag/api/recognize?data=Václav Havel byl prvním prezidentem České republiky.
http://lindat.mff.cuni.cz/services/nametag/api/recognize?data=Václav Havel byl prvním prezidentem České republiky.&output=vertical
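
Browsers percent-encode the spaces and non-ASCII characters in such URLs automatically; a program building the URL itself has to do this explicitly. The sketch below shows one way to do so with the Python standard library. The helper name build_url is hypothetical, and it assumes the service accepts standard UTF-8 percent-encoding in GET query strings.

```python
from urllib.parse import quote, urlencode

# Base URL taken from the browser examples above.
API_BASE = "http://lindat.mff.cuni.cz/services/nametag/api/"


def build_url(method, **params):
    """Build a GET URL for `method`, percent-encoding all parameters as UTF-8.

    quote_via=quote encodes spaces as %20 rather than '+'.
    """
    return API_BASE + method + "?" + urlencode(params, quote_via=quote)


if __name__ == "__main__":
    text = "Václav Havel byl prvním prezidentem České republiky."
    print(build_url("recognize", data=text))
    print(build_url("recognize", data=text, output="vertical"))
```

The same helper would work for the tokenize method by passing "tokenize" as the first argument.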

Method tokenize

Tokenize the supplied text as described in the User's Manual. The output format is described later.

Parameter  Mandatory  Data type              Description
data       yes        string                 Input text in UTF-8.
model      no         string                 Model to use; see model selection for model matching rules.
output     no         string (xml/vertical)  Output format to use; default is xml.

Browser Examples

http://lindat.mff.cuni.cz/services/nametag/api/tokenize?data=Václav Havel byl prvním prezidentem České republiky.
http://lindat.mff.cuni.cz/services/nametag/api/tokenize?data=Václav Havel byl prvním prezidentem České republiky.&output=vertical

Common Response Format

The response format of all methods is JSON. Except for the models method, the output JSON has the following structure:

{
 "model": "Model used"
,"acknowledgements": ["URL with acknowledgements", ...]
,"result": "Output text"
}
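
To illustrate the structure, the following Python sketch unpacks the three fields into a small typed record. The names ServiceResponse and parse_response are hypothetical, and the sample string reuses the placeholder values from the template above rather than real service output.

```python
import json
from typing import List, NamedTuple

# Placeholder response built from the template above (not real output).
PLACEHOLDER_RESPONSE = (
    '{"model": "Model used",'
    ' "acknowledgements": ["URL with acknowledgements"],'
    ' "result": "Output text"}'
)


class ServiceResponse(NamedTuple):
    model: str                   # name of the model the service used
    acknowledgements: List[str]  # URLs with acknowledgements
    result: str                  # output text in the requested format


def parse_response(response_text):
    """Parse a recognize/tokenize response; raises KeyError if a field is missing."""
    parsed = json.loads(response_text)
    return ServiceResponse(
        parsed["model"], list(parsed["acknowledgements"]), parsed["result"]
    )


if __name__ == "__main__":
    print(parse_response(PLACEHOLDER_RESPONSE))
```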

Model Selection

There are several ways to select the required model using the model option:

Note that the last possibility allows using czech or english as models.


Accessing API using Curl

The described API can be conveniently used with curl. Several examples follow:

Passing Input on Command Line (if UTF-8 locale is being used)

curl --data-urlencode 'data=Václav Havel byl prvním prezidentem České republiky.' http://lindat.mff.cuni.cz/services/nametag/api/recognize

Using Files as Input (files must be in UTF-8 encoding)

curl -F 'data=@input_file' http://lindat.mff.cuni.cz/services/nametag/api/recognize

Specifying Additional Parameters

curl -F 'data=@input_file' -F 'output=vertical' http://lindat.mff.cuni.cz/services/nametag/api/recognize

Converting JSON Result to Plain Text

curl -F 'data=@input_file' http://lindat.mff.cuni.cz/services/nametag/api/recognize | PYTHONIOENCODING=utf-8 python -c "import sys,json; sys.stdout.write(json.load(sys.stdin)['result'])"
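
The same request can be made without curl. The sketch below uses only the Python 3 standard library and mirrors the --data-urlencode example, sending the text as a form-encoded POST body. The function names build_request and recognize are hypothetical, and the sketch assumes the service treats this POST the same way as the one curl sends.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl examples above.
RECOGNIZE_URL = "http://lindat.mff.cuni.cz/services/nametag/api/recognize"


def build_request(text, **extra):
    """Create a POST request carrying `text` as the form-encoded data field.

    Additional parameters (e.g. output="vertical") are passed as keywords.
    """
    fields = dict(extra, data=text)
    body = urllib.parse.urlencode(fields).encode("ascii")
    # Supplying a body makes urllib issue a POST instead of a GET.
    return urllib.request.Request(RECOGNIZE_URL, data=body)


def recognize(text, **extra):
    """Send the request and return the 'result' field of the JSON response."""
    with urllib.request.urlopen(build_request(text, **extra)) as response:
        return json.load(response)["result"]
```

Calling recognize with a UTF-8 string, optionally with output="vertical", would then return the same text that the curl pipeline above writes to standard output.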