Search Results
Publisher:
Academy of Sciences
Type:
corpus
Language:
Hungarian
Description:
BSI is a large-scale survey that provides reliable data on, and analyses of, the varieties of Hungarian spoken in Budapest.
Rights:
Not specified
Publisher:
Budapest University of Technology and Economics Media Research (BME MOKK)
Type:
toolService
Description:
Hunalign is a powerful, free sentence-level aligner for building parallel corpora. Its input is tokenized and sentence-segmented text in two languages.
Rights:
Not specified
Publisher:
Academy of Sciences
Format:
application/xml
Type:
corpus
Language:
Hungarian
Description:
Containing 27 million running words, the Hungarian Historical Corpus provides a valuable basis for research on the history of Hungarian words between the second half of the 18th century and 2000.
Rights:
Not specified
Publisher:
Dept. of Language Technology, Research Institute for Linguistics
Type:
toolService
Language:
Hungarian
Description:
NooJ is a linguistic development environment that includes large-coverage dictionaries and grammars, and parses corpora in real time. The large-coverage lexical resources (morphological and syntactic grammars) for Hungarian can be applied to texts to locate morphological, lexical and syntactic patterns and to tag simple and compound words.
Rights:
Not specified
Publisher:
Academy of Sciences
Format:
application/xml
Type:
corpus
Subject:
synchronic corpus
Language:
Hungarian
Description:
Written general synchronic reference corpus; 190 million tokens; POS-annotated XML
Rights:
Not specified
Publisher:
Budapest University of Technology and Economics Media Research (BME MOKK)
Type:
corpus
Subject:
Web corpus
Language:
Hungarian
Description:
Monolingual written general; 700 million tokens; Segmentation, disambiguation
Rights:
Not specified
Publisher:
Academy of Sciences and Budapest University of Technology and Economics Media Research (BME MOKK)
Type:
corpus
Subject:
parallel corpus
Language:
English and Hungarian
Description:
Bilingual written general; 2 million sentences
Rights:
CC
Creator:
Varga, Dániel and Simon, Eszter
Publisher:
Budapest Technical University Media Research Centre
Type:
toolService
Description:
Hungarian named entity recognition with a maximum entropy approach
Rights:
Not specified
Publisher:
Budapest University of Technology and Economics Media Research (BME MOKK)
Type:
toolService
Description:
Hunpos is an open-source reimplementation of TnT, the well-known part-of-speech tagger by Thorsten Brants.
Rights:
Not specified
Creator:
Németh, László, Halácsy, Péter, and Kornai, András
Publisher:
Budapest Technical University Media Research Centre
Type:
toolService
Subject:
tokenizer
Description:
HunToken is a rule-based tokenizer and sentence boundary detector for Hungarian (and English) texts.
Rights:
GNU Library or "Lesser" General Public License 3.0 (LGPL-3.0), http://opensource.org/licenses/LGPL-3.0, and PUB
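Example:
To illustrate what rule-based tokenization and sentence boundary detection involve, here is a minimal Python sketch. It is not HunToken's rule set or interface: the abbreviation list, the regular expression, and the function names `tokenize`, `split_sentences` and `segment` are all hypothetical, chosen only for illustration. The output is one sentence per line-sized string with space-separated tokens, which is the kind of tokenized, sentence-segmented text the Hunalign entry above describes as aligner input.

```python
import re

# Hypothetical, deliberately tiny rule set; a real tokenizer uses far richer rules.
ABBREVIATIONS = {"dr.", "prof.", "pl.", "stb.", "kb."}
SENTENCE_END = {".", "!", "?"}

# A word (letters or digits, with optional inner hyphens) or one punctuation mark.
TOKEN_RE = re.compile(r"\w+(?:-\w+)*|[^\w\s]")


def tokenize(text):
    """Split text into tokens, keeping listed abbreviations in one piece."""
    tokens = []
    for chunk in text.split():
        if chunk.lower() in ABBREVIATIONS:
            # Rule: a known abbreviation keeps its trailing period.
            tokens.append(chunk)
        else:
            tokens.extend(TOKEN_RE.findall(chunk))
    return tokens


def split_sentences(tokens):
    """Close a sentence after each standalone '.', '!' or '?' token."""
    sentences, current = [], []
    for tok in tokens:
        current.append(tok)
        if tok in SENTENCE_END:
            sentences.append(current)
            current = []
    if current:
        # Keep trailing tokens that lack final punctuation.
        sentences.append(current)
    return sentences


def segment(text):
    """Return one string per sentence, with tokens separated by spaces."""
    return [" ".join(sentence) for sentence in split_sentences(tokenize(text))]


if __name__ == "__main__":
    for line in segment("Dr. Kovács Budapesten él. Szereti a várost!"):
        print(line)
```

Because the abbreviation "Dr." stays a single token, its period is not treated as a sentence boundary; that interplay between the two steps is the main reason such tools handle tokenization and sentence splitting together.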