Skip to search
Skip to main content
Skip to first result
Search
Search Results
Creator:
Gurevych, Iryna , Habernal, Ivan , and Zayed, Omnia
Publisher:
Technische Universität Darmstadt
Type:
text and corpus
Subject:
CommonCrawl , Creative Commons , Web corpus , and Amazon Web Services
Language:
Afrikaans , Arabic , Bengali , Bulgarian , Czech , Danish , German , Modern Greek (1453-) , English , Estonian , Persian , Finnish , French , Gujarati , Hebrew , Hindi , Croatian , Hungarian , Indonesian , Italian , Japanese , Korean , Latvian , Lithuanian , Malayalam , Marathi , Macedonian , Nepali (macrolanguage) , Dutch , Norwegian , Polish , Portuguese , Romanian , Russian , Slovak , Slovenian , Somali , Spanish , Albanian , Swahili (macrolanguage) , Swedish , Tamil , Telugu , Tagalog , Thai , Turkish , Ukrainian , Undetermined , Urdu , Vietnamese , and Chinese
Description:
A large web corpus (over 10 billion tokens) licensed under CreativeCommons license family in 50+ languages that has been extracted from CommonCrawl, the largest publicly available general Web crawl to date with about 2 billion crawled URLs.
Rights:
Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) , http://creativecommons.org/licenses/by-nc-sa/4.0/ , and PUB
Creator:
Gurevych, Iryna , Habernal, Ivan , and Zayed, Omnia
Publisher:
Technische Universität Darmstadt
Type:
text and corpus
Subject:
CommonCrawl , Creative Commons , Web corpus , and Amazon Web Services
Language:
Afrikaans , Arabic , Bengali , Bulgarian , Czech , Danish , German , Modern Greek (1453-) , English , Estonian , Persian , Finnish , French , Gujarati , Hebrew , Hindi , Croatian , Hungarian , Indonesian , Italian , Japanese , Korean , Latvian , Lithuanian , Malayalam , Macedonian , Dutch , Norwegian , Polish , Portuguese , Romanian , Russian , Slovak , Slovenian , Somali , Spanish , Albanian , Swahili (macrolanguage) , Swedish , Tamil , Tagalog , Thai , Turkish , Ukrainian , Undetermined , Vietnamese , and Chinese
Description:
A large web corpus (over 10 billion tokens) licensed under CreativeCommons license family in 50+ languages that has been extracted from CommonCrawl, the largest publicly available general Web crawl to date with about 2 billion crawled URLs.
Rights:
Creative Commons - Attribution-NoDerivatives 4.0 International (CC BY-ND 4.0) , http://creativecommons.org/licenses/by-nc/4.0/ , and PUB
Creator:
Gurevych, Iryna , Habernal, Ivan , and Zayed, Omnia
Publisher:
Technische Universität Darmstadt
Type:
text and corpus
Subject:
CommonCrawl , Creative Commons , Web corpus , and Amazon Web Services
Language:
Afrikaans , Arabic , Bengali , Bulgarian , Czech , Danish , German , Modern Greek (1453-) , English , Estonian , Persian , Finnish , French , Gujarati , Hebrew , Hindi , Croatian , Hungarian , Indonesian , Italian , Japanese , Kannada , Korean , Latvian , Lithuanian , Malayalam , Marathi , Macedonian , Nepali (macrolanguage) , Dutch , Norwegian , Panjabi , Polish , Portuguese , Romanian , Russian , Slovak , Slovenian , Somali , Spanish , Albanian , Swahili (macrolanguage) , Swedish , Tamil , Telugu , Tagalog , Thai , Turkish , Ukrainian , Undetermined , Urdu , Vietnamese , and Chinese
Description:
A large web corpus (over 10 billion tokens) licensed under CreativeCommons license family in 50+ languages that has been extracted from CommonCrawl, the largest publicly available general Web crawl to date with about 2 billion crawled URLs.
Rights:
Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) , http://creativecommons.org/licenses/by-sa/4.0/ , and PUB
Creator:
Gurevych, Iryna , Habernal, Ivan , and Zayed, Omnia
Publisher:
Technische Universität Darmstadt
Type:
text and corpus
Subject:
CommonCrawl , Creative Commons , Web corpus , and Amazon Web Services
Language:
Afrikaans , Arabic , Bengali , Bulgarian , Czech , Danish , German , Modern Greek (1453-) , English , Estonian , Persian , Finnish , French , Gujarati , Hebrew , Hindi , Croatian , Hungarian , Indonesian , Italian , Japanese , Kannada , Korean , Latvian , Lithuanian , Malayalam , Marathi , Macedonian , Nepali (macrolanguage) , Dutch , Norwegian , Panjabi , Polish , Portuguese , Romanian , Russian , Slovak , Slovenian , Somali , Spanish , Albanian , Swahili (macrolanguage) , Swedish , Tamil , Telugu , Tagalog , Thai , Turkish , Ukrainian , Undetermined , Urdu , Vietnamese , and Chinese
Description:
A large web corpus (over 10 billion tokens) licensed under CreativeCommons license family in 50+ languages that has been extracted from CommonCrawl, the largest publicly available general Web crawl to date with about 2 billion crawled URLs.
Rights:
Creative Commons - Attribution 4.0 International (CC BY 4.0) , http://creativecommons.org/licenses/by/4.0/ , and PUB
Creator:
Gurevych, Iryna , Habernal, Ivan , and Zayed, Omnia
Publisher:
Technische Universität Darmstadt
Type:
text and corpus
Subject:
CommonCrawl , Creative Commons , Web corpus , and Amazon Web Services
Language:
Afrikaans , Arabic , Bulgarian , Czech , Danish , German , Modern Greek (1453-) , English , Estonian , Persian , Finnish , French , Croatian , Hungarian , Indonesian , Italian , Japanese , Korean , Latvian , Lithuanian , Dutch , Norwegian , Polish , Portuguese , Russian , Slovenian , Somali , Spanish , Swahili (macrolanguage) , Swedish , Tagalog , Thai , Turkish , Ukrainian , Undetermined , and Vietnamese
Description:
A large web corpus (over 10 billion tokens) licensed under CreativeCommons license family in 50+ languages that has been extracted from CommonCrawl, the largest publicly available general Web crawl to date with about 2 billion crawled URLs.
Rights:
Public Domain Mark (PD) , http://creativecommons.org/publicdomain/mark/1.0/ , and PUB
Creator:
Ivanov, Jordan Nikolov,
Type:
text and studie
Subject:
Dějiny států a území na Balkánském poloostrově , Samuel , panovníci bulharští , osídlení , dějiny osídlení, regionální dějiny , světové dějiny středověku (do r. 1492) , Bulharsko , and města, obce
Language:
Bulgarian
Description:
Lokalizuje sídlo Samuelovo na ostrov Achil v Malé Prěspě
Rights:
unknown
Creator:
Urban, Zdeněk,
Type:
text and sborníky
Subject:
Dějiny Evropy , vztahy česko-bulharské , vztahy kulturní , přehledná zpracování dějin českých zemí (chronologicky) , dějiny vědy, umění, kultury a techniky, kulturní vztahy , Bulharsko , and přehledná zpracování světových dějin (chronologicky)
Language:
Bulgarian
Rights:
unknown
Type:
text and sborníky
Subject:
Sociologie kultury. Kulturní život , Biografie , Češi bulharští , vztahy česko-bulharské , and zahraniční periodika a sborníky
Language:
Bulgarian and Czech
Description:
České a bulharské texty and "I tazi kniga izliza s ljubeznata podkrepa na Češkija centăr v Sofija, i otnovo ce ražda v Universitetskata pečatnica" - titul. list
Rights:
unknown
Type:
text and sborníky
Subject:
Globální společnosti. Sociální struktura. Sociální skupiny , Češi bulharští , vztahy česko-bulharské , and zahraniční periodika a sborníky
Language:
Bulgarian
Rights:
unknown
Type:
text and sborníky
Subject:
Globální společnosti. Sociální struktura. Sociální skupiny , Jireček, Konstantin Josef, , Češi bulharští , vztahy česko-bulharské , zahraniční periodika a sborníky , české země 1848-1918 , Československo 1918-1992 , migrace, vystěhovalectví, kolonizace , Bulharsko , and přehledná zpracování světových dějin (chronologicky)
Language:
Bulgarian
Rights:
unknown