This XML file describes the Arabic phonetic constraints to be applied to Arabic roots. The first rule category lists the letters that may not occur in the same root, regardless of their order. The second category lists the letters that may not be used together in a root in a specific order. The third and fourth categories state that contiguous letters must not be repeated.
ISLRN: 991-445-325-823-5
Description: This XML file describes the Arabic phonetic constraints (rules) resulting from the analysis of the lexicons (Taj Alarous, Al ain, Lisan Al arab, Alwassit and almoassir). These rules are to be applied to Arabic roots and are classified into four categories, each with a certain type of constraint, as follows: The first category specifies that a root must not consist of three identical letters. The second category specifies that a root must not start with two identical letters. The third category lists the letters that must not occur in the same root, regardless of their order. The fourth category lists the letters that may not be used together in a certain order in a root.
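The four categories described above can be sketched as a small root checker. This is only an illustration under stated assumptions: the forbidden letter pairs below are hypothetical placeholders and are not taken from the XML file, and the function name violated_category is invented for this example. The actual rule lists and their XML layout should be read from the resource itself.

```python
from typing import FrozenSet, Optional, Set, Tuple

# Letters written as escapes to avoid right-to-left rendering ambiguity.
AIN, HAH, DAD, SHEEN = "\u0639", "\u062d", "\u0636", "\u0634"

# HYPOTHETICAL placeholder rules, not copied from the XML resource.
# Category 3: letters that must not co-occur, in any order.
UNORDERED_FORBIDDEN: Set[FrozenSet[str]] = {frozenset({AIN, HAH})}
# Category 4: (first, second) must not appear in that order.
ORDERED_FORBIDDEN: Set[Tuple[str, str]] = {(DAD, SHEEN)}


def violated_category(root: str) -> Optional[int]:
    """Return the number (1-4) of the first violated category, or None."""
    if len(root) != 3:
        raise ValueError("expected a triliteral root")
    first, second, third = root

    # Category 1: three identical letters.
    if first == second == third:
        return 1
    # Category 2: the root starts with two identical letters.
    if first == second:
        return 2
    # Category 3: forbidden co-occurrence, regardless of order.
    letters = set(root)
    if any(pair <= letters for pair in UNORDERED_FORBIDDEN):
        return 3
    # Category 4: forbidden co-occurrence in a specific order.
    for before, after in ORDERED_FORBIDDEN:
        position = root.find(before)
        if position != -1 and root.find(after, position + 1) != -1:
            return 4
    return None
```

A root that passes all four checks yields None; otherwise the returned number identifies the first category it violates, in the order the categories are listed in the description.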
ISLRN: 190-535-098-473-3
Description: This XML file is a lexicon containing all 21,952 (28x28x28) Arabic triliteral combinations (roots). The file is split into three parts, as follows: the first part contains the phonetic constraints that must be taken into account in the formation of Arabic roots (for more details, see all_phonetic_rules.xml at http://arabic.emi.ac.ma/alelm/?q=Resources); the second part contains the lexicons that were used to create this lexicon (see the lexicons tag); the third part contains the roots.
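The figure of 21,952 follows from taking every ordered triple over a 28-letter alphabet (28 x 28 x 28). A minimal sketch of that enumeration is given below. The letter inventory used here (starting with alif) is an assumption; the resource may use a different inventory or ordering, for example hamza in place of alif. The function name all_triliteral_combinations is hypothetical.

```python
from itertools import product
from typing import List

# ASSUMED 28-letter inventory; the XML resource may define it differently.
LETTERS = "ابتثجحخدذرزسشصضطظعغفقكلمنهوي"


def all_triliteral_combinations(letters: str = LETTERS) -> List[str]:
    """Return every ordered three-letter combination over `letters`."""
    return ["".join(triple) for triple in product(letters, repeat=3)]


combinations = all_triliteral_combinations()
print(len(LETTERS), len(combinations))
```

With 28 distinct letters, the list has 28 ** 3 = 21,952 entries, matching the count stated in the description. Most of these combinations are not attested roots, which is why the lexicon pairs them with the phonetic constraints and the source lexicons.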
ISLRN: 813-907-570-946-2
This improved version is an extension of the original Arabic WordNet (http://globalwordnet.org/arabic-wordnet/awn-browser/). It was enriched with new verbs and nouns, including broken plurals, a plural form specific to Arabic words.
An LMF-conformant XML-based file containing a comprehensive list of Arabic broken plurals (BPs). The file contains 12,249 singular words with their corresponding BPs.
A large web corpus (over 10 billion tokens) in 50+ languages, licensed under the Creative Commons license family. It was extracted from CommonCrawl, the largest publicly available general Web crawl to date, with about 2 billion crawled URLs.