Fig 1 - uploaded by Diaa Fayed
Example entries from the Al-Mawrid.

Source publication
Conference Paper
Natural language processing (NLP) applications need a large and rich body of linguistic knowledge. Meanwhile, electronic language sources such as dictionaries, encyclopedias, and corpora have become available. Automatic methods have therefore emerged to extract lexical information from those sources and overcome the knowledge acquisition bottleneck. We present...

Context in source publication

Context 1
... in the subentry. Implicit information, on the other hand, is not marked by any additional information, so it can only be inferred or deduced. The text definitions need some analysis before implicit information can be extracted. The information to be extracted lies in the defining phrases of the explanation field or the header field. Fig. 1 shows some entries of the ...

Citations

... A dash may precede the cross-reference field. A subentry may express declaration, question, or exclamation [14,15]. ...
... The Al-Mawrid is not annotated with part-of-speech (POS) tags. Only a very small number of tags for two parts of speech (approximately fifteen noun and adjective tags) exist [14,15]. ...
... We implemented the proposed algorithm in Python and used the WordNet database as provided by the Natural Language Toolkit (NLTK) [21]. In addition to the preprocessing of the Al-Mawrid data in Diaa et al. [15,16], we applied further preprocessing to the translation equivalences before using them to query the WordNet API. Table 3 shows some of those modifications. ...
Conference Paper
This paper proposes an algorithm for part-of-speech (POS) tagging the senses of a bilingual dictionary. The algorithm is applied to the Al-Mawrid Arabic-English dictionary. The tagging task is accomplished by transferring the POS tags of the English translation equivalences (TEs) to the dictionary senses after a disambiguation process. The English POS tags of the senses are acquired from the Princeton WordNet. POS tagging of bilingual dictionary senses is a prerequisite for linking a bilingual dictionary to WordNet and/or standardizing that dictionary into the WordNet-LMF format, where the synset (set of synonyms), not the word, is the basic building block. The reported accuracy is high while the cost is low. Building NLP/HLT tools requires linguistic experts, large investments, and a long time. A statistical approach needs large annotated corpora, and a rule-based approach needs a large lexicon that contains rich linguistic and world knowledge. This motivates the emergence of so-called resource-light approaches to developing natural language processing (NLP) tools for resource-poor languages.
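
The idea of transferring POS tags from English translation equivalences to a dictionary sense can be sketched as follows. This is only an illustration under stated assumptions, not the paper's actual disambiguation procedure: the majority vote, the function names `lookup_pos` and `tag_sense`, and the toy lexicon are all hypothetical. The toy lexicon stands in for a WordNet lookup so that the sketch runs without external data.

```python
from collections import Counter

# Hypothetical stand-in for a WordNet lookup: English word -> set of POS tags.
# With NLTK installed and its WordNet corpus downloaded, a similar set could
# be built per word with: {s.pos() for s in wordnet.synsets(word)}.
TOY_POS_LEXICON = {
    "book": {"n", "v"},
    "volume": {"n"},
    "write": {"v"},
    "compose": {"v"},
    "beautiful": {"a"},
}


def lookup_pos(word, lexicon=TOY_POS_LEXICON):
    """Return the set of POS tags known for an English word (empty if unknown)."""
    return lexicon.get(word.lower().strip(), set())


def tag_sense(translation_equivalents, lexicon=TOY_POS_LEXICON):
    """Assign one POS tag to a sense by majority vote over its TEs.

    Assumption: each TE casts one vote per POS it can take; a tie or an
    empty vote leaves the sense untagged (None).
    """
    votes = Counter()
    for te in translation_equivalents:
        for pos in lookup_pos(te, lexicon):
            votes[pos] += 1
    if not votes:
        return None
    top = max(votes.values())
    winners = [pos for pos, count in votes.items() if count == top]
    return winners[0] if len(winners) == 1 else None


if __name__ == "__main__":
    print(tag_sense(["book", "volume"]))    # n
    print(tag_sense(["write", "compose"]))  # v
    print(tag_sense(["book"]))              # None (noun/verb tie)
```

In this sketch an ambiguous TE such as "book" is resolved only when another TE of the same sense narrows the choice; a sense whose TEs remain tied is left untagged rather than guessed.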