Title: An Algorithm for Extracting the Root for the Arabic Language

Abstract: Stemming is one of many tools used in information retrieval (IR) to combat the vocabulary mismatch problem, in which query words do not match document words. Stemming in Arabic does not fit the usual mold: most stemming research on other languages so far has relied only on removing prefixes and suffixes from the word, but Arabic words contain infixes as well. In this paper we introduce a root-based algorithm that uses the stemming concept to eliminate all kinds of affixes (prefixes, suffixes, and infixes) according to the morphological pattern of the word.
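
To make the idea of pattern-driven root extraction concrete, here is a minimal Python sketch. It is not the authors' algorithm: the affix lists, the pattern list, and the names `match_pattern`, `candidate_stems`, and `extract_root` are hypothetical and chosen only for illustration. It assumes undiacritized input and the conventional use of the letters ف, ع, ل as placeholders for the three root consonants, so any other letter in a pattern is treated as an infix or fixed affix.

```python
# Illustrative only: a tiny, hypothetical subset of affixes and patterns,
# not the lists used in the paper.
PREFIXES = ["ال", "و"]
SUFFIXES = ["ات", "ون", "ين", "ة"]
PATTERNS = ["مفعول", "فاعل", "فعال", "فعل"]
ROOT_SLOTS = "فعل"  # placeholder letters standing for the root consonants


def match_pattern(word, pattern):
    """Return the root letters if word fits pattern, else None."""
    if len(word) != len(pattern):
        return None
    root = []
    for w, p in zip(word, pattern):
        if p in ROOT_SLOTS:
            root.append(w)   # slot position: keep the word's letter
        elif w != p:
            return None      # fixed (affix/infix) letter must match exactly
    return "".join(root)


def candidate_stems(word):
    """Return the word plus variants with a prefix and/or suffix removed."""
    stems = [word]
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= 3:
            stems.append(word[len(prefix):])
    for stem in list(stems):
        for suffix in SUFFIXES:
            if stem.endswith(suffix) and len(stem) - len(suffix) >= 3:
                stems.append(stem[:-len(suffix)])
    return stems


def extract_root(word):
    """Try each candidate stem against each pattern; first match wins."""
    for stem in candidate_stems(word):
        for pattern in PATTERNS:
            root = match_pattern(stem, pattern)
            if root is not None:
                return root
    return None
```

Under these assumptions, `extract_root("مكتوب")` drops the fixed letters م and و of the pattern مفعول and returns كتب, while `extract_root("الكاتب")` first strips the prefix ال and then matches the pattern فاعل. A realistic system would need much larger affix and pattern lists and handling for weak or irregular roots, which this sketch ignores.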

Authors: Sameh Ghawanmeh, Riyad Al-Shalabi, Ghassan Kanaan, Khalid Khanfar, and Saif Rabab’ah


IBIMA 2005 Conference   www.ibima.org