Title: A New Technique for Automatic Text Categorization for Arabic Documents

Abstract
: The rapid growth of Arabic documents on the Internet has created an urgent need for systems that can process Arabic text. In this paper, we propose a new technique for the automatic categorization of Arabic documents based on a light stemming algorithm, which removes suffixes and prefixes from words. Despite the complexity of the Arabic language, our technique achieves a high F-measure, ranging from 0.88 to 0.99 with an average of 0.92.
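
To make the idea of light stemming concrete, the following is a minimal Python sketch of affix stripping. It is not the authors' implementation: the abstract does not list the affixes used, so the prefix and suffix sets, the minimum stem length, and the function name `light_stem` are all illustrative assumptions.

```python
# Minimal light-stemming sketch. The affix lists and the length guard below
# are illustrative assumptions, NOT the lists used in the paper.
PREFIXES = ("وال", "بال", "كال", "فال", "ال", "لل", "و")
SUFFIXES = ("ها", "ان", "ات", "ون", "ين", "يه", "ية", "ه", "ة", "ي")
MIN_STEM_LEN = 3  # hypothetical guard so short words are not over-stripped


def light_stem(word: str) -> str:
    """Strip at most one prefix and one suffix, trying longer affixes first."""
    # Remove the longest matching prefix, if enough letters remain.
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        if word.startswith(prefix) and len(word) - len(prefix) >= MIN_STEM_LEN:
            word = word[len(prefix):]
            break
    # Remove the longest matching suffix, if enough letters remain.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LEN:
            word = word[:-len(suffix)]
            break
    return word


if __name__ == "__main__":
    for token in ["والكتاب", "المعلمون", "كتب"]:
        print(token, "->", light_stem(token))
```

Unlike root extraction, this kind of stemmer does not analyze word patterns; it only trims common attached particles and endings, so that variants of a word are more likely to map to the same feature before categorization.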

Authors
: Samir A. Mohamed, Walaa Ata and Nevin Darwish




IBIMA 2005 Conference   www.ibima.org