Title: A New Technique for Automatic Text Categorization for Arabic Documents
Abstract: With Arabic documents now widespread on the Internet, systems that can process Arabic text are urgently needed. In this paper, we propose a new technique for automatic text categorization of Arabic documents based on a light stemming algorithm, which removes suffixes and prefixes from words. Despite the complexity of the Arabic language, our technique achieves a high F-measure, ranging from 0.88 to 0.99 with an average of 0.92.
Authors: Samir A. Mohamed, Walaa Ata and Nevin Darwish
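
The abstract describes light stemming as removing suffixes and prefixes from words. A minimal sketch of that idea in Python is given here for illustration only. The affix lists, the minimum stem length, and the name `light_stem` are assumptions chosen for the example; the abstract does not specify the authors' actual lists or rules.

```python
# Illustrative Arabic light stemmer: strips at most one prefix and one suffix.
# The affix lists and MIN_STEM_LEN are assumptions, not taken from the paper.
PREFIXES = ["وال", "بال", "كال", "فال", "ال", "لل", "و"]
SUFFIXES = ["ها", "ان", "ات", "ون", "ين", "ية", "ه", "ة", "ي"]
MIN_STEM_LEN = 3  # hypothetical guard so short words are not over-stripped


def light_stem(word: str) -> str:
    """Return the word with one leading and one trailing affix removed."""
    # Try longer prefixes first so "وال" wins over "و".
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        if word.startswith(prefix) and len(word) - len(prefix) >= MIN_STEM_LEN:
            word = word[len(prefix):]
            break
    # Try longer suffixes first so "ية" wins over "ة".
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LEN:
            word = word[: -len(suffix)]
            break
    return word


if __name__ == "__main__":
    for w in ["المكتبة", "والكتاب", "معلمون", "ولد"]:
        print(w, "->", light_stem(w))
```

Unlike root extraction, this kind of stemmer does no morphological analysis: it only trims surface affixes, so words that share a stem collapse to the same feature before categorization. The length guard is one simple way to avoid stripping letters that belong to the word itself.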