Advances in Information Technology have contributed
greatly to the development of Igbo, one of the major
Nigerian languages. Some online service providers now
report news, publish articles and support search in
this language. This advancement will likely result in the
generation of huge volumes of textual data in the language,
which need to be organized, managed and classified
efficiently for easy information access, extraction and
retrieval by end users. This work presents an enhanced model for
Igbo text classification. The classification was based on the
N-gram and K-Nearest Neighbour techniques. Considering
the peculiarities of the Igbo language, the N-gram model was
adopted for the text representation. The text was
represented with Unigram, Bigram and Trigram
techniques. The classification of the represented text was
done using the K-Nearest Neighbour technique. The
model was implemented in the Python programming
language using tools from the Natural Language
Toolkit (NLTK). The performance of the Igbo text
classification system was evaluated by computing
recall, precision and F1-measure on the N-gram
represented text. The results show that classification on
bigram-represented Igbo text has the highest degree of
exactness (precision), that trigram has the lowest
precision, and that the three N-gram techniques achieve
the same level of completeness (recall). The bigram text
representation technique is therefore strongly
recommended for text-based systems in Igbo. This
model can be adopted in text analysis, text mining,
information retrieval, natural language processing and
any intelligent text-based system in the language.
Keywords: Igbo Language; Text Classification; Text Mining; K-Nearest Neighbour; N-Gram; Similarity Measure.
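
The pipeline summarized in the abstract (N-gram text representation, K-Nearest Neighbour classification, then evaluation with precision, recall and F1-measure) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It uses only the standard library rather than NLTK, assumes whitespace tokenization and cosine similarity as the similarity measure (the abstract does not name one), and every function name and the toy English documents are hypothetical placeholders rather than data from the study.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the contiguous n-token tuples found in `tokens`."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def vectorize(text, n):
    """Represent a text as a bag (Counter) of word n-grams.
    Assumption: lowercasing plus whitespace tokenization."""
    return Counter(ngrams(text.lower().split(), n))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors (assumed measure)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(train, text, n=2, k=3):
    """Label `text` by majority vote among its k most similar training texts.
    `train` is a list of (text, label) pairs; n=1, 2, 3 selects
    unigram, bigram or trigram representation."""
    query = vectorize(text, n)
    ranked = sorted(
        train,
        key=lambda item: cosine(query, vectorize(item[0], n)),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall and F1-measure for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical toy corpus (placeholder text, not from the paper's dataset).
toy_train = [
    ("the team won the football match", "sports"),
    ("the team lost the football match", "sports"),
    ("the market price of rice rose", "business"),
    ("the market price of oil fell", "business"),
]

if __name__ == "__main__":
    for n in (1, 2, 3):  # unigram, bigram, trigram
        print(n, knn_predict(toy_train, "the football match ended", n=n, k=3))
```

Changing `n` switches between the unigram, bigram and trigram representations while the classifier and the metrics stay the same, which mirrors the kind of comparison the abstract reports.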