Segmentation and Recognition of Printed Arabic Characters by Structural Classifications

Date

1997-2

Type

Conference paper

Conference title

Image and Vision Computing

Issue

Vol. 13 No. 12

Author(s)

B M F Bushofa
M Spann

Abstract

Arabic script differs significantly from other scripts, such as Latin and Chinese, in that it is written cursively in both printed and handwritten forms. It has 28 main characters, but most of them change shape according to their position in the word. These shapes, together with some other secondaries, raise the number of classes to 120. Furthermore, some characters share the same shape and are distinguished only by the presence of one, two or three dots above or below them. In this paper, words are first segmented into characters, and secondaries are removed using newly developed algorithms. This reduces the number of classes to 32. Information about the secondaries, such as their number, position and type, is recorded and used in the final recognition stage. Features of the skeletonized character are then classified using a decision tree. A recognition rate of 97.23% over a set of 4260 samples is achieved.
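
To illustrate how recorded secondaries might be used in the final recognition stage, here is a minimal sketch in Python. It is not the authors' implementation: the class label "tooth", the `Secondaries` record and the `resolve` function are hypothetical names introduced for illustration. The sketch assumes the shape classifier has already assigned a base class, and uses the well-known fact that the initial and medial forms of ba, ta, tha, noon and ya share one stroke shape and differ only in their dots.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Secondaries:
    """Dots recorded when they were stripped from a character (hypothetical record)."""
    count: int      # number of dots: 0, 1, 2 or 3
    position: str   # "above", "below" or "none"


# Hypothetical lookup: (base shape class, dot count, dot position) -> letter name.
# "tooth" is an illustrative label for the stroke shared by the initial and
# medial forms of ba, ta, tha, noon and ya; it is not a class name from the paper.
LETTERS = {
    ("tooth", 1, "below"): "ba",
    ("tooth", 2, "above"): "ta",
    ("tooth", 3, "above"): "tha",
    ("tooth", 1, "above"): "noon",
    ("tooth", 2, "below"): "ya",
}


def resolve(base_class: str, sec: Secondaries) -> Optional[str]:
    """Combine the base shape class with the recorded dots to name the letter.

    Returns None when the combination is not in the table.
    """
    return LETTERS.get((base_class, sec.count, sec.position))


if __name__ == "__main__":
    print(resolve("tooth", Secondaries(count=2, position="above")))
```

Keeping the dots out of the shape classifier and reintroducing them in a lookup of this kind is one way a single base class could stand in for several letters, which matches the reduction in class count that the abstract describes.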
