
Project Report On: Online Hand Written Word Recognition

Index

Part 1: Introduction
  1.1: Historical Background
  1.2: Bangla Character Set
Part 2: Steps of Online Character Recognition
Part 3: Feature Extraction and evaluate the score value
  3.1: Feature Extraction
  3.2: Evaluate the score value
  3.3: Uses of score value and proposed methodology
Part 4: Trie data structure
  4.1: An introduction to trie
  4.2: Advantage of trie
  4.3: Application of trie
  4.4: Application of trie in our project
Part 5: Implementation
  5.1: Our Implementation Technique
  5.2: Methodology
  5.3: Implementation
Part 6: References

1. INTRODUCTION

Our work is on Bengali words, so we first describe the Bengali language briefly.

1.1) HISTORICAL BACKGROUND

Bangla (Bengali) is an Indo-Aryan language of the eastern Indian subcontinent that evolved from the Magadhi Prakrit, Pali and Sanskrit languages. It is native to the region of eastern South Asia known as Bengal, which comprises the country of Bangladesh and the Indian state of West Bengal. With nearly 300 million native speakers, it is the fourth most widely spoken language in the world. Bengali is the main spoken language of Bangladesh and the second most spoken language of India.

1.2) Bangla Character Set

Basic Characters: There are 11 vowels (shown in Fig. 1) and 39 consonants (shown in Fig. 2) in the Bangla alphabet. These 50 symbols together form the basic character set of Bangla.

Figure 1. Symbols of 11 vowels of Bangla

Figure 2. Symbols of 39 consonants of Bangla

Although the consonant symbols are included in the basic character set of the Bangla script, they are actually orthographically syllabic in nature. Every consonant sign has the vowel [a] (or sometimes the vowel [o]) "embedded" or "inherent" in it. For example, the basic consonant sign ( ) is pronounced [ + ] in isolation. The same sign can represent the sounds [ + ] or [ + ] when used in a word, as in the word ( ) or as in the word ( ), with no added symbol for the vowels ( ) and ( ).

To emphatically represent a consonant sound without any inherent vowel attached to it, a special diacritic, called the hshonto ( ), may be added below the basic consonant sign (as in ). This diacritic, however, is not common, and is chiefly employed as a guide to pronunciation.

Vowel Modifiers: A consonant sound followed by some vowel sound other than the inherent vowel is orthographically realized by using a variety of vowel allographs above, below, before, after, or around the consonant sign, thus forming the ubiquitous consonant-vowel ligature. These allographs, called vowel modifiers, are dependent vowel forms and cannot appear on their own. For example, the consonant ( ), when combined with the vowels (A), (i), (I), (u), (U), (R^i), (e), (ai), (o), (au), is represented by the respective vowel modifiers along with the consonant symbol, as shown in Figure 3. It should be noted that in these consonant-vowel ligatures, the so-called "inherent" vowel is expunged from the consonant, but the basic consonant sign does not indicate this change.

Figure 3. Shapes of different vowels when combined with a consonant (vowel modifiers)

From the above discussion it is clear that all the vowels in Bangla except the first one ( ) take two different forms: the independent form found in the basic character set of the script and the dependent allograph form (as discussed above). To represent a vowel in isolation from any preceding or following consonant, the independent form of the vowel is used. On the other hand, the vowel modifiers ( ) and ( ), when attached to the consonant character ( ), take modified shapes that differ from the shapes of these two vowels shown in Figure 3. Such modified shapes are shown in Figure 4(a). Also, in several situations, when the vowel (i) gets attached to a consonant, a new form is created, as shown in Figure 4(b).

Figure 4. (a) Variants of modified shapes of the vowels (i) and (I); (b) often the shape of a consonant character gets changed due to the adjacent vowel (u).

Compound Characters: The Bengali consonant clusters (commonly called juktakkhor in Bangla or compound characters in English) are usually realized as ligatures, where the consonant which comes first is put on top of or to the left of the one that immediately follows. In these ligatures, the shapes of the constituent consonant signs are often contracted and sometimes even distorted beyond recognition. There are more than 400 such consonant clusters and corresponding ligatures in Bengali. Three other commonly used diacritics (included in the basic character set) in Bengali are the superposed chndrobindu ( ), denoting a suprasegmental for nasalization of vowels (as in ), the postposed onushshr ( ), indicating the velar nasal (as in ), and the postposed bishrgo ( ), indicating the voiceless glottal fricative (as in ).

Complexity of Bangla Script:

Clusters of consonants (compound characters) are represented by different and sometimes quite irregular characters; thus, the script is complicated by the sheer size of the full set of characters and character combinations, numbering about 500. A few such compound characters are shown in Figure 5. While efforts at standardizing the script for the Bengali language continue in several centers, many people still use various archaic forms of letters, resulting in concurrent forms for the same sounds.

Figure 5: Several compound characters

2. Steps of Online Character Recognition

The major steps in an online character recognition system are:
1. Preprocessing.
2. Feature Extraction.
3. Recognition (Using Language Model).

The preprocessing step prepares the document image for feature extraction. It has several sub-steps.

a) Noise reduction: The image is passed through a smoothing process to remove the noise in it. We have used a nonlinear smoothing algorithm for this.

b) Elimination of redundant data: We eliminate repeated (x, y) coordinates from the word data file.
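
Sub-step (b) can be sketched as follows. This is only a minimal sketch, not the project's code: it assumes that "the same co-ordinate" means a pen sample identical to the sample kept just before it, and the names PenPoint and removeRepeatedPoints are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sample type: one pen position read from the word data file.
struct PenPoint {
    int x;
    int y;
};

// Returns a copy of `stroke` in which a sample is dropped whenever it has
// the same (x, y) as the sample that was kept immediately before it.
// Assumption: only consecutive repeats count as redundant.
std::vector<PenPoint> removeRepeatedPoints(const std::vector<PenPoint>& stroke)
{
    std::vector<PenPoint> out;
    for (const PenPoint& p : stroke) {
        if (out.empty() || out.back().x != p.x || out.back().y != p.y) {
            out.push_back(p);
        }
    }
    return out;
}
```

A point that is revisited later in the stroke is kept, because the pen may legitimately pass through the same position twice, for example when drawing a loop.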

3. Feature Extraction and evaluate score value

3.1) Feature Extraction

First, let us see how a child learns to write. She or he is advised to put the pen down at some position, make a straight or curved pen movement in a particular direction, create loops when needed and lift the pen at some other position. The pen-down position for the next stroke is also mentioned, and she or he follows such instructions until the character is complete. The writing styles of two such characters are shown below. Apart from pen up/down positions and the directions of pen movement, children are also taught the relative lengths of the different parts of a stroke. These aspects are captured and used as features.

Writing style: (a) Devnagari numeral one (1) and (b) a Bangla basic character, written in two strokes

Additionally, pen-up and pen-down information is captured to mark where each stroke starts (pen-down) and ends (pen-up). First, we extract angle variation information as follows:

i. distance(i) = $d_i = \sqrt{(x_{i+1} - x_i)^2 + (y_{i+1} - y_i)^2}$
ii. angle(i) = $\theta_i = \tan^{-1}\left(\frac{y_{i+1} - y_i}{x_{i+1} - x_i}\right)$

Note that $\theta_i$ and $d_i$ are the angle and the Euclidean distance, respectively, between one point $(x_i, y_i)$ and its next point $(x_{i+1}, y_{i+1})$. In reality, a child is taught direction information rather than angle variation information, as mentioned before. That is why we convert each $\theta_i$ into a direction code (an integer), as in 8-direction Freeman coding. The conversion of $\theta_i$ into an 8-direction code $D_i$ is written with mod, the modulus operator, and int, which returns the integer part of a real number.

For simplicity, in our program we normalize the distance as follows:
1) First we measure the distance between each pair of consecutive points.
2) Then we measure the total distance.
3) Next we divide each individual distance by the total distance and multiply the result by a large integer (1024).

We get the feature vector of a character from the normalized distances and the direction codes.

3.2) Evaluate the score value

For this purpose we use two types of data:
1) Training data
2) Testing data

Testing data are the samples used to test the code. Training data are stored in a database and are used to recognize the testing data; comparing the two gives the score value. We evaluate the score value from the following equation:

4 - mod(4 - mod(known_feature_i - testing_feature_i)), where i = 0, 1, 2, ..., 1023
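
The feature and score computations of 3.1 and 3.2 can be sketched as follows. This is a hedged sketch under stated assumptions, not the project's code. The direction quantization shown (code 0 centred on the positive x axis, each code covering 45 degrees) is one common form of 8-direction Freeman coding and is assumed here. In the score, mod is read as the absolute value, which makes the expression the circular distance between two direction codes. Summing the per-index scores in totalScore is also an assumption. All function and type names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>
#include <vector>

struct Point { double x; double y; };  // hypothetical pen sample

// Euclidean distance d_i between a point and its next point.
double segmentLength(const Point& a, const Point& b)
{
    return std::hypot(b.x - a.x, b.y - a.y);
}

// Assumed quantization of the angle into a direction code 0..7.
int directionCode(const Point& a, const Point& b)
{
    const double pi = std::acos(-1.0);
    double deg = std::atan2(b.y - a.y, b.x - a.x) * 180.0 / pi;  // (-180, 180]
    if (deg < 0.0) deg += 360.0;                                   // [0, 360]
    return static_cast<int>((deg + 22.5) / 45.0) % 8;
}

// Steps 1) to 3): each distance divided by the total, scaled by 1024.
std::vector<int> normalizedLengths(const std::vector<Point>& pts)
{
    std::vector<double> len;
    double total = 0.0;
    for (std::size_t i = 0; i + 1 < pts.size(); ++i) {
        len.push_back(segmentLength(pts[i], pts[i + 1]));
        total += len.back();
    }
    std::vector<int> out;
    for (double d : len) {
        out.push_back(total > 0.0 ? static_cast<int>(d / total * 1024.0) : 0);
    }
    return out;
}

// 4 - |4 - |known - testing||: 0 for equal codes, 4 for opposite directions.
int directionScore(int known, int testing)
{
    return 4 - std::abs(4 - std::abs(known - testing));
}

// Assumption: the score of a sample is the sum over the common indices.
int totalScore(const std::vector<int>& known, const std::vector<int>& testing)
{
    int sum = 0;
    for (std::size_t i = 0; i < known.size() && i < testing.size(); ++i) {
        sum += directionScore(known[i], testing[i]);
    }
    return sum;
}
```

Under this reading a smaller score means a closer match, which agrees with the use of the minimum score in 3.3.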

We get known_feature_i from feature extraction of the training data and testing_feature_i from the testing data.

3.3) Uses of score value and proposed methodology

The input to the recognition system is a single word [fig: 1], which is segmented into a few parts (over-segmentation occurs). After segmentation [fig: 2], this approach treats the word as a collection of characters or pieces of characters.

Fig 1: Word

Fig 2: Segmented Word

At this point, a key question is how to use these segmented parts in the recognition system. We assume that at most three segmented parts form a character. Using all the segmented parts, we consider every possible way of grouping them to get a matching word [fig: 3].

Fig 3: Segmentation Graph [assuming at most 3 segmented parts form a character]

We extract the features of each trained character and of each segmented part in each possible grouping, on the basis of the angle and Euclidean distance of pen movement. Each segmented part of each possible grouping is passed to the character recognizer. These character hypotheses can represent part of a character, a full character, a few characters, or part of a character combined with part of another character. Each segmented part is compared with the reference patterns (trained characters) to determine their similarity and to decide which pattern or model best represents the character being recognized. The observation score for each part is normally obtained by using a nearest neighbor classifier. The hypothesis with the minimum score is taken as the recognized full character (this may be wrong).
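
The paths of the segmentation graph can be enumerated as in the sketch below. It is an illustration under assumptions, not the project's code: a grouping is represented as the list of how many consecutive segmented parts are merged into each character hypothesis, and the names Grouping, enumerateGroupings and allGroupings are hypothetical.

```cpp
#include <cassert>
#include <vector>

// One path through the segmentation graph: entry k is the number of
// consecutive segmented parts merged into the k-th character hypothesis.
using Grouping = std::vector<int>;

// Depth-first enumeration: take 1..maxPerChar parts for the next character,
// then recurse on the parts that remain.
void enumerateGroupings(int remaining, int maxPerChar,
                        Grouping& current, std::vector<Grouping>& out)
{
    if (remaining == 0) {
        out.push_back(current);
        return;
    }
    for (int take = 1; take <= maxPerChar && take <= remaining; ++take) {
        current.push_back(take);
        enumerateGroupings(remaining - take, maxPerChar, current, out);
        current.pop_back();
    }
}

// All ways to split `segmentCount` parts into characters of at most
// `maxPerChar` parts each (3 by default, as assumed in the text).
std::vector<Grouping> allGroupings(int segmentCount, int maxPerChar = 3)
{
    std::vector<Grouping> out;
    Grouping current;
    enumerateGroupings(segmentCount, maxPerChar, current, out);
    return out;
}
```

For a word cut into three parts this gives four groupings: {1,1,1}, {1,2}, {2,1} and {3}. Each group in each grouping would then be scored by the character recognizer.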

4. Trie data structure

The trie data structure is included in the curriculum of most computer science programs. The name comes from the middle section of the word "retrieval", and this origin hints at its usage: information retrieval systems. Program designers use this data structure to build systems that can look up a key in time proportional to the length of the key, independent of how many keys are stored.

4.1) An introduction to trie

In computer science, a trie, or prefix tree, is an ordered tree data structure that is used to store an associative array where the keys are usually strings. Unlike a binary search tree, no node in the tree stores the key associated with that node; instead, its position in the tree shows what key it is associated with. All the descendants of a node have a common prefix of the string associated with that node, and the root is associated with the empty string. Values are normally not associated with every node, only with leaves and some inner nodes that correspond to keys of interest.

A trie (from "retrieval") is a multi-way tree structure useful for storing strings over an alphabet. It has been used to store large dictionaries of, say, English words in spelling-checking programs and in natural-language "understanding" programs. Given the data: an, ant, all, allot, alloy, aloe, are, ate, be, the corresponding trie would be:

The idea is that all strings sharing a common stem or prefix hang off a common node. When the strings are words over {a...z}, a node has at most 27 children: one for each letter plus a terminator. We implement the trie as a binary tree instead of an n-ary tree to make the program more flexible: the left pointer of a node leads to the nodes for the next letter, and the right pointer leads to the next alternative for the same letter position. Given the data: an, ant, all, allot, alloy, aloe, are, ate, be, the corresponding binary trie would be:

Conversion of n-ary tree to binary tree
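
The binary representation can be sketched on plain character strings as follows. This is a simplified illustration, not the project's code. It assumes a boolean end-of-word flag in place of the marker node used in the implementation of Part 5, and the names Node, findOrAdd and BinaryTrie are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Node {
    char letter;
    bool endOfWord = false;
    std::unique_ptr<Node> firstChild;   // plays the role of the "left" pointer
    std::unique_ptr<Node> nextSibling;  // plays the role of the "right" pointer
    explicit Node(char c) : letter(c) {}
};

// Walks the sibling list that starts at `head`; returns the node holding `c`,
// appending a new node at the end of the list if none exists yet.
Node* findOrAdd(std::unique_ptr<Node>& head, char c)
{
    std::unique_ptr<Node>* slot = &head;
    while (*slot && (*slot)->letter != c) {
        slot = &(*slot)->nextSibling;
    }
    if (!*slot) {
        slot->reset(new Node(c));
    }
    return slot->get();
}

struct BinaryTrie {
    std::unique_ptr<Node> top;  // sibling list for the first letters

    void insert(const std::string& word)
    {
        if (word.empty()) return;
        Node* node = findOrAdd(top, word[0]);
        for (std::size_t i = 1; i < word.size(); ++i) {
            node = findOrAdd(node->firstChild, word[i]);
        }
        node->endOfWord = true;
    }

    bool contains(const std::string& word) const
    {
        if (word.empty()) return false;
        const std::unique_ptr<Node>* list = &top;
        const Node* node = nullptr;
        for (char c : word) {
            node = list->get();
            while (node && node->letter != c) {
                node = node->nextSibling.get();
            }
            if (!node) return false;
            list = &node->firstChild;
        }
        return node->endOfWord;
    }
};
```

Each node needs only two pointers regardless of the alphabet size, at the cost of a linear scan along the sibling list at every letter position.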

4.2) ADVANTAGES RELATIVE TO BINARY SEARCH TREE

The following are the main advantages of tries over binary search trees (BSTs):

o Looking up keys is faster. Looking up a key of length m takes worst case O(m) time. A BST performs O(log(n)) comparisons of keys, where n is the number of elements in the tree, because lookups depend on the depth of the tree, which is logarithmic in the number of keys if the tree is balanced. Since each key comparison may examine up to m characters, a BST takes O(m log n) time in the worst case. Moreover, in the worst case log(n) will approach m. Also, the simple operations tries use during lookup, such as array indexing using a character, are fast on real machines.
o Tries can require less space when they contain a large number of short strings, because the keys are not stored explicitly and nodes are shared between keys with common initial subsequences.
o Tries facilitate longest-prefix matching, helping to find the stored key that shares the longest possible prefix with a given string.

4.3) Applications

As replacement of other data structures

As mentioned, a trie has a number of advantages over binary search trees. A trie can also be used to replace a hash table, over which it has the following advantages:

o Looking up data in a trie is faster in the worst case, O(m) time, compared to an imperfect hash table. An imperfect hash table can have key collisions. A key collision is the hash function mapping of different keys to the same position in a hash table. The worst-case lookup speed in an imperfect hash table is O(N) time, but far more typically is O(1), with O(m) time spent evaluating the hash.
o There are no collisions of different keys in a trie.
o Buckets in a trie, which are analogous to hash table buckets that store key collisions, are only necessary if a single key is associated with more than one value.
o There is no need to provide a hash function or to change hash functions as more keys are added to a trie.
o A trie can provide an alphabetical ordering of the entries by key.

Tries do have some drawbacks as well:

o Tries can be slower in some cases than hash tables for looking up data, especially if the data is directly accessed on a hard disk drive or some other secondary storage device where the random access time is high compared to main memory.
o Not all keys are easy to represent as strings. For example, floating point numbers can have multiple string representations for the same value, e.g. 1, 1.0, 1.00, +1.0, etc.

Dictionary representation

A common application of a trie is storing a dictionary, such as one found on a mobile telephone. Such applications take advantage of a trie's ability to quickly search for, insert, and delete entries; however, if storing dictionary words is all that is required (i.e. storage of information auxiliary to each word is not required), a minimal acyclic deterministic finite automaton would use less space than a trie. Tries are also well suited for implementing approximate matching algorithms, including those used in spell checking software.

Sorting

Lexicographic sorting of a set of keys can be accomplished with a simple trie-based algorithm as follows:

o Insert all keys in a trie.
o Output all keys in the trie by means of pre-order traversal, which results in output that is in lexicographically increasing order. Pre-order traversal is a kind of depth-first traversal. In-order traversal is another kind of depth-first traversal that is more appropriate for outputting the values that are in a binary search tree rather than a trie.
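
The two sorting steps above can be sketched as follows. This is a minimal sketch that assumes the children of every node are visited in increasing character order, which is what makes the pre-order output lexicographic; a std::map is used for the children to guarantee that order. The names SortNode, insertKey, preOrder and trieSort are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

struct SortNode {
    bool isKey = false;
    // std::map iterates in increasing key order, so children are visited
    // from the smallest character to the largest.
    std::map<char, std::unique_ptr<SortNode>> children;
};

void insertKey(SortNode& root, const std::string& key)
{
    SortNode* node = &root;
    for (char c : key) {
        std::unique_ptr<SortNode>& child = node->children[c];
        if (!child) child.reset(new SortNode());
        node = child.get();
    }
    node->isKey = true;
}

// Pre-order: report the node itself first, then each subtree in order.
void preOrder(const SortNode& node, std::string& prefix,
              std::vector<std::string>& out)
{
    if (node.isKey) out.push_back(prefix);
    for (const auto& entry : node.children) {
        prefix.push_back(entry.first);
        preOrder(*entry.second, prefix, out);
        prefix.pop_back();
    }
}

std::vector<std::string> trieSort(const std::vector<std::string>& keys)
{
    SortNode root;
    for (const std::string& k : keys) insertKey(root, k);
    std::vector<std::string> out;
    std::string prefix;
    preOrder(root, prefix, out);
    return out;
}
```

Because a key is stored once in the trie, repeated keys appear only once in the sorted output of this sketch.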

Full text search

A special kind of trie, called a suffix tree, can be used to index all suffixes in a text in order to carry out fast full text searches.

4.4) Application of trie in our project

We assign a unique numeric code to each Bengali letter. For example, the Bengali word ( ) is represented as 20 and 47, where 20 stands for the letter ( ) and 47 stands for the letter ( ). We construct a binary search trie which contains 245 Bengali words.

5. Implementation

5.1) Our Implementation technique

From the training data we construct the binary trie. For a particular word, written by any writer, the computer determines all possible word combinations with respect to the score value. For example, for the word EBONG written by a particular writer, we find all possible word combinations as follows:

From these, we check the words one by one to see whether the word, ending with a marker, is in the trie. If we find one, we return the word.

5.2) Methodology

We have used a structure named trie with the variable data as an integer and two trie-type pointers named left and right. To accomplish this task we have divided it into subtasks. We have used the functions create() to create a trie-type node, insert() to insert a node in the existing tree, marker() to mark a word's existence in the tree and search() to search whether a given word is in the tree or not. We have also used the function createz() to create the tree.

The algorithm that we have used to define createz() is given below.

createz()
Begin
{
  Scan the contents of the file containing the data assumed from the stroke.
  Read the word and call the function insert() with the arguments word and the length of the word.
}
End

The algorithm of insert() is given below.

insert(word, word length)   // root is the root node of the tree.
Begin
{
  1. For every letter in word repeat steps 1.1 to 1.8
     1.1> Check whether the root is assigned to null or not.
     1.2> If so, create the root node.
     1.3> Else check whether it is the first letter of the word or not.
     1.4> If it is, search the right nodes of the root.
     1.5> If the letter matches any existing node's data, mark that node with ptr.   // ptr is of type trie
     1.6> Otherwise create a node with this data at the extreme right of the root node, and mark it with ptr.
     1.7> If the letter is not the first one, move ptr to the left position of ptr.
     1.8> Then follow steps 1.4 to 1.6.
  2. Call the function marker() with ptr as the argument to mark the extreme right of the stem.
}
End

The algorithm of search() is given below.

search(word, a)   // word is the word being searched for, a is the length of the word
Begin
{
  1. For every letter in word repeat steps 1.1 to 1.7
     1.1> Set flag to 0.
     1.2> Check whether it is the first letter of the word or not.
     1.3> If it is, search the right nodes of the root.
     1.4> If the letter matches any existing node's data, mark that node with ptr, set the flag to 1, and increment length by 1 and fscore by finalscore[letter no].
     1.5> If the letter is not the first one, move ptr to the left position of ptr.
     1.6> Search the right nodes of ptr.
     1.7> If the letter matches any existing node's data, mark that node with ptr, set the flag to 1, and increment length by 1 and fscore by finalscore[letter no].
  2. Return 1 if flag is 1, else return 0.
}
End

To check which word is correct, we have a main program. The above functions are put together in a header file, which we include in our main program. In the main program we first open the file that contains all possible letter combinations of many words written by many writers. We take each letter combination of a word and check whether it is present in the trie. If it is present, we return the word.

5.3) Implementation in VC++

trie.h:

#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <cstring>
using namespace std;

int search(int *, int);
void createz();

/* Node of the binary trie. */
typedef struct trie
{
    struct trie *left;   /* first node for the next letter position */
    struct trie *right;  /* next alternative for the same letter position */
    int data;            /* letter code, or 1000 for the end-of-word marker */
} tri;

tri *root = NULL;

/* Allocate a node that holds the letter code a. */
tri *create(int a)
{

    tri *ptr;
    ptr = (tri *)malloc(sizeof(tri));
    ptr->data = a;
    ptr->left = NULL;
    ptr->right = NULL;
    return ptr;
}

/* Match word[0..a-1] against the trie. Returns a if the whole word is
   stored (its last letter has an end-of-word marker below it), the number
   of leading letters matched if the match stops early, and 0 otherwise. */
int search(int *word, int a)
{
    int i, flag = 0;
    int j = 0;
    tri *temp = root;

    for (i = 0; i < a; i++)
    {
        flag = 0;
        if (i)
            temp = temp->left;          /* go down to the next letter position */
        while (temp != NULL)            /* scan the alternatives at this position */
        {
            if (temp->data == word[i])
            {
                flag = 1;
                j++;
                break;
            }
            temp = temp->right;
        }
        if (!flag)
            break;
    }
    if (j < a)
        return j;
    if (flag)
    {
        /* every letter matched: look for the marker below the last letter */
        temp = temp->left;
        while (temp != NULL)
        {
            if (temp->data == 1000 && temp->left == NULL)
                return a;
            temp = temp->right;
        }
    }
    return 0;   /* empty word, or a prefix that is not itself a stored word */
}

/* Append an end-of-word marker (data 1000) below the node *ptr. */
void marker(tri **ptr)
{
    tri *str = create(1000);
    tri *cur;

    if ((*ptr)->left == NULL)
    {
        (*ptr)->left = str;
        return;
    }
    cur = (*ptr)->left;
    while (cur->right != NULL)
        cur = cur->right;
    cur->right = str;
}

/* Insert the letter codes word[0..length-1] and mark the end of the word. */
void insert(int *word, int length)
{
    tri *temp, *nodeptr = NULL, *parentptr = NULL;
    int i;

    if (length <= 0)
        return;                         /* nothing to insert, nothing to mark */
    for (i = 0; i < length; i++)
    {
        if (root == NULL)
        {
            root = create(word[i]);
            nodeptr = root;
            continue;
        }
        /* first letter: scan the list that starts at root;
           later letters: scan the list below the previous letter */
        temp = (i == 0) ? root : nodeptr->left;
        if (temp == NULL)
        {
            nodeptr->left = create(word[i]);
            nodeptr = nodeptr->left;
            continue;
        }
        while (temp != NULL && temp->data != word[i])
        {
            parentptr = temp;
            temp = temp->right;
        }
        if (temp == NULL)               /* not found: append at the extreme right */
        {
            parentptr->right = create(word[i]);
            temp = parentptr->right;
        }
        nodeptr = temp;
    }
    marker(&nodeptr);
}

/* Build the trie from word_set.txt. Each record holds the number of
   letters, that many letter codes, and a text token that is read and ignored. */
void createz()
{
    FILE *fp;
    int *word;
    int a, n;
    char s[25];

    fp = fopen("word_set.txt", "r");
    if (fp == NULL)
    {
        perror("word_set.txt");
        return;
    }
    /* test the read itself; feof() only becomes true after a read has failed */
    while (fscanf(fp, "%d", &a) == 1 && a > 0)
    {
        word = (int *)malloc(a * sizeof(int));
        if (word == NULL)
            break;
        for (n = 0; n < a; n++)
            if (fscanf(fp, "%d", &word[n]) != 1)
                break;
        fscanf(fp, "%24s", s);          /* bounded read of the text token */
        if (n == a)
            insert(word, n);
        free(word);
        if (n < a)
            break;                      /* truncated record: stop reading */
    }
    fclose(fp);
}

main.cpp:

#include "trie.h"
#include <cctype>
#include <cstdio>
#include <iostream>

/* Print the best match found for the current input word, if there is one. */
static void printBest(const int *best, int n)
{
    if (n == 0)
        return;
    for (int h = 0; h < n; h++)
        std::cout << best[h] << " ";
    std::cout << std::endl;
}

int main()
{
    int word[50], word1[50];
    char line[100];
    int i = 0, max = 0, j;
    fpos_t pos;

    FILE *fp = fopen("sfinal1.txt", "r");
    if (fp == NULL)
    {
        perror("sfinal1.txt");
        return 1;
    }
    createz();

    for (;;)
    {
        fgetpos(fp, &pos);
        int c = fgetc(fp);
        if (c == EOF)
            break;
        if (c == '\t' || c == ' ')
            continue;
        if (isalpha(c))
        {
            /* a text line starts a new input word: print the previous result */
            printBest(word1, max);
            max = 0;
            fsetpos(fp, &pos);
            if (fscanf(fp, "%99[^\n]\n", line) == 1)
                std::cout << line << std::endl;
            continue;
        }
        if (c == '\n')
        {
            /* end of one letter combination: keep the longest match so far */
            j = search(word, i);
            if (max < j)
            {
                max = j;
                for (int s = 0; s < max; s++)
                    word1[s] = word[s];
            }
            i = 0;
        }
        else
        {
            /* start of a letter code: go back and read it as an integer */
            fsetpos(fp, &pos);
            if (i >= 50 || fscanf(fp, "%d", &word[i]) != 1)
                break;
            i++;
        }
    }
    printBest(word1, max);
    fclose(fp);
    return 0;
}

6. References

o www.wikipedia.com
o Online Handwritten Indian Script Recognition document by U. Garain, B. B. Chaudhuri, T. T. Pal.
