  • 7/25/2019 ing Various Techniques

    1/11

    A SURVEY ON HANDWRITTEN RECOGNITION USING VARIOUS TECHNIQUES

    ABSTRACT

    This paper presents an analysis of Handwritten Character Recognition (HCR) and its techniques. HCR has been applied in a variety of settings, such as the banking sector, the health care industry and other organizations that deal with handwritten documents. Handwritten character recognition is the process of converting handwritten text into machine-readable form. Handwritten characters are difficult to recognize because they differ from one writer to another; even when the same person writes the same character, its shape, size and position vary. Recent research in this area has used different methods, classifiers and features to reduce the complexity of recognizing handwritten text.

    INTRODUCTION

    Handwritten Character Recognition is the process of transforming handwritten text into a machine-readable format. There are three main steps in pattern recognition: observation, pattern segmentation and pattern classification. Character recognition has become a very active topic in pattern recognition research during the last few decades. In general, handwriting recognition is classified into two types: online and offline. Offline handwriting recognition involves the automatic conversion of text in an image into letter codes that are usable within computer and text-processing applications. The data obtained in this form is regarded as a static representation of handwriting. In an online system, by contrast, the two-dimensional coordinates of successive points are represented as a function of time, and the order of the strokes made by the writer is also available. Offline character recognition is comparatively more challenging because of the shape of the characters, the great variation in character symbols, differences in handwriting style and document quality.
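
The difference between the two representations can be illustrated with a minimal sketch. The `Stroke` type, the `rasterize` function and the sample coordinates are hypothetical names and values introduced here for illustration; they are not taken from any of the surveyed systems. Online data keeps time-stamped points grouped by stroke, while the offline image keeps only which pixels carry ink.

```python
from typing import List, Tuple

# Online data (assumed layout): each stroke is a list of (x, y, t) samples.
Stroke = List[Tuple[int, int, float]]


def rasterize(strokes: List[Stroke], width: int, height: int) -> List[List[int]]:
    """Turn online strokes into a static binary image (1 = ink).

    Time stamps and stroke order are discarded, which is exactly the
    information an offline recognizer does not have.
    """
    image = [[0] * width for _ in range(height)]
    for stroke in strokes:
        for x, y, _t in stroke:
            if 0 <= x < width and 0 <= y < height:
                image[y][x] = 1
    return image


# Two strokes written one after the other.
strokes = [[(0, 0, 0.0), (1, 1, 0.1)], [(2, 0, 0.2)]]
image = rasterize(strokes, width=3, height=2)
```

Only the sampled points are marked in this sketch; a fuller version would also draw the line segments between consecutive samples.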

    Several applications, including mail sorting, bank processing, document reading and postal address recognition, require offline handwriting recognition systems. As a result, offline handwriting recognition continues to be an active area of research that explores newer techniques to improve recognition accuracy.

    BENEFITS AND ADVANTAGES

    Handwriting recognition plays an important role in the storage and retrieval of crucial handwritten information.

    DATA STORAGE

    Many contracts, files and personal records contain both typed and handwritten information. Storing such information requires physical space, because the original signatures and notes cannot be stored electronically as text. Handwriting recognition software allows users to translate those signatures and notes into electronic words. Electronic storage of this data requires far less physical space than the storage of physical copies. It also requires fewer on-hand employees to sort through, organize and maintain the data storage warehouse.

    DATA RETRIEVAL

    Physical data retrieval requires personnel to sort through physical copies of old information. The data must have been stored and organized correctly, and the physical copies must have been properly maintained. Electronic data retrieval is performed with a file search on specific keywords, such as names or dates. Handwriting recognition software allows old files to be saved in a proper electronic format. Medical records become more readily available to physicians who rely on old data to diagnose illness and to know a patient's history. Old contracts can be reviewed and updated without risking a loss of vital information through misfiling or physical data corruption.

    KEYWORDS

    Handwritten character, Preprocessing, Segmentation, Feature extraction, Classification

    ARCHITECTURE OF A GENERAL HANDWRITTEN RECOGNITION SYSTEM:

    The major steps involved in the recognition of characters are preprocessing, segmentation, feature extraction and classification.

    Fig 1: Stages of Character Recognition (architecture of a character recognition system): Preprocessing -> Segmentation -> Feature Extraction -> Classification -> Post Processing
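
The staged architecture can be sketched as a chain of functions, each consuming the output of the previous one. Everything in the block below is a hypothetical stand-in: the `run_pipeline` helper, the stage bodies and the toy decision rule are illustrative assumptions, not the method of any surveyed system.

```python
from typing import Any, Callable, List, Tuple

Stage = Tuple[str, Callable[[Any], Any]]


def run_pipeline(data: Any, stages: List[Stage], trace: List[str]) -> Any:
    """Pass data through each stage in order, recording the stage names."""
    for name, stage in stages:
        data = stage(data)
        trace.append(name)
    return data


# Toy placeholders: a real system would replace each lambda with a full stage.
stages: List[Stage] = [
    # Binarize: dark pixels (< 128) become ink.
    ("preprocessing", lambda img: [[1 if v < 128 else 0 for v in row] for row in img]),
    # Pretend the whole image holds exactly one character.
    ("segmentation", lambda img: [img]),
    # One feature per character: its number of ink pixels.
    ("feature extraction", lambda chars: [sum(map(sum, ch)) for ch in chars]),
    # Toy rule, an assumption for illustration only.
    ("classification", lambda feats: ["I" if f <= 3 else "O" for f in feats]),
    # Reassemble the recognized characters into a string.
    ("post-processing", lambda labels: "".join(labels)),
]

trace: List[str] = []
result = run_pipeline([[255, 0, 255], [255, 0, 255], [255, 0, 255]], stages, trace)
```

Keeping every stage behind the same call signature makes it possible to swap one technique for another without touching the rest of the chain.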

    PRE-PROCESSING

    Preprocessing can be defined as cleaning the document image and making it appropriate for input to the OCR engine. The major steps under preprocessing are:

    Noise removal
    Skew detection/correction
    Binarization

    The noise introduced into the input by optical scanning devices leads to poor system performance. These imperfections must be removed prior to character recognition. Noise can be introduced into an image during image acquisition and transmission. It can be of different types, such as Gaussian noise, Gamma noise, Rayleigh noise, Exponential noise, Uniform noise, Salt and pepper noise and Periodic noise. Noise can be removed using Ideal filters, Butterworth filters and Gaussian filters.

    The image may be rotated during scanning. Skew detection and correction is used to align the paper document with the coordinate system of the scanner. Skew detection techniques include projection profiles, connected components, the Hough transform and clustering.

    In binarization, a color or greyscale image is converted into a binary image with the help of thresholding. A binary image can be obtained using adaptive thresholding, global thresholding, variable thresholding, Otsu's method etc.
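
As an illustration of global thresholding, the following is a minimal sketch of Otsu's method in plain Python. It assumes an 8-bit greyscale image stored as a list of rows and dark ink on light paper; the function names `otsu_threshold` and `binarize` are introduced here for illustration.

```python
def otsu_threshold(gray):
    """Return the grey level that maximizes the between-class variance."""
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels with level <= t
    sum0 = 0    # sum of their grey levels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(gray, t):
    """Assumption: ink is dark, so levels <= t become 1 (ink), others 0."""
    return [[1 if value <= t else 0 for value in row] for row in gray]


gray = [[10, 12, 200], [11, 210, 205]]
t = otsu_threshold(gray)
binary = binarize(gray, t)
```

A single global threshold works when illumination is even; adaptive thresholding instead computes a threshold per neighbourhood.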

    Morphological operations are also used in preprocessing. Dilation and erosion are the morphological operations that increase or decrease the size of objects in the image. Erosion makes an object smaller by eroding away the pixels from its edges: every object pixel that touches a background pixel is changed into a background pixel. Dilation, in contrast, makes an object larger by adding pixels around its edges: every background pixel that touches an object pixel is changed into an object pixel. Other morphological operations are opening and closing.
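
The two rules above can be written down directly. The sketch below assumes a binary image (1 = object, 0 = background), takes "touching" to mean the four horizontal and vertical neighbours, and ignores positions outside the image; these choices, like the function names, are assumptions made for illustration.

```python
NEIGHBORS_4 = ((-1, 0), (1, 0), (0, -1), (0, 1))


def touches(image, r, c, value):
    """True if any 4-neighbour of (r, c) that lies inside the image equals value."""
    rows, cols = len(image), len(image[0])
    return any(
        0 <= r + dr < rows and 0 <= c + dc < cols and image[r + dr][c + dc] == value
        for dr, dc in NEIGHBORS_4
    )


def erode(image):
    """Object pixels that touch the background become background."""
    return [
        [1 if px == 1 and not touches(image, r, c, 0) else 0 for c, px in enumerate(row)]
        for r, row in enumerate(image)
    ]


def dilate(image):
    """Background pixels that touch an object pixel become object pixels."""
    return [
        [1 if px == 1 or touches(image, r, c, 1) else 0 for c, px in enumerate(row)]
        for r, row in enumerate(image)
    ]


def opening(image):
    """Erosion followed by dilation."""
    return dilate(erode(image))


def closing(image):
    """Dilation followed by erosion."""
    return erode(dilate(image))


block = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
eroded = erode(block)      # only the centre pixel survives
opened = opening(block)    # the centre pixel grows back into a plus shape
```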

    SEGMENTATION

    Segmentation is needed because handwritten characters frequently interfere with one another. Common ways in which characters interfere include overlapping, touching, connected and intersecting pairs. Text/graphics segmentation is required to separate text from graphs, images and lines; its output should be an image consisting of text only. Character segmentation then separates each character from the others. It is one of the main steps, especially in cursive scripts, where characters are connected together. The isolated characters obtained from character segmentation are normalized to a specific size for better accuracy. Features are extracted from characters of the same size in order to provide data uniformity.

    Christopher E. Dunn and P. S. P. Wang used a series of region finding, grouping and splitting algorithms. Region finding identifies all the disjoint regions. The pixels are originally labeled On/Off, where "on" signifies the data areas. The image is examined pixel by pixel until an "on" value is found. Once found, it is labeled with a new region number and its neighbors are searched for additional "on" values. The search proceeds until no further "on" value is found. As a result, all disjoint regions are identified and all pixels in any one region are labeled with a unique number. Grouping deals with characters that have separate parts or that are broken. For each region, the smallest bounding box that completely encloses it is calculated. If, for any two regions, the bounding box of one region completely encloses the other region, then the enclosed region is relabeled to the value of the enclosing region. The resulting region is thus composed of two disjoint subregions. This is helpful for connecting regions that have been separated by noise. Splitting deals with touching characters.

    Anshul Mehta used a heuristic segmentation algorithm that scans handwritten words to identify the valid segmentation points between characters. The segmentation is based on locating the arcs between letters, which are common in handwritten cursive script. For this, a histogram of vertical pixel density is examined, which may indicate the location of possible segmentation points in the word. Other character segmentation approaches include the thinning-based method, the contour fitting method, the robust statistical technique, hypothesis verification and the shape feature vector method.
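
The region finding and grouping steps described above can be sketched as follows. This is an illustrative reading of the description, not Dunn and Wang's implementation: the function names are hypothetical, neighbours are assumed to be the eight surrounding pixels, and enclosure is tested by comparing bounding boxes.

```python
from collections import deque


def label_regions(image):
    """Give every disjoint region of 'on' (1) pixels its own number, starting at 1."""
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or labels[r][c] != 0:
                continue
            count += 1                      # new region found
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:                    # search neighbours for more 'on' pixels
                cr, cc = queue.popleft()
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = cr + dr, cc + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and image[nr][nc] == 1 and labels[nr][nc] == 0):
                            labels[nr][nc] = count
                            queue.append((nr, nc))
    return labels, count


def bounding_boxes(labels):
    """Return {label: (top, left, bottom, right)} for every region."""
    boxes = {}
    for r, row in enumerate(labels):
        for c, lab in enumerate(row):
            if lab == 0:
                continue
            top, left, bottom, right = boxes.get(lab, (r, c, r, c))
            boxes[lab] = (min(top, r), min(left, c), max(bottom, r), max(right, c))
    return boxes


def group_enclosed(labels):
    """Relabel each region that lies inside another region's bounding box."""
    boxes = bounding_boxes(labels)

    def area(lab):
        top, left, bottom, right = boxes[lab]
        return (bottom - top + 1) * (right - left + 1)

    order = sorted(boxes, key=area, reverse=True)   # largest boxes first
    merged = {}
    for i, inner in enumerate(order):
        merged[inner] = inner
        it, il, ib, ir = boxes[inner]
        for outer in order[:i]:
            ot, ol, ob, orr = boxes[outer]
            if ot <= it and ol <= il and ob >= ib and orr >= ir:
                merged[inner] = merged[outer]
                break
    return [[merged.get(lab, 0) for lab in row] for row in labels]


image = [
    [1, 1, 1, 1, 1, 0, 1],
    [1, 0, 0, 0, 1, 0, 0],
    [1, 0, 1, 0, 1, 0, 0],
    [1, 0, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1, 0, 0],
]
labels, count = label_regions(image)
grouped = group_enclosed(labels)
```

In the sample image the ring, the isolated pixel at the top right and the dot inside the ring are first labeled as three regions; grouping then relabels the inner dot with the ring's number and leaves the outside pixel alone.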

    FEATURE EXTRACTION

    Feature extraction finds the set of parameters that define the shape of a character precisely and uniquely. Feature extraction methods are classified into three major groups:

    Statistical features
    Global transformation and series expansion
    Geometric and topological features

    Statistical features represent the image as a statistical distribution of points. Methods that use statistical features include Zoning, Crossings and Distances, and Projections. The techniques used in global transformation and series expansion include the Fourier transform, Gabor transform, Fourier Descriptor, wavelets, moments and the Karhunen-Loeve expansion. Geometric and topological features use structural features such as loops, curves, lines, T-points, crosses, openings to the right and openings to the left. The various categories are coding (Freeman chain code), extracting and counting topological structures, and graphs and trees. Geometric features are used along with fuzzy logic to recognize characters. Adnan Amin and Puttipong Mahasukhon used structural information to extract features from a character, such as Breakpoints, Inflection Point, Cusp Point, Straight Line, Curve, and Open or Close Loop. A breakpoint divides a path into sub-paths. It has two possible conditions: Inflection Point (change in curvature) and Cusp Point (sharp change in direction). A straight line has two points in sequence in a path. An open curve is as in the letter "S"

    diagonal projection is computed simply by grouping pixels by the two diagonal lines. The values of each projection are normalized to the range [0, 1] by dividing by the maximum value. The normalized features are concatenated into a single vector containing 128 features. In Multi Zoning, an M x N character image is divided into several subimages, and the percentage of black pixels in each subimage is used as a feature. It is a statistical approach, because the features are calculated from the number of pixels used to represent an image. Other feature extraction algorithms in use are Concavities Measurement, MAT-based Gradient Directional features, Gradient Directional features, Median Gradient features and Camastra 3D features.
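
Projection and zoning features of the kind described above can be sketched as follows, assuming a binary image with 1 for black pixels. The function names are hypothetical, the zoning function assumes the image dimensions are exact multiples of the zone counts, and the vector length depends on the image size, so the sketch does not reproduce the 128-feature vector mentioned above.

```python
def projection_features(image):
    """Horizontal, vertical and two diagonal projections, each scaled to [0, 1]."""
    rows, cols = len(image), len(image[0])
    horizontal = [sum(row) for row in image]
    vertical = [sum(image[r][c] for r in range(rows)) for c in range(cols)]
    diag_main = [0] * (rows + cols - 1)   # pixels grouped by c - r
    diag_anti = [0] * (rows + cols - 1)   # pixels grouped by r + c
    for r in range(rows):
        for c in range(cols):
            if image[r][c]:
                diag_main[c - r + rows - 1] += 1
                diag_anti[r + c] += 1

    features = []
    for profile in (horizontal, vertical, diag_main, diag_anti):
        peak = max(profile)
        # Normalize by the maximum value; an empty profile stays all zeros.
        features.extend(v / peak if peak else 0.0 for v in profile)
    return features


def zoning_features(image, zone_rows, zone_cols):
    """Percentage of black pixels in each zone, listed row by row."""
    zone_h = len(image) // zone_rows
    zone_w = len(image[0]) // zone_cols
    features = []
    for zr in range(zone_rows):
        for zc in range(zone_cols):
            black = sum(
                image[r][c]
                for r in range(zr * zone_h, (zr + 1) * zone_h)
                for c in range(zc * zone_w, (zc + 1) * zone_w)
            )
            features.append(100.0 * black / (zone_h * zone_w))
    return features


image = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
proj = projection_features(image)
zones = zoning_features(image, 2, 2)
```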

    CLASSIFICATION

    Classification is the process of identifying each character and assigning it to the correct character class. The classification techniques can be categorized as:

    Classical techniques
    Soft computing techniques

    The classical techniques include template matching, statistical techniques and structural techniques, whereas the soft computing techniques include neural networks, fuzzy logic and evolutionary computing techniques. Adnan Amin and W. H. Wilson used a neural network with three layers (input, hidden and output) to classify characters. The extracted geometric features, such as dots, lines, curves or loops, are given as input to the input layer. Each component of the segmented representation is classified as a dot, line, curve or loop. In each case, the characteristics of the component are determined: for a line, these are its orientation and its size relative to the character frame (short, medium or long). One input neuron encodes each of these possible sizes (short/medium/long) and each of the four possible orientations of a line. In this way, one input neuron encodes each characteristic of each component extracted by the geometric feature extraction technique. A neuron has two modes of operation: training mode and testing mode.

    In the training mode, the neuron can be trained to fire (or not) for particular input patterns. In the testing mode, when a taught input pattern is detected at the input, its associated output becomes the current output. If the input pattern does not belong to the taught list of input patterns, the firing rule is used to determine whether to fire or not.
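
The text does not say which firing rule is meant. One common choice, assumed here purely for illustration, compares an unseen binary pattern with the taught patterns by Hamming distance: the neuron fires if the nearest taught pattern is one that fires, does not fire if the nearest is one that does not, and stays undecided on a tie. The function names and sample patterns are hypothetical.

```python
def hamming(a, b):
    """Number of positions in which two equal-length patterns differ."""
    return sum(x != y for x, y in zip(a, b))


def fire(pattern, taught_fire, taught_no_fire):
    """Return 1 (fire), 0 (do not fire) or None (undecided)."""
    # Testing mode: a taught pattern reproduces its associated output.
    if pattern in taught_fire:
        return 1
    if pattern in taught_no_fire:
        return 0
    # Firing rule (assumed): follow the nearest taught pattern.
    d_fire = min(hamming(pattern, p) for p in taught_fire)
    d_no_fire = min(hamming(pattern, p) for p in taught_no_fire)
    if d_fire < d_no_fire:
        return 1
    if d_no_fire < d_fire:
        return 0
    return None


# Training mode: the neuron is taught these input patterns.
taught_fire = [(1, 1, 1), (1, 0, 1)]
taught_no_fire = [(0, 0, 0), (0, 0, 1)]

outputs = {p: fire(p, taught_fire, taught_no_fire)
           for p in [(1, 1, 1), (0, 1, 0), (1, 1, 0), (0, 1, 1)]}
```

The rule lets the neuron generalize from the taught patterns to similar ones instead of responding only to exact matches.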

    Anshul Mehta and Manisha Srivastava used three networks for the recognition of 26 lowercase and 26 uppercase letters: Multilayer Perceptron (MLP), Radial Basis Function (RBF) and Support Vector Machine (SVM). A multilayer perceptron is a feed-forward neural network with one or more layers between the input and output layers. Radial basis function (RBF) networks typically have three layers: an input layer, a hidden layer with a nonlinear RBF activation function and a linear output layer.
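
The three-layer RBF structure can be made concrete with a forward pass. The sketch assumes Gaussian basis functions, which is a common choice that the text does not specify; the centers, widths, weights and biases are hypothetical values that a real network would learn from training data.

```python
import math


def rbf_forward(x, centers, widths, weights, biases):
    """Forward pass of an RBF network: input -> Gaussian hidden layer -> linear output."""
    # Hidden layer: one Gaussian unit per center, responding to distance from x.
    hidden = [
        math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, center)) / (2 * width ** 2))
        for center, width in zip(centers, widths)
    ]
    # Output layer: a plain weighted sum of the hidden activations plus a bias.
    return [
        sum(w * h for w, h in zip(row, hidden)) + bias
        for row, bias in zip(weights, biases)
    ]


centers = [(0.0, 0.0), (1.0, 1.0)]   # hypothetical prototypes, one per class
widths = [1.0, 1.0]
weights = [[1.0, 0.0], [0.0, 1.0]]   # output k listens to hidden unit k
biases = [0.0, 0.0]

scores = rbf_forward((0.0, 0.0), centers, widths, weights, biases)
predicted = scores.index(max(scores))   # class with the largest output
```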

    POST-PROCESSING

    Post-processing mainly consists of two tasks: output string generation and error detection/correction. Output string generation reassembles the strings that were separated during segmentation, whereas error detection/correction corrects errors with the help of a dictionary.
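
Both tasks can be sketched in a few lines. The correction strategy below, which replaces a recognized word with the dictionary word at the smallest edit (Levenshtein) distance, is one simple possibility assumed here for illustration; the function names and the tiny dictionary are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,                 # delete ca
                cur[j - 1] + 1,              # insert cb
                prev[j - 1] + (ca != cb),    # substitute, or match at no cost
            ))
        prev = cur
    return prev[-1]


def correct(word, dictionary):
    """Return the dictionary word closest to the recognized word."""
    return min(dictionary, key=lambda entry: edit_distance(word, entry))


# Output string generation: reassemble the per-character results.
recognized_chars = ["r", "e", "c", "o", "g", "m", "t", "i", "o", "n"]
word = "".join(recognized_chars)

# Error correction against a (hypothetical) dictionary.
dictionary = ["recognition", "segmentation", "classification"]
fixed = correct(word, dictionary)
```

Here the classifier has confused "ni" with "m", a typical shape confusion, and the dictionary lookup recovers the intended word.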

    ANALYSIS OF HANDWRITTEN ALGORITHMS

    Researcher | Dataset | Pre-Processing | Segmentation | Feature Extraction | Classification & Recognition | Accuracy
    A. George et al. [3] | - | - | Horizontal histogram profile, vertical histogram profile | Contourlet transform | Feed forward back propagation neural network algorithm | 97.3%
    K. Singh et al. [4] | 7000 samples | Median filtration, dilation, some morphological operations | - | Zoning Density (ZD) and Background Directional Distribution (BDD) features | SVM (support vector machines) classifier | 95.0%
    A. Desai [5] | Samples from 300 people | Adaptive histogram equalization algorithm, median filter and nearest neighborhood interpolation algorithm, skew correction | - | Feature vector of four different profiles: horizontal, vertical and two diagonals | Feed forward back propagation neural network | 81.66%
    Md. Saidur et al. [6] | 1600 numerals | Canny method, using thinning and dilation algorithm | - | Four directional local feature vector by Kirsch mask and one global feature vector | PCA and SVM | 92.5%

    CONCLUSION

    This paper has reviewed the major approaches used in the field of handwritten character recognition during the last decade, together with different preprocessing, segmentation, feature extraction and classification techniques. Although various methods for recognizing handwritten English letters have been developed in the last two decades, a lot of research is still needed before a viable software solution can be made available. Existing OCR systems for handwritten text still have low accuracy. An efficient solution to this problem is needed so that overall performance can be increased.

    REFERENCES

    [1] Rafael M. O. Cruz, George D. C. Cavalcanti and Tsang Ing Ren, "An Ensemble Classifier For Offline Cursive Character Recognition Using Multiple Feature Extraction Techniques", IEEE 2010.
    [2] Anshul Mehta, Manisha Srivastava, Chitralekha Mahanta, "Offline handwritten character recognition using neural network", IEEE 2011 International conference on computer applications and Industrial Electronics.
    [3] Øivind Due Trier, Anil K Jain and Torfinn Taxt, "Feature extraction methods for character recognition - a survey", 1996.
    [4] Jayashree R Prashad, Dr. U V Kulkarni, "Trends in handwriting recognition", IEEE 2010, Third international conference on emerging trends in engineering and technology.
    [5] Christopher E. Dunn and P. S. P. Wang, "Character Segmentation techniques for handwritten text - A Survey", IEEE 1992.
    [6] Adnan Amin and W. H. Wilson, "Handprinted Character Recognition System Using Artificial Neural Networks", IEEE 1993.
    [7] Puttipong Mahasukhon, Hossein Mousavinezhad, Jeong-Young Song, "Handprinted English Character Recognition based on Fuzzy Theory"
    [8] Yuk Ying Chung, Man To Wong, "Handwritten character recognition by Fourier descriptors and neural network", IEEE 1997, Speech and Image Technologies for computing and telecommunication.
    [9] Shabana Mehfuz, Gauri Katiyar, "Intelligent Systems for Off-Line Handwritten Character Recognition: A Review"