Modified Levenshtein distance algorithm for coding

Image Number 8 for United States Patent #7664343.

Methods and systems for mapping an optical character recognition (OCR) text string to a code in a coding dictionary. The Levenshtein Distance Algorithm (LDA) is supplemented with adjustments based on particular character substitutions, insertions and deletions, together with weighting based on multiple alternatives for the OCR text string. In one embodiment, an OCR text string mapping method (100) includes receiving (110) an OCR text string, comparing (120) it with selected text strings from a coding dictionary, and computing (130) the modified Levenshtein distances associated with the comparisons by determining (140) substitution penalties, determining (150) insertion penalties, determining (160) deletion penalties and combining (170) the penalties. The method then selects (180) the best matching text string from the coding dictionary based on the modified Levenshtein distances and determines (190) whether a maximum threshold distance is met. When the threshold is met, a code associated with the best matching text string is assigned (200) to the OCR text string; when it is not met, a null or no code is assigned (210).
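The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The penalty tables (`SUB_COST`, `INS_COST`, `DEL_COST`), their values, and the function names `modified_levenshtein` and `assign_code` are hypothetical, chosen only for illustration. The sketch assumes that the threshold is "met" when the best distance does not exceed `max_distance`, and it leaves out the weighting over multiple OCR alternatives.

```python
from typing import Dict, Optional, Tuple

# Hypothetical penalty tables (not from the patent): edits involving
# characters that OCR commonly confuses or drops cost less than the default 1.0.
SUB_COST: Dict[Tuple[str, str], float] = {
    ("0", "O"): 0.2, ("O", "0"): 0.2,
    ("1", "l"): 0.2, ("l", "1"): 0.2,
}
INS_COST: Dict[str, float] = {".": 0.5, " ": 0.5}
DEL_COST: Dict[str, float] = {".": 0.5, " ": 0.5}


def modified_levenshtein(ocr: str, entry: str) -> float:
    """Weighted edit distance for turning the OCR string into a dictionary entry."""
    rows, cols = len(ocr) + 1, len(entry) + 1
    # dist[i][j] is the cheapest cost of turning ocr[:i] into entry[:j].
    dist = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = dist[i - 1][0] + DEL_COST.get(ocr[i - 1], 1.0)
    for j in range(1, cols):
        dist[0][j] = dist[0][j - 1] + INS_COST.get(entry[j - 1], 1.0)
    for i in range(1, rows):
        for j in range(1, cols):
            a, b = ocr[i - 1], entry[j - 1]
            sub = 0.0 if a == b else SUB_COST.get((a, b), 1.0)
            # Combine the three penalties by taking the cheapest edit path.
            dist[i][j] = min(
                dist[i - 1][j - 1] + sub,               # substitution or match
                dist[i - 1][j] + DEL_COST.get(a, 1.0),  # delete a from the OCR string
                dist[i][j - 1] + INS_COST.get(b, 1.0),  # insert b into the OCR string
            )
    return dist[-1][-1]


def assign_code(ocr: str, dictionary: Dict[str, str],
                max_distance: float) -> Optional[str]:
    """Return the code of the closest entry, or None if it is too far away."""
    if not dictionary:
        return None
    best = min(dictionary, key=lambda text: modified_levenshtein(ocr, text))
    if modified_levenshtein(ocr, best) <= max_distance:
        return dictionary[best]
    return None
```

For example, with the hypothetical dictionary `{"CODE": "A1", "NODE": "B2"}` and `max_distance=0.5`, the OCR string `"C0DE"` maps to `"A1"`, because the only edit is the cheap `0` to `O` substitution. The string `"XXXX"` is too far from every entry and receives no code.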
