Modified Levenshtein distance algorithm for coding

Image Number 8 for United States Patent #7664343.

Methods and systems for mapping an optical character recognition (OCR) text string to a code included in a coding dictionary. The Levenshtein Distance Algorithm (LDA) is supplemented with additional information in the form of adjustments based on particular character substitutions, insertions and deletions, together with weighting based on multiple alternatives for the OCR text string. In one embodiment, an OCR text string mapping method (100) includes receiving (110) an OCR text string; comparing (120) it with selected text strings from a coding dictionary; computing (130) modified Levenshtein distances associated with the comparisons by determining (140) substitution penalties, determining (150) insertion penalties, determining (160) deletion penalties and combining (170) the penalties; selecting (180) the best matching text string from the coding dictionary based on the modified Levenshtein distances; and determining (190) whether a maximum threshold distance is met. When the threshold is met, a code associated with the best matching text string is assigned (200) to the OCR text string; when it is not met, a null or no code is assigned (210).
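
The steps of the embodiment above can be illustrated with a minimal Python sketch. It is not the patented implementation: the penalty values, the `CONFUSABLE` table, and the function names are hypothetical, chosen only to show how character-specific substitution, insertion and deletion penalties might be combined into a modified Levenshtein distance and compared against a maximum threshold. "Threshold met" is assumed here to mean that the distance is less than or equal to the threshold, and the weighting across multiple OCR alternatives is omitted.

```python
from typing import Dict, Optional, Tuple

# Hypothetical penalty table: visually similar OCR confusions cost less than 1.
CONFUSABLE: Dict[Tuple[str, str], float] = {
    ("0", "O"): 0.2,
    ("1", "l"): 0.2,
    ("1", "I"): 0.2,
    ("5", "S"): 0.3,
}


def sub_penalty(a: str, b: str) -> float:
    """Substitution penalty (step 140); 0 for identical characters."""
    if a == b:
        return 0.0
    return CONFUSABLE.get((a, b), CONFUSABLE.get((b, a), 1.0))


def ins_penalty(ch: str) -> float:
    """Insertion penalty (step 150); uniform here, could depend on ch."""
    return 1.0


def del_penalty(ch: str) -> float:
    """Deletion penalty (step 160); uniform here, could depend on ch."""
    return 1.0


def modified_levenshtein(ocr: str, entry: str) -> float:
    """Combine the penalties (step 170) with the usual dynamic program."""
    # prev[j] = cost of turning the empty prefix of ocr into entry[:j]
    prev = [0.0]
    for ch in entry:
        prev.append(prev[-1] + ins_penalty(ch))
    for a in ocr:
        cur = [prev[0] + del_penalty(a)]
        for j, b in enumerate(entry, start=1):
            cur.append(min(
                prev[j] + del_penalty(a),         # drop a from the OCR string
                cur[j - 1] + ins_penalty(b),      # insert b from the entry
                prev[j - 1] + sub_penalty(a, b),  # substitute a with b
            ))
        prev = cur
    return prev[-1]


def assign_code(ocr: str, dictionary: Dict[str, str],
                max_distance: float) -> Optional[str]:
    """Select the best match (180), test the threshold (190), assign (200/210)."""
    if not dictionary:
        return None
    best = min(dictionary, key=lambda e: modified_levenshtein(ocr, e))
    if modified_levenshtein(ocr, best) <= max_distance:
        return dictionary[best]
    return None  # null / no code


if __name__ == "__main__":
    codes = {"CODE": "A1", "NODE": "B2"}  # hypothetical coding dictionary
    print(assign_code("C0DE", codes, max_distance=0.5))
    print(assign_code("XYZ", codes, max_distance=0.5))
```

With an empty `CONFUSABLE` table and unit penalties, `modified_levenshtein` reduces to the ordinary Levenshtein distance, so the adjustments can be viewed as a refinement layered on the standard algorithm.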
