Modified Levenshtein distance algorithm for coding

United States Patent #7664343

Methods and systems are described for mapping an optical character recognition (OCR) text string to a code included in a coding dictionary. They supplement the Levenshtein Distance Algorithm (LDA) with additional information in the form of adjustments based on particular character substitutions, insertions and deletions, together with weighting based on multiple alternatives for the OCR text string. In one embodiment, an OCR text string mapping method (100) includes: receiving (110) an OCR text string; comparing (120) it with selected text strings from a coding dictionary; computing (130) modified Levenshtein distances associated with the comparisons by determining (140) substitution penalties, determining (150) insertion penalties, determining (160) deletion penalties and combining (170) the penalties; selecting (180) the best matching text string from the coding dictionary based on the modified Levenshtein distances; and determining (190) whether a maximum threshold distance is met. When the threshold is met, the method assigns (200) the code associated with the best matching text string to the OCR text string; when it is not met, the method assigns (210) a null or no code.
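The steps of method (100) can be illustrated with a minimal Python sketch. It is not the patented implementation: the penalty tables, the function names (`sub_cost`, `ins_cost`, `del_cost`, `modified_levenshtein`, `assign_code`), the dictionary codes, and the reading of "threshold is met" as "distance at or below the maximum" are all assumptions made for illustration. The weighting based on multiple OCR alternatives is not modeled here.

```python
from typing import Dict, Optional, Tuple

# Hypothetical reduced penalties for character pairs that OCR often confuses.
CONFUSABLE: Dict[Tuple[str, str], float] = {
    ("0", "O"): 0.2,
    ("1", "l"): 0.2,
    ("5", "S"): 0.3,
}
# Hypothetical set of characters that are cheap to insert or delete.
CHEAP_CHARS = " .,-"


def sub_cost(a: str, b: str) -> float:
    """Substitution penalty (step 140): 0 for a match, reduced for confusable pairs."""
    if a == b:
        return 0.0
    return CONFUSABLE.get((a, b), CONFUSABLE.get((b, a), 1.0))


def ins_cost(c: str) -> float:
    """Insertion penalty (step 150); values are assumptions."""
    return 0.5 if c in CHEAP_CHARS else 1.0


def del_cost(c: str) -> float:
    """Deletion penalty (step 160); values are assumptions."""
    return 0.5 if c in CHEAP_CHARS else 1.0


def modified_levenshtein(ocr: str, entry: str) -> float:
    """Combine the penalties (step 170) with the usual dynamic program."""
    # prev[j] is the cost of turning the empty prefix of ocr into entry[:j].
    prev = [0.0]
    for ch in entry:
        prev.append(prev[-1] + ins_cost(ch))
    for a in ocr:
        cur = [prev[0] + del_cost(a)]
        for j, b in enumerate(entry, start=1):
            cur.append(min(
                prev[j] + del_cost(a),         # delete a from the OCR string
                cur[j - 1] + ins_cost(b),      # insert b into the OCR string
                prev[j - 1] + sub_cost(a, b),  # substitute a with b (or match)
            ))
        prev = cur
    return prev[-1]


def assign_code(ocr: str, dictionary: Dict[str, str],
                max_distance: float) -> Optional[str]:
    """Select the best match (step 180) and apply the threshold (steps 190-210)."""
    if not dictionary:
        return None
    best = min(dictionary, key=lambda e: modified_levenshtein(ocr, e))
    if modified_levenshtein(ocr, best) <= max_distance:
        return dictionary[best]  # step 200: assign the associated code
    return None                  # step 210: null / no code


# Usage with a made-up two-entry coding dictionary.
codes = {"HYPERTENSION": "CODE-A", "HYPOTENSION": "CODE-B"}
result = assign_code("HYPERTENSI0N", codes, max_distance=1.0)
```

In this sketch, an OCR string that differs from a dictionary entry only by a commonly confused character (such as `0` for `O`) stays well under the threshold, while an unrelated string yields no code.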
