Soft segmentation method for handwritten numerical strings in a financial OCR system

A technology for handwritten digit strings, applied in character and pattern recognition, instruments, computer components, etc. It addresses three problems: features that do not fully reflect the nature of characters, unstable character descriptions, and the degraded performance of single classifiers caused by their differing sensitivity and stability.

Inactive Publication Date: 2012-07-11
STATE GRID ELECTRIC POWER RES INST

AI Technical Summary

Problems solved by technology

Combining multiple classifiers is an effective way to design a high-performance, stable handwritten digit recognizer. To a certain extent, it overcomes three causes of poor classifier performance: a single feature does not fully reflect the nature of characters; noise and other factors make the feature description of characters unstable; and different types of classifiers differ in their sensitivity and stability to feature changes, which degrades the performance of any single classifier.
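
The idea of combining classifiers can be illustrated with a minimal sketch. The source does not state which combination rule is used, so the averaging ("sum rule") below is an assumption, and `combine_scores` is a hypothetical name introduced only for illustration.

```python
def combine_scores(score_lists):
    """Average per-class scores from several classifiers (sum rule).

    score_lists: one list of per-class scores per classifier, all the
    same length. The averaging rule is an assumption, not taken from
    the patent text.
    Returns (index of the best class, list of averaged scores).
    """
    if not score_lists:
        raise ValueError("need at least one classifier output")
    n_classes = len(score_lists[0])
    if any(len(s) != n_classes for s in score_lists):
        raise ValueError("all classifiers must score the same classes")
    n = len(score_lists)
    avg = [sum(s[c] for s in score_lists) / n for c in range(n_classes)]
    best = max(range(n_classes), key=lambda c: avg[c])
    return best, avg


# Three hypothetical classifiers scoring three classes; the second one
# disagrees with the other two, but the averaged decision is class 0.
label, scores = combine_scores([
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.5, 0.4, 0.1],
])
```

Averaging dampens the effect of one classifier that is sensitive to a particular feature change, which is the stability argument made above.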

Examples

Embodiment Construction

[0014] (1) Image preprocessing

[0015] Noise is inevitable during image acquisition and tends to leave isolated small blobs, irregular jagged edges, and noise along character edges, so character images binarized with the Otsu method cannot be used directly. To deal with burrs, depressions, and isolated noise points on character strokes, the image is first filtered based on the average stroke width of the character image, taking care not to smooth away thin strokes. The binary character image is then scanned line by line, and the central pixel value is modified according to the local image structure within a window of a given size. This removes burrs on the strokes, fills depressions or inner holes in the strokes, and suppresses or eliminates the influence of noise on character segmentation.
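
For readers unfamiliar with the Otsu binarization mentioned in [0015], the following is a minimal pure-Python sketch of the standard method. It assumes 8-bit grayscale pixels given as a flat list and dark ink on a light background; the function names are illustrative and not taken from the patent.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    Pixels <= t form one class and pixels > t the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    w0 = 0          # number of pixels in the low class
    sum0 = 0        # sum of gray levels in the low class
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w0 += hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        sum0 += t * hist[t]
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        between = w0 * w1 * (m0 - m1) ** 2  # proportional to the variance
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def binarize(pixels, threshold):
    """Map dark pixels (<= threshold) to 1 (ink) and the rest to 0."""
    return [1 if p <= threshold else 0 for p in pixels]


# Toy image: half dark ink (10), half light paper (200).
gray = [10] * 50 + [200] * 50
t = otsu_threshold(gray)
ink = binarize(gray, t)
```

The binarized output is what the smoothing steps in [0015] would then operate on.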

[0016] The window templates are 3×3 (see Figure 2(a)~(e)): template T0 removes isolated noise points; templates T1~T4 (re...
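
The role of template T0 can be sketched as follows. Figure 2 is not available here, so this assumes T0 matches a foreground pixel whose eight neighbors are all background; pixels outside the image are treated as background. Templates T1~T4 are not reproduced because their definitions are truncated in the source. `remove_isolated_points` is a hypothetical name.

```python
def remove_isolated_points(img):
    """Clear foreground pixels (1) that have no foreground 8-neighbor.

    img: list of rows, each a list of 0/1 values. Returns a new image;
    the input is not modified, so every decision uses original values.
    """
    h = len(img)
    w = len(img[0]) if h else 0
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            if img[y][x] != 1:
                continue
            has_neighbor = any(
                img[ny][nx] == 1
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_neighbor:
                out[y][x] = 0  # assumed T0 match: isolated noise point
    return out


noisy = [
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
clean = remove_isolated_points(noisy)
```

In this toy input the pixel at the top-left corner is removed, while the two adjacent stroke pixels in the third row are kept.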

Abstract

The invention discloses a soft segmentation method for handwritten numerical strings in a financial OCR system. Automatic recognition processing of financial documents automates their entry and rechecking and seamlessly integrates the whole process of image processing, layout analysis, and intelligent recognition. It includes automatic classification of financial document images, image preprocessing, and recognition, supervision, and checking of the elements in the documents. OCR technology is the core of such a system: it must cut the connected character strings involved in the automatic processing of document elements into single characters and then recognize them. Because current character recognizers already achieve high accuracy, the overall recognition rate of the OCR system depends on the accuracy and acceptability of character string segmentation. The technical problem addressed is to realize soft segmentation of connected numeric strings based on fuzzy pattern recognition theory, so as to improve the accuracy of the overall segmentation flow, reduce the false rejection rate of the system, and improve the overall performance of the recognition system.

Description

Technical field

[0001] The invention belongs to the technical field of OCR and relates to a method for segmenting handwritten character strings. The method extracts fuzzy features from strokes in digital images, maps them to feature fragment sets, forms candidate segmentation hypotheses, and calculates the optimal segmentation result.

Background

[0002] At present, OCR has become the core technology of many systems and is widely used across industries, from finance, government, and libraries to electric power, enterprises, and institutions. Examples include document image recognition systems (document entry, search, management, etc.), text input for office automation, automatic postal code sorting systems, automatic document classification systems, automatic license plate recognition systems, and automatic bill processing systems. A complete OCR system generally requires the following steps: adjustment of tilted images, layout analysis and l...
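
The step in [0001] of forming candidate segmentation hypotheses and choosing the best one can be sketched in a generic way. The patent's actual fuzzy features and scoring are not given in this excerpt, so everything below is an assumption: `best_segmentation` is a hypothetical helper, `score(a, b)` stands in for some membership value in [0, 1] that the span between two cut positions is a single digit, and a hypothesis is rated by its weakest segment (a max-min rule common in fuzzy reasoning).

```python
def best_segmentation(cuts, score):
    """Pick the cut sequence whose weakest segment is strongest.

    cuts:  sorted candidate cut positions, including both ends of the
           string image.
    score: hypothetical function (a, b) -> confidence in [0, 1] that
           the span [a, b) holds exactly one digit.
    Returns (confidence, list of (a, b) segments); the list is empty
    when no hypothesis has positive confidence.
    """
    n = len(cuts)
    best = [0.0] * n     # best[j]: confidence of best split of cuts[0]..cuts[j]
    back = [None] * n    # back[j]: previous cut index on that best split
    best[0] = 1.0
    for j in range(1, n):
        for i in range(j):
            if i != 0 and back[i] is None:
                continue  # cut i is not reachable from the left end
            conf = min(best[i], score(cuts[i], cuts[j]))
            if conf > best[j]:
                best[j], back[j] = conf, i
    segments = []
    j = n - 1
    while j > 0 and back[j] is not None:
        segments.append((cuts[back[j]], cuts[j]))
        j = back[j]
    segments.reverse()
    return best[n - 1], segments


def toy_score(a, b):
    """Toy stand-in: spans about 10 pixels wide look most like one digit."""
    return max(0.0, 1.0 - abs((b - a) - 10) / 20)


confidence, segments = best_segmentation([0, 10, 20, 30], toy_score)
```

With the toy score, splitting at every candidate cut wins over merging neighboring spans. A real system would replace `toy_score` with the recognizer or fuzzy membership output.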

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/34
Inventor: 丁杰, 彭林, 朱力鹏, 胡斌
Owner: STATE GRID ELECTRIC POWER RES INST