Off-line handwritten Chinese character segmentation method fusing geometric cost and semantic-recognition cost

A cost-based Chinese character segmentation technology, applied in the field of character recognition, that addresses problems such as unreliable segmentation

Status: Inactive
Publication date: 2006-01-11
TSINGHUA UNIV
Cites: 0; cited by: 31

AI Technical Summary

Problems solved by technology

[0005] Chinese characters have a very rich structure, and quite a few of them have a left-right structure. Because text is written from left to right, an unavoidable problem in offline Chinese character segmentation is that characters with a left-right structure are often split apart; for example, the character "村" (village) is split into "木" (wood) and "寸" (inch). The recognizer often assigns a very high confidence to such a split, so relying solely on the confidence of the classifier is also unreliable.
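
The failure mode described in [0005] can be illustrated with a toy calculation. This is only a sketch: the confidence values are invented, and scoring a segmentation by its mean per-character confidence is an assumption made for illustration, not a rule taken from the patent. The characters used are "村" (village) and its two halves "木" (wood) and "寸" (inch).

```python
# Toy illustration: classifier confidence alone can prefer an over-segmentation.
# All confidence values are invented for illustration.

def mean_confidence(segments):
    """Average recognizer confidence over the characters of one segmentation."""
    return sum(conf for _, conf in segments) / len(segments)

# Two candidate readings of the same ink: the intended single character,
# or its left and right halves read as two separate characters.
candidates = {
    "whole": [("村", 0.90)],
    "split": [("木", 0.97), ("寸", 0.96)],
}

best = max(candidates, key=lambda name: mean_confidence(candidates[name]))
print(best)  # "split": the wrong segmentation wins on confidence alone
```

Both halves are valid characters in their own right, so the recognizer has no reason to distrust them; some other source of evidence is needed to reject the split.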

Examples

Detailed description of embodiments

[0230] The present invention is characterized in that it is implemented with an image acquisition device and a computer connected to it, and comprises the following steps in sequence:

[0231] Step 1: Using the image acquisition device, collect enough training samples for the following purposes and build the corresponding libraries:

[0232] ● A library of single-character image samples of offline handwritten Chinese characters;

[0233] ● A library of line image samples for which the correct character segmentation has been given (see Figure 1a). The correct segmentation is annotated for line image samples extracted in advance, and the samples are then divided into two parts: one part serves as training samples for computing parameters, and the other as test samples for evaluating the performance of the method of this application;

[0234] ● A corpus of the domain to which the text to be segmented belongs;
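
The division of annotated line images into a training part and a test part in [0233] might look like the following minimal sketch. The function name `split_samples`, the 80/20 ratio, the fixed seed, and the sample identifiers are all hypothetical; the source does not state how the split is made or in what proportion.

```python
import random

def split_samples(samples, train_fraction=0.8, seed=0):
    """Shuffle annotated line-image samples and split them into train and test parts.

    The 0.8 fraction is an assumed value, not one given in the source.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical identifiers standing in for annotated line images.
line_samples = [f"line_{i:03d}" for i in range(10)]
train, test = split_samples(line_samples)
```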

[0235] Step 2: Param...

Abstract

The off-line handwritten Chinese character segmentation method fusing geometric cost and semantic-recognition cost belongs to the field of character recognition. The method proceeds as follows: first, the line image of the input off-line handwritten Chinese text is analyzed and stroke segments are extracted; the stroke segments are merged into sub-character blocks, and a geometric cost is computed for merging sub-character blocks; these geometric costs are used to generate several candidate segmentations; each candidate is evaluated with a bigram language model to obtain its semantic-recognition cost; finally, the geometric and semantic-recognition costs are combined to obtain the optimal segmentation and recognition result.
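
The final fusion step can be sketched as a shortest-path search over merges of consecutive sub-character blocks. This is a minimal illustration under stated assumptions, not the patent's algorithm: the function `best_segmentation`, the weight `geo_weight`, the start token `"<s>"`, the fallback bigram cost, and every cost value are hypothetical, and a joint Viterbi-style search is used here in place of the generate-then-evaluate procedure the abstract describes.

```python
def best_segmentation(n_blocks, spans, bigram_cost, geo_weight=1.0, default_bigram=5.0):
    """Find the cheapest way to group consecutive sub-character blocks into characters.

    spans maps (start, end) -> (geometric_cost, [(char, recognition_cost), ...]).
    Returns (total_cost, recognized_text, chosen_spans).
    """
    # best[end][char]: cheapest path covering blocks [0, end) whose last character is char
    best = [dict() for _ in range(n_blocks + 1)]
    best[0]["<s>"] = (0.0, "", [])
    for end in range(1, n_blocks + 1):
        for (start, stop), (geo, readings) in spans.items():
            if stop != end:
                continue
            for prev, (cost, text, path) in best[start].items():
                for char, rec in readings:
                    # semantic-recognition cost: recognizer cost plus bigram cost
                    semantic = rec + bigram_cost.get((prev, char), default_bigram)
                    total = cost + geo_weight * geo + semantic
                    if char not in best[end] or total < best[end][char][0]:
                        best[end][char] = (total, text + char, path + [(start, stop)])
    if not best[n_blocks]:
        raise ValueError("no segmentation covers all blocks")
    return min(best[n_blocks].values(), key=lambda item: item[0])

# Three blocks: "农", then the two halves of "村". All costs are invented.
spans = {
    (0, 1): (0.2, [("农", 0.3)]),
    (1, 2): (0.2, [("木", 0.1)]),   # left half read as its own character
    (2, 3): (0.2, [("寸", 0.1)]),   # right half read as its own character
    (1, 3): (0.6, [("村", 0.5)]),   # both halves merged into one character
}
bigram_cost = {
    ("<s>", "农"): 0.5,
    ("农", "村"): 0.2,   # "农村" (countryside) is a common word
    ("农", "木"): 3.0,
    ("木", "寸"): 3.5,
}

cost, text, path = best_segmentation(3, spans, bigram_cost)
print(text)  # 农村
```

In this toy data the split reading is cheaper on geometric and recognition costs alone (1.1 versus 1.6), and it is only the bigram term that makes the merged reading win, which is the motivation for combining the two kinds of cost.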

Description

Technical field

[0001] An off-line handwritten Chinese character segmentation method based on the fusion of geometric cost and semantic-recognition cost, belonging to the field of character recognition.

Background art

[0002] Optical character recognition (OCR) has long been an active topic in pattern recognition, and Chinese character OCR has a development history of more than 20 years. Offline Chinese character recognition refers to the recognition of Chinese character images obtained by scanners, digital cameras, or other cameras (Figure 3). The recognition of offline handwritten Chinese characters has always been difficult, because natural handwriting has a relatively large degree of freedom and, unlike online handwriting, offers no additional information.

[0003] A key issue involved in offline Chinese character recognition is character segmentation. This is becau...

Application Information

IPC(8): G06K9/62
Inventors: 丁晓青, 蒋焰, 付强, 刘长松, 彭良瑞, 方驰
Owner: TSINGHUA UNIV