Multi-category text detection system and bill form detection method based on system
A text detection system and detection method, applied in the fields of neural learning methods, character recognition, and character and pattern recognition. It addresses problems such as inaccurate bounding boxes, adversely affected results, and uneven heat maps, and achieves high detection accuracy, a simple prediction process, and strong generalization ability.
Examples
Embodiment 1
[0051] The first embodiment of the present invention provides a multi-category text detection system, which includes: an image acquisition module for acquiring an image of the bill form to be detected; a feature extraction module for extracting multi-scale features from the bill form image to be detected; a pyramid bridge module for fusing the multi-scale features extracted by the feature extraction module and passing them to the decoding module; and a decoding module for decoding the fused features through three branches to generate a classification map, a center point heat map, and a distance map, respectively.
[0052] The feature extraction module, also called the backbone network, is responsible for transforming the original image into high-dimensional features and is composed of a classic convolutional neural network structure. The pyramid bridge module passes the output of each layer of the backbone network through the PA module, and converts the feature...
Embodiment 2
[0059] The second embodiment of the present invention provides a bill form detection method based on the multi-category text detection system of the first embodiment above. As shown in Figure 2, it includes the following steps:
[0060] In the first step, the preprocessed pictures are input into the multi-category text detection system to generate a center point map, a category map and a distance map respectively.
[0061] As a preferred implementation, the preprocessing consists of scaling the picture to a fixed size (512 × 512) and normalizing it before it is input into the multi-category text detection system. The system has three outputs: a category map (size 64 × 256 × 256), a center point probability map (size 2 × 256 × 256), and a distance map (size 8 × 256 × 256).
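The preprocessing and the stated output sizes can be illustrated with a minimal sketch in plain Python. Several details here are assumptions, not taken from the patent text: the function names (`resize_nearest`, `preprocess`, `expected_output_shapes`) are hypothetical, nearest-neighbour interpolation is only one possible scaling method, normalization is assumed to mean dividing 8-bit pixel values by 255, and the rule that each output map has half the input side length is inferred only from the 512 to 256 figures given above.

```python
def resize_nearest(image, out_h, out_w):
    """Nearest-neighbour resize of an H x W x C image stored as nested lists.

    Assumption: the source does not name an interpolation method.
    """
    in_h, in_w = len(image), len(image[0])
    return [
        [image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def preprocess(image, size=512):
    """Scale to a fixed square size, then normalize pixel values.

    Assumption: normalization maps 8-bit values from [0, 255] to [0, 1].
    """
    resized = resize_nearest(image, size, size)
    return [[[value / 255.0 for value in pixel] for pixel in row] for row in resized]


def expected_output_shapes(input_size=512):
    """Shapes (channels, height, width) of the three output maps.

    The channel counts (64, 2, 8) come from the text; the halving of the
    spatial size is an assumption generalized from 512 -> 256.
    """
    side = input_size // 2
    return {
        "category": (64, side, side),
        "center": (2, side, side),
        "distance": (8, side, side),
    }


if __name__ == "__main__":
    # A tiny 2 x 3 RGB image stands in for a scanned bill form.
    tiny = [
        [[0, 0, 0], [255, 255, 255], [0, 0, 0]],
        [[255, 0, 0], [0, 255, 0], [0, 0, 255]],
    ]
    batch_input = preprocess(tiny, size=512)
    print(len(batch_input), len(batch_input[0]))
    print(expected_output_shapes(512))
```

In practice an image library would handle the resize; the nested-list version is used here only to keep the sketch free of dependencies.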
[0062] In the second step, center point positioning, the center point is found in the center point map based on the extreme point detection method, so as to determine the po...
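One common way to find extreme points in a heat map is to keep every cell that exceeds a score threshold and is strictly larger than all of its neighbours. The sketch below illustrates that idea only; it is not confirmed to be the patent's procedure. The function name `find_center_points`, the threshold of 0.5, the 3 × 3 neighbourhood (`radius=1`), and the use of a single 2D channel of the center point map are all assumptions made for illustration.

```python
def find_center_points(heat, threshold=0.5, radius=1):
    """Return (row, col, score) for each strict local maximum in a 2D heat map.

    Assumptions: `heat` is one channel of the center point map as nested
    lists; `threshold` and `radius` are illustrative values, not taken from
    the source. Flat plateaus yield no peak because the comparison is strict.
    """
    height, width = len(heat), len(heat[0])
    peaks = []
    for y in range(height):
        for x in range(width):
            score = heat[y][x]
            if score < threshold:
                continue
            # Collect neighbours inside the window, clipped at the borders.
            neighbours = [
                heat[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
                if (ny, nx) != (y, x)
            ]
            if all(score > n for n in neighbours):
                peaks.append((y, x, score))
    return peaks


if __name__ == "__main__":
    heat_map = [
        [0.1, 0.2, 0.1, 0.0],
        [0.2, 0.9, 0.2, 0.0],
        [0.1, 0.2, 0.1, 0.0],
        [0.0, 0.0, 0.0, 0.7],
    ]
    for row, col, score in find_center_points(heat_map):
        print(row, col, score)
```

A larger `radius` suppresses nearby weaker peaks, which may matter when text regions sit close together on a form.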