Text recognition method, device and system

A text recognition technology, applied in neural learning methods, character and pattern recognition, instruments, etc., that addresses problems such as the difficulty of determining thresholds and low recognition accuracy, with the effect of improving recognition accuracy and generalization ability.

Pending Publication Date: 2020-06-16
ALIBABA GRP HLDG LTD

AI Technical Summary

Problems solved by technology

However, because the detection target of the first recognition method is often very ambiguous and the words used differ little from those in general texts, it is difficult to identify highly representative words and to determine the corresponding threshold. With the second recognition method, whether features are extracted manually or by deep learning, biases are introduced by the particularities of the existing training data set, which greatly affects the generalization ability of the model.
[0004] No effective solution has yet been proposed for the problem of low recognition accuracy of text recognition methods in the related art.

Examples

Embodiment 1

[0024] According to an embodiment of the present invention, an embodiment of a text recognition method is provided. It should be noted that the steps shown in the flowcharts of the accompanying drawings can be executed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order.

[0025] The method embodiment provided in Embodiment 1 of the present application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Figure 1 shows a hardware structural block diagram of a computer terminal (or mobile device) for implementing the text recognition method. As shown in Figure 1, the computer terminal 10 (or mobile device 10) may include one or more processors 102 (shown as 102a, 102b, ..., 102n in the figure) (the processor 102 may include but is not limited...

Embodiment 2

[0061] According to an embodiment of the present invention, a text recognition device for implementing the above text recognition method is also provided. As shown in Figure 4, the device 400 includes a first acquisition module 42, a second acquisition module 44 and a recognition module 46.

[0062] The first acquisition module 42 is used to acquire text data. The second acquisition module 44 is used to acquire the word vector corresponding to the text data. The recognition module 46 is used to recognize the word vector with a recognition model to obtain the recognition result of the text data, where the recognition model is used to identify whether the text data contains illegal content and is obtained through adversarial training.
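The three modules described above can be sketched as three small functions. This is only an illustrative sketch, not the patented implementation: the function names, the toy embedding table `TOY_EMBEDDINGS`, the stand-in `toy_model` and the 0.5 threshold are all assumptions introduced here, and a real system would load trained word vectors and an adversarially trained recognition model.

```python
from typing import Callable, Dict, List

Vector = List[float]

# Hypothetical toy embedding table; a real system would load trained word vectors.
TOY_EMBEDDINGS: Dict[str, Vector] = {
    "hello": [0.1, 0.0],
    "world": [0.0, 0.1],
    "banned": [0.9, 0.8],
}
UNKNOWN: Vector = [0.0, 0.0]  # fallback vector for out-of-vocabulary tokens


def acquire_text(source: str) -> str:
    """First acquisition module (42): obtain the text data."""
    return source.strip()


def acquire_word_vectors(text: str) -> List[Vector]:
    """Second acquisition module (44): map each token to its word vector."""
    return [TOY_EMBEDDINGS.get(tok, UNKNOWN) for tok in text.lower().split()]


def toy_model(vectors: List[Vector]) -> float:
    """Stand-in recognition model: highest per-token mean component value."""
    if not vectors:
        return 0.0
    return max(sum(v) / len(v) for v in vectors)


def recognize(
    vectors: List[Vector],
    model: Callable[[List[Vector]], float] = toy_model,
    threshold: float = 0.5,  # assumed decision threshold
) -> bool:
    """Recognition module (46): True if the model flags illegal content."""
    return model(vectors) >= threshold


def run_pipeline(source: str) -> bool:
    """Chain the three modules in the order given in the text."""
    return recognize(acquire_word_vectors(acquire_text(source)))
```

Calling `run_pipeline("  hello banned  ")` flags the text, while `run_pipeline("hello world")` does not. Keeping the modules as separate functions mirrors the device structure, so the stand-in model can be swapped for a trained one without touching the acquisition steps.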

[0063] Specifically, the above-mentioned text data may be the text of literary works that needs to be checked for pornographic content. In this embodiment of the application, the text ...

Embodiment 3

[0083] According to an embodiment of the present invention, a text recognition system is also provided, including:

[0084] a processor; and

[0085] a memory, connected to the processor, used to provide the processor with instructions for the following processing steps: obtaining text data; obtaining the word vector corresponding to the text data; and recognizing the word vector with a recognition model to obtain the recognition result of the text data, where the recognition model is used to identify whether the text data contains illegal content and is obtained through adversarial training.

[0086] Based on the solution provided by the above-mentioned embodiments of the present application, after the text data is obtained, the word vector corresponding to the text data is first obtained, and the word vector is then recognized by the recognition model to obtain the recognition result of the text data, so as to achieve the purpose of...
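The text states that the recognition model is obtained through adversarial training but, in the portion available here, does not say which form is used. As a rough illustration only, the sketch below assumes one common variant: a logistic-regression classifier over a fixed-length text vector, trained on each clean example plus a copy perturbed in the sign of the loss gradient (an FGSM-style perturbation). The model choice, the perturbation size `eps`, the learning rate and all function names are assumptions, not details from the patent.

```python
import math
import random
from typing import List, Sequence, Tuple

Example = Tuple[List[float], int]  # (text vector, label: 1 = illegal content)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Probability that x contains illegal content under the linear model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def sign(v: float) -> int:
    return (v > 0) - (v < 0)


def adversarial_example(
    w: Sequence[float], b: float, x: Sequence[float], y: int, eps: float
) -> List[float]:
    """Shift x by eps in the direction that increases the loss.

    For logistic loss, the gradient with respect to x is (p - y) * w.
    """
    p = predict(w, b, x)
    return [xi + eps * sign((p - y) * wi) for xi, wi in zip(x, w)]


def train(
    data: Sequence[Example],
    dim: int,
    eps: float = 0.1,   # assumed perturbation size
    lr: float = 0.5,    # assumed learning rate
    epochs: int = 200,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Stochastic gradient descent on clean and adversarial copies of each example."""
    rng = random.Random(seed)
    w = [rng.uniform(-0.01, 0.01) for _ in range(dim)]
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            # The adversarial copy is built from the current weights,
            # before the update on the clean example.
            for sample in (x, adversarial_example(w, b, x, y, eps)):
                err = predict(w, b, sample) - y
                w = [wi - lr * err * si for wi, si in zip(w, sample)]
                b -= lr * err
    return w, b


TOY_DATA: List[Example] = [
    ([0.9, 0.8], 1),
    ([0.8, 0.9], 1),
    ([0.1, 0.0], 0),
    ([0.0, 0.1], 0),
]
```

With `w, b = train(TOY_DATA, dim=2)`, `predict(w, b, [0.9, 0.8])` lies above 0.5 and `predict(w, b, [0.1, 0.0])` below it. Training on perturbed copies pushes the decision boundary away from the training points, which is one way such training can help generalization.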

Abstract

The invention discloses a text recognition method, device and system. The method comprises the steps of obtaining text data; obtaining a word vector corresponding to the text data; and recognizing the word vector by using a recognition model to obtain a recognition result of the text data, the recognition model being used for recognizing whether illegal content exists in the text data, and the recognition model being obtained through adversarial training. The invention solves the technical problem of low recognition accuracy of text recognition methods in the prior art.

Description

Technical field

[0001] The present invention relates to the field of natural language processing, in particular to a text recognition method, device and system.

Background technique

[0002] At present, some literary works, especially online literary works, contain pornographic plots. These plots greatly damage the physical and mental health of young readers and also make most adult readers uncomfortable. Accurately identifying the pornographic fragments in a novel is therefore very important for the rectification of pornographic novels. However, under the influence of the existing review mechanism, some authors have gradually dropped the obscene words that the review mechanism can clearly identify from the pornographic fragments of their novels, and instead rely heavily on metaphor-like language techniques, using frequently used words to describe erotic scenes. This greatly affects the accuracy of the re...

Application Information

IPC(8): G06F40/289, G06F40/30, G06K9/62, G06N3/04, G06N3/08
CPC G06N3/08, G06N3/045, G06F18/24
Inventor 贺国秀, 康杨杨, 高喆, 孙常龙, 刘晓钟, 司罗
Owner ALIBABA GRP HLDG LTD