Method and device for adding text label
A text and sequence tagging technology, applied in the field of computer science, that addresses problems such as existing methods being inapplicable or unable to fully solve the problem of adding punctuation marks, and achieves the effect of solving the problem of adding
Examples
Embodiment 1
[0044] Refer to Figure 1, which shows a flow chart of Embodiment 1 of the method for adding text annotations in this application. The method may specifically include the following steps:
[0045] Step 101. Obtain unlabeled text.
[0046] In a specific implementation, the unlabeled text may be text without punctuation, such as speech-recognized text or machine-translated text. Speech recognition technology allows machines to convert voice signals into corresponding text or commands through a process of recognition and understanding. Text obtained by speech recognition has no punctuation marks, so adding punctuation appropriately is necessary to improve the reading experience and help users quickly understand the content.
[0047] Step 102. Process the unlabeled text with a sequence labeling model trained in advance using a neural network model, to obtain the sequence labeling of the unlabeled text.
[0048] In a spec...
Embodiment 2
[0058] This embodiment includes steps 101, 102, and 103 of Embodiment 1. Their specific implementation is the same as in Embodiment 1 and is not repeated here. For the training process of the sequence labeling model used in step 102 in this embodiment, refer to Figure 2. The process may specifically include the following sub-steps:
[0059] Sub-step 201, acquire text samples with correct annotations.
[0060] In a specific implementation, texts with correct annotations can be obtained from the Internet or books.
[0061] Sub-step 202, perform serialization processing on the text samples with correct annotations to obtain unlabeled text samples and sequence annotation samples.
[0062] This step removes the annotations from the correctly annotated text to obtain unlabeled text, and then generates the sequence annotations for that unlabeled text according to the position and type of the annotations in the correctly annotated text. For details...
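The serialization in sub-step 202 can be illustrated with a minimal sketch. It assumes one label per character, where the label names the punctuation mark that followed that character in the annotated text. The label names (`O`, `COMMA`, `PERIOD`, `QUESTION`), the set of marks, and the function name `serialize` are hypothetical and are not taken from the application.

```python
from typing import Dict, List, Tuple

# Hypothetical label scheme: each character is labeled with the mark that
# followed it in the annotated text, or NO_MARK if none followed.
MARK_TO_LABEL: Dict[str, str] = {",": "COMMA", ".": "PERIOD", "?": "QUESTION"}
NO_MARK = "O"


def serialize(annotated: str) -> Tuple[str, List[str]]:
    """Split annotated text into unlabeled text and a per-character label sequence."""
    chars: List[str] = []
    labels: List[str] = []
    for ch in annotated:
        if ch in MARK_TO_LABEL:
            # Record the mark's type at the position of the preceding character.
            # A mark with no preceding character is dropped; with consecutive
            # marks, the last one overwrites the earlier label.
            if labels:
                labels[-1] = MARK_TO_LABEL[ch]
        else:
            chars.append(ch)
            labels.append(NO_MARK)
    return "".join(chars), labels


if __name__ == "__main__":
    text, labels = serialize("hi, ok.")
    print(repr(text))
    print(labels)
```

Under this assumed scheme, the unlabeled text and the label sequence always have the same length, which makes each (text, labels) pair directly usable as a training sample for a sequence labeling model.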
Embodiment 3
[0097] Refer to Figure 4, which shows a structural block diagram of Embodiment 3, an apparatus for adding text annotations in this application. The apparatus may specifically include the following modules:
[0098] An unlabeled text acquiring module 401, configured to acquire unlabeled text.
[0099] A sequence annotation generation module 402, configured to process the unlabeled text with a sequence annotation model trained in advance using a neural network model, to obtain the sequence annotation of the unlabeled text.
[0100] In a preferred embodiment of the present application, the neural network model may be an LSTM neural network model or a GRU neural network model.
[0101] In a preferred embodiment of the present application, when the neural network model is an LSTM neural network model, the LSTM neural network model may be a multi-layer LSTM neural network model, or a bidirectional LSTM neural network model.
[0102] A text annotation adding module 403, conf...
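One way the three modules might be wired together is sketched below. This is an illustration under stated assumptions, not the apparatus as claimed: the class `AnnotationDevice`, the function `add_annotations`, the label names, and the one-label-per-character scheme are all hypothetical. The trained LSTM or GRU model is replaced by `stub_tagger`, a placeholder that only marks the final character, so that the sketch runs without a trained model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A tagger maps unlabeled text to one label per character (assumed scheme).
Tagger = Callable[[str], List[str]]

# Hypothetical labels; "O" means no mark follows the character.
LABEL_TO_MARK: Dict[str, str] = {"COMMA": ",", "PERIOD": ".", "QUESTION": "?"}


def stub_tagger(text: str) -> List[str]:
    """Placeholder for the trained model: labels only the last character."""
    if not text:
        return []
    return ["O"] * (len(text) - 1) + ["PERIOD"]


def add_annotations(text: str, labels: List[str]) -> str:
    """Insert the mark named by each label after the corresponding character."""
    if len(labels) != len(text):
        raise ValueError("expected exactly one label per character")
    return "".join(ch + LABEL_TO_MARK.get(label, "") for ch, label in zip(text, labels))


@dataclass
class AnnotationDevice:
    tagger: Tagger  # stands in for the model used by module 402

    def acquire(self, source: str) -> str:
        # Module 401: obtain the unlabeled text (here, a trimmed input string).
        return source.strip()

    def generate(self, text: str) -> List[str]:
        # Module 402: obtain the sequence annotation from the model.
        return self.tagger(text)

    def add(self, text: str, labels: List[str]) -> str:
        # Module 403: add text annotations according to the sequence annotation.
        return add_annotations(text, labels)

    def run(self, source: str) -> str:
        text = self.acquire(source)
        return self.add(text, self.generate(text))


if __name__ == "__main__":
    print(AnnotationDevice(stub_tagger).run("how are you"))
```

Passing the tagger in as a callable keeps module 402 independent of the model type, so an LSTM-based or GRU-based tagger could be substituted without changing modules 401 and 403.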