Sample expansion method and device, electronic equipment and storage medium
A sample expansion technology, applied in the field of sample expansion, that addresses problems such as low efficiency, a low level of intention recognition, and customer intention recognition errors.
Examples
Embodiment 1
[0100] Embodiment 1 of the present application provides a sample expansion method for expanding the speech samples used to train voice robots to recognize customer intentions. The method is described in detail below with reference to the accompanying drawings.
[0101] See Figure 1, which is a flowchart of the sample expansion method provided in Embodiment 1 of the present application.
[0102] This method can generate new samples from samples with known labels, and the labels of the new samples obtained in this way are known. It can also determine labels for samples with unknown labels based on samples with known labels. It can be understood that the samples actually used to train the recognition model are samples with known labels.
[0103] The method described in the embodiment of the present application includes the following steps:
[0104] S101: Determine an original sample from N known labeled samples, and perform word segmentation processing on the ...
Embodiment 2
[0130] The method for screening the samples to be verified will be described in detail below in conjunction with the accompanying drawings.
[0131] See Figure 2, which is a flowchart of the method for screening samples to be verified provided in Embodiment 2 of the present application.
[0132] The method described in the embodiment of the present application includes the following steps when screening the sample to be verified:
[0133] S201: Obtain word vectors of the original sample and the i-th sample to be verified, where i = 1, ..., K.
[0134] In the embodiment of the present application, the screening of samples to be verified includes similarity screening and perplexity screening. When performing similarity screening, it is first necessary to obtain word vectors of samples to be verified.
[0135] The text collection of the vertical domain corpus is used for word segmentation training in advance to generate a vector model and a language model. Among them, the vec...
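The perplexity screening mentioned above can be illustrated with a minimal sketch. The text here does not specify the type of language model or the threshold, so the following assumes an add-one-smoothed bigram model trained on pre-segmented sentences and assumes that candidates whose perplexity exceeds a cutoff are discarded. The names `train_bigram_counts`, `perplexity`, `perplexity_screen` and `max_ppl` are hypothetical and do not come from the application.

```python
import math
from collections import Counter


def train_bigram_counts(corpus):
    """Count context unigrams and bigrams over pre-segmented sentences."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in corpus:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded[:-1])                  # every token that acts as a context
        bigrams.update(zip(padded[:-1], padded[1:]))  # (previous, current) pairs
    vocab = {t for tokens in corpus for t in tokens} | {"<s>", "</s>"}
    return unigrams, bigrams, len(vocab)


def perplexity(tokens, unigrams, bigrams, vocab_size):
    """Perplexity of one segmented sentence under the add-one-smoothed bigram model."""
    padded = ["<s>"] + tokens + ["</s>"]
    log_prob = 0.0
    for prev, cur in zip(padded[:-1], padded[1:]):
        # Add-one smoothing (an assumption): unseen pairs keep a small non-zero probability.
        p = (bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size)
        log_prob += math.log(p)
    n = len(padded) - 1  # number of predicted tokens, including </s>
    return math.exp(-log_prob / n)


def perplexity_screen(candidates, unigrams, bigrams, vocab_size, max_ppl):
    """Keep only candidates whose perplexity does not exceed max_ppl (assumed cutoff)."""
    return [c for c in candidates
            if perplexity(c, unigrams, bigrams, vocab_size) <= max_ppl]
```

With a model trained on sentences from the vertical domain, a fluent candidate should score a lower perplexity than a scrambled one, so a suitable `max_ppl` removes the less fluent candidates. The right cutoff depends on the corpus and would have to be tuned.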
Embodiment 3
[0189] The method for screening, from M samples with unknown labels, those samples whose labels can be determined based on N samples with known labels is described in detail below with reference to the accompanying drawings.
[0190] See Figure 8, which is a flowchart of another sample expansion method provided in Embodiment 3 of the present application.
[0191] The method described in the embodiment of the present application includes the following steps:
[0192] S301: Obtain the similarity between the j-th unknown label sample and each of the N known label samples, where j = 1, ..., M.
[0193] In a possible implementation manner, for the j-th sample with an unknown label, the similarity between it and each of the N samples with known labels is obtained, that is, N similarities are obtained per sample. Therefore, M×N similarities need to be obtained in total for the M samples with unknown labels.
[0194] Firstly, the word vectors of the M unknown label samples and the N known label samples are obtained.
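The M×N similarity computation in step S301 can be sketched as follows. The similarity measure is not stated in the text available here, so this example assumes that each sample is represented by the average of its word vectors and that cosine similarity is used. The `word_vectors` dictionary and the function names are hypothetical stand-ins for the output of the vector model.

```python
import math


def sentence_vector(tokens, word_vectors, dim):
    """Average the vectors of the known tokens; tokens without a vector are skipped."""
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        return [0.0] * dim
    return [sum(v[k] for v in known) / len(known) for k in range(dim)]


def cosine(a, b):
    """Cosine similarity of two vectors; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similarity_matrix(unknown, known, word_vectors, dim):
    """Return an M x N matrix: row j holds the similarities between the
    j-th unknown-label sample and each of the N known-label samples."""
    u_vecs = [sentence_vector(t, word_vectors, dim) for t in unknown]
    k_vecs = [sentence_vector(t, word_vectors, dim) for t in known]
    return [[cosine(u, k) for k in k_vecs] for u in u_vecs]
```

Each row of the returned matrix corresponds to one unknown-label sample, so the N similarities for the j-th sample can be read from row j.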
[0195] Th...