Text data optimization method for small sample intention recognition

A text data optimization method, applied to intent recognition in natural language processing. It addresses the problem that meta-learning cannot be trained in parallel directly, and in doing so avoids insufficient learning, reduces time cost, and avoids low utilization of training data.

Pending Publication Date: 2022-05-31
NANJING UNIV

AI Technical Summary

Problems solved by technology

[0016] Purpose of the invention: The technical problem addressed by the present invention is that meta-learning cannot be trained in parallel directly when the training tasks for small-sample intent recognition in real scenarios contain different numbers of intents.




Embodiment Construction

[0067] As shown in Figure 1, the present invention provides a text data optimization method for small-sample intent recognition, including:

[0068] Step 1: construct a training text dataset. Define the training text dataset as $S = \{T_1, T_2, \ldots, T_n\}$, where $T_i$ is a small-sample intent recognition task in a real dialogue scene and $n$ is the total number of training tasks in $S$. Each small-sample intent recognition task is defined as $T_i = \{\mathrm{Intent}_{i1}, \mathrm{Intent}_{i2}, \ldots, \mathrm{Intent}_{iC_i}\}$, where $\mathrm{Intent}_{ij}$ is an intent in the dialogue scene corresponding to $T_i$ and $C_i$ is the number of intents that $T_i$ contains. In particular, for different tasks $T_p$ and $T_q$, the intent counts $C_p$ and $C_q$ are not necessarily the same. Within each task, an intent is defined as $\mathrm{Intent}_{ij} = \{\mathrm{query}_{ij1}, \mathrm{query}_{ij2}, \ldots, \mathrm{query}_{ijN_{ij}}\}$, where $\mathrm{query}_{ijk}$ is a dialogue text annotated as $\mathrm{Intent}_{ij}$ and $N_{ij}$ is the total number of annotated texts that $\mathrm{Intent}_{ij}$ contains. The dataset is shown in Table 1:

[0069] Table 1

[0070]

[0071] Among them, "mobile appl...
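The nested definition in Step 1 (dataset S, tasks T_i, intents Intent_ij, texts query_ijk) can be pictured as a small data structure. The following is a minimal sketch only: the type aliases, the helper functions `intent_counts` and `annotation_counts`, and the toy utterances are all invented for illustration and do not come from the patent or its Table 1.

```python
from typing import Dict, List

# Hypothetical type aliases; the names are illustrative and not from the patent.
Intent = List[str]          # annotated dialogue texts query_ij1 .. query_ijN_ij
Task = Dict[str, Intent]    # intent name -> its texts; len(task) is C_i
Dataset = List[Task]        # S = {T_1, ..., T_n}

# Toy data invented for illustration only.
S: Dataset = [
    {
        "check_balance": ["how much money do I have", "show my balance"],
        "transfer": ["send 20 dollars to Alice", "move money to savings"],
    },
    {
        "play_music": ["play some jazz"],
        "stop_music": ["stop the song", "pause playback"],
        "next_track": ["skip this one"],
    },
]


def intent_counts(dataset: Dataset) -> List[int]:
    """Return C_i, the number of intents, for every task T_i in S."""
    return [len(task) for task in dataset]


def annotation_counts(task: Task) -> Dict[str, int]:
    """Return N_ij, the number of annotated texts, for every intent of a task."""
    return {name: len(texts) for name, texts in task.items()}
```

In this toy dataset the two tasks contain 2 and 3 intents respectively, which is the situation the text describes: C_p and C_q need not be equal.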


Abstract

The invention provides a text data optimization method for small-sample intent recognition. The method comprises the following steps: step 1, constructing a training text dataset; step 2, grading the training tasks into tiers according to the number of intents each task contains; step 3, sampling a batch of small-sample intent recognition training samples; step 4, using two or more sampled tasks in the same batch to train a metric-learning-based meta-learning model in parallel; step 5, judging whether training should terminate; and step 6, ending model training. The parallel-trained meta-learning model can be applied when the training tasks for small-sample intent recognition in real scenarios contain different numbers of intents.
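Steps 2 and 3 can be sketched roughly as follows. The excerpt does not give the actual grading rule, so this sketch makes several assumptions that should not be read as the patented method: tiers are fixed-width buckets of intent count (`tier_width` is a hypothetical parameter), a batch is drawn from a single tier, and every task in the batch is reduced to the smallest intent count in that batch so the tasks share one shape. All function names are illustrative.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple


def grade_tasks(intent_counts: Dict[str, int], tier_width: int) -> Dict[int, List[str]]:
    """Assumed grading rule: tier index = intent count // tier_width."""
    tiers = defaultdict(list)
    for task_id, count in intent_counts.items():
        tiers[count // tier_width].append(task_id)
    return dict(tiers)


def sample_batch(tiers: Dict[int, List[str]], batch_size: int,
                 rng: random.Random) -> Tuple[int, List[str]]:
    """Pick one tier holding at least batch_size tasks, then sample from it."""
    eligible = sorted(t for t, ids in tiers.items() if len(ids) >= batch_size)
    if not eligible:
        raise ValueError("no tier contains enough tasks for this batch size")
    tier = rng.choice(eligible)
    return tier, rng.sample(tiers[tier], batch_size)


def common_way(batch: List[str], intent_counts: Dict[str, int]) -> int:
    """Assumed alignment: use the smallest intent count in the batch for all tasks."""
    return min(intent_counts[task_id] for task_id in batch)


# Toy intent counts, invented for illustration only.
counts = {"t1": 3, "t2": 4, "t3": 9, "t4": 8, "t5": 5}
tiers = grade_tasks(counts, tier_width=3)
tier, batch = sample_batch(tiers, batch_size=2, rng=random.Random(0))
n_way = common_way(batch, counts)
```

Because all tasks in a batch come from one tier, their intent counts differ by less than `tier_width`, so aligning them to a common count discards little data. This is one plausible reading of how tiering could avoid low utilization of training data while still allowing the tasks to be stacked for parallel training.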

Description

Technical field

[0001] The invention belongs to the field of intent recognition in natural language processing, and in particular relates to a text data optimization method for small-sample intent recognition.

Background technique

[0002] Intent recognition (Intent Detection) is one of the key technologies for building a human-computer dialogue system (Dialogue System). Intent recognition means that a computer program analyzes the semantic information contained in an input dialogue text and determines the intent category to which it belongs.

[0003] On currently popular human-computer dialogue platforms, users are often required to create new dialogue tasks and provide the relevant annotated data. However, because data annotation is expensive, many users cannot provide a large amount of annotated text, and each intent often has only a dozen or even just a few samples. In this case, the intent recognition task is a case of Few-shot Intent Detection, which n...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/332, G06F16/33, G06F16/338, G06N3/04, G06N3/08
CPC: G06F16/3329, G06F16/3344, G06F16/338, G06N3/08, G06N3/045
Inventor: 张建兵, 刘书豪, 黄书剑, 戴新宇, 陈家骏
Owner: NANJING UNIV