Corpus labeling method, device and equipment

Corpus labeling technology, applied in semantic analysis, natural language data processing, instruments, etc., that addresses the problems of high labor cost and insufficient corpus data.

Active Publication Date: 2021-04-23
北京水滴科技集团有限公司

AI Technical Summary

Problems solved by technology

[0004] In view of this, the present application provides a corpus labeling method, device and equipment, whose main purpose is to solve two problems in the prior art: the high labor cost of the corpus labeling process and the shortage of corpus data for complex scenarios.

Examples

Embodiment Construction

[0070] The present application is described in detail below with reference to the accompanying drawings. It should be noted that the embodiments of the present application, and the features in those embodiments, may be combined with each other provided that they do not conflict.

[0071] In the related art, artificial intelligence technology applies natural language processing to pre-train a recognition model, which can assist in identifying violating terms and greatly improve identification efficiency. However, natural language processing requires a large amount of corpus data to train the recognition model, and the more complex the semantics, the more corpus data is needed. In practical application scenarios, labeling a large amount of corpus data requires considerable labor, which raises technical costs, and many complex scenarios cannot even provide enough samples, making the identification model training res...

Abstract

The invention discloses a corpus labeling method, device and equipment in the technical field of artificial intelligence, which can generate text corpora for different violation types in batches and reduce corpus labeling time. The method comprises: performing sentence segmentation on text data from different business scenarios and storing the resulting text corpora in a corpus database; dividing preset standard violation descriptions into different violation categories, taking semantic points as units; constructing keyword semantic rules according to the entity concepts contained in the semantic points and the logical relations between those entity concepts, where the keyword semantic rules are the violation expressions onto which the standard violation descriptions map for the different violation categories; and using the violation expressions to match, from the corpus database, target text corpora containing the different violation categories, and labeling the target text corpora according to those violation categories.
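The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `KeywordRule`, `split_sentences`, `build_rule` and `label_corpus`, the `max_gap` parameter, the example categories, and the choice to encode the "logical relation" between entity concepts as an ordered regular expression are all assumptions made for illustration, and an in-memory list stands in for the corpus database.

```python
import re
from dataclasses import dataclass
from typing import List, Pattern, Tuple


@dataclass(frozen=True)
class KeywordRule:
    """Hypothetical keyword semantic rule: one violation expression per category."""
    category: str
    pattern: Pattern[str]


def split_sentences(text: str) -> List[str]:
    """Segment raw text into sentences on Chinese or English end punctuation."""
    parts = re.findall(r"[^。！？!?.]+[。！？!?.]?", text)
    return [p.strip() for p in parts if p.strip()]


def build_rule(category: str, concepts: List[List[str]], max_gap: int = 20) -> KeywordRule:
    """Build a violation expression from entity concepts.

    Each inner list holds keywords for one entity concept. The assumed logical
    relation is "concepts appear in this order, at most max_gap characters apart".
    """
    groups = ["(?:" + "|".join(re.escape(k) for k in keywords) + ")" for keywords in concepts]
    gap = ".{0,%d}" % max_gap
    return KeywordRule(category, re.compile(gap.join(groups), re.IGNORECASE))


def label_corpus(corpus_db: List[str], rules: List[KeywordRule]) -> List[Tuple[str, str]]:
    """Match each stored sentence against every rule and label it with the category."""
    labeled = []
    for sentence in corpus_db:
        for rule in rules:
            if rule.pattern.search(sentence):
                labeled.append((sentence, rule.category))
    return labeled


# Illustrative input; the text and the two categories are made up for this sketch.
text = (
    "This plan guarantees a 10% return every year. "
    "Please read the terms carefully! "
    "You must buy it today or the offer is gone."
)
corpus_db = split_sentences(text)  # in-memory stand-in for the corpus database
rules = [
    build_rule("promised_returns", [["guarantee", "promise"], ["return", "profit"]]),
    build_rule("pressure_selling", [["must buy", "have to buy"], ["today", "now"]]),
]
labeled = label_corpus(corpus_db, rules)
for sentence, category in labeled:
    print(category, "|", sentence)
```

Because labels come from rule matches rather than manual review, adding a rule for a new violation category labels every matching sentence already in the corpus in one pass, which is the batch effect the abstract describes. A real system would need proper word segmentation for Chinese text and richer relations than a fixed ordering.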

Description

Technical field

[0001] The present application relates to artificial intelligence technology, and in particular to a corpus labeling method, apparatus and equipment.

Background technique

[0002] To promote sales, develop markets and improve customer satisfaction, companies usually reach customers through customer service centers. This process produces a huge volume of call records and chat records. Monitoring customer service quality mainly means identifying the use of violating terms by customer service staff, for example detecting whether staff use standard, compliant wording and whether they promote products.

[0003] Traditional manual quality inspection is inefficient and involves highly repetitive labor. Artificial intelligence technology is now used instead: natural language processing is applied to pre-train a recognition model, which can assist in identifying violations and greatly improve r...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/169, G06F40/211, G06F40/295, G06F40/30, G06F16/33, G06F16/332
CPC: G06F16/3329, G06F16/3344, G06F40/169, G06F40/211, G06F40/295, G06F40/30
Inventor: 袁徐磊, 宋鑫, 肖鹏
Owner: 北京水滴科技集团有限公司