
Corpus labeling method and device, server and storage medium

A corpus labeling technique, applied in the field of information processing, that addresses problems such as labeling results that come from a single source, the labeler's cognitive level and operating habits affecting labeling quality, and the difficulty of judging whether labeling results are accurate, so as to obtain high-quality, accurate labeling results.

Active Publication Date: 2019-07-30
CHINANETCENT TECH
13 Cites, 4 Cited by

AI Technical Summary

Problems solved by technology

[0003] However, the inventors found at least the following problem in the related technologies: in the traditional corpus labeling method, each corpus is generally labeled by a single labeler, so the labeler's cognitive level and operating habits greatly affect labeling quality. As a result, each corpus has only a single labeling result, and it is difficult to judge whether that result is accurate.



Embodiment Construction

[0022] To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, various implementations of the present invention are described in detail below with reference to the accompanying drawings. Those of ordinary skill in the art will understand that each implementation provides many technical details so that readers can better understand the present application. However, the technical solutions claimed in this application can be realized even without these technical details, and with various changes and modifications based on the following implementations. The embodiments below are divided for convenience of description only; the division should not limit the specific implementation of the present invention, and the embodiments may be combined and cross-referenced where they do not contradict one another.

[0023] The first embodi...


Abstract

The embodiments of the invention relate to the technical field of information processing, and in particular to a corpus labeling method and device, a server, and a storage medium. The corpus labeling method comprises: acquiring an even number of manual labeling results for an initial corpus and one model labeling result for the same corpus, where the model labeling result is produced by a preset labeling model, and the preset labeling model is obtained by training on a plurality of manually labeled initial corpora; and, from all labeling results (manual and model), obtaining the unique labeling result that meets a preset condition and taking it as the final labeling result of the initial corpus. According to the embodiments of the invention, a high-quality labeling result can be obtained for the initial corpus, and the influence of any single labeler on labeling quality is reduced.
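The selection step in the abstract can be sketched as follows. This is a minimal illustration, not the patent's implementation: the abstract does not say what the "preset condition" is, so the sketch assumes it means "the label that occurs strictly more often than any other". The function name `final_label` and the choice to return `None` when no unique winner exists are likewise hypothetical.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def final_label(manual: Sequence[Hashable], model: Hashable) -> Optional[Hashable]:
    """Pick one final label from an even number of manual labels plus one model label.

    ASSUMPTION: the "preset condition" is taken to be "strictly most frequent
    label"; the source does not define it. Returns None if no label wins outright.
    """
    if len(manual) == 0 or len(manual) % 2 != 0:
        raise ValueError("expected a non-empty, even number of manual labels")

    # Pool the manual labels with the model label (an odd number of votes).
    counts = Counter([*manual, model])

    # Look at the two most frequent labels to detect a tie for first place.
    ranked = counts.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # no unique result under the assumed condition
    return ranked[0][0]


if __name__ == "__main__":
    # Two labelers disagree; the model label breaks the tie.
    print(final_label(["PER", "ORG"], "PER"))
    # All three sources disagree, so there is no unique winner.
    print(final_label(["PER", "ORG"], "LOC"))
```

With an even number of manual labels, adding the single model label makes the total odd, so an even split between two candidate labels cannot occur. A unique winner is still not guaranteed when three or more distinct labels appear, which is why the sketch reports that case explicitly.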

Description

Technical field

[0001] The embodiments of the present invention relate to the technical field of information processing, and in particular to a corpus labeling method, device, server, and storage medium.

Background

[0002] Natural language processing means that a computer receives input in natural-language form, processes and computes on it internally using user-defined algorithms, and returns the results the user expects. It is typically applied in fields such as text retrieval, machine translation, and question answering. Users usually define the algorithm by building an algorithm model, and that model must be trained on a large amount of labeled raw corpus. Labeling the raw corpus means processing it so that additional codes representing various language features are marked on the corresponding language component...


Application Information

IPC(8): G06F16/33, G06K9/62
CPC: G06F16/3344, G06F18/10, G06F18/214
Inventor: 宣劭文, 李金锋
Owner: CHINANETCENT TECH