Data clustering method and device

A data clustering method and device, applied in database models, relational databases, electronic digital data processing, and related fields. The method addresses problems such as poor clustering results and achieves the effects of improving the initial parameters, improving stability, and improving accuracy.

Inactive Publication Date: 2017-05-31
SHANGHAI XIAOI ROBOT TECH CO LTD

AI Technical Summary

Problems solved by technology

[0005] In view of this, embodiments of the present invention provide a data clustering method and a clustering device that address the technical problem of poor clustering results caused by the influence of initial conditions in existing question-set clustering processes.



Examples


Embodiment Construction

[0021] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without creative efforts fall within the protection scope of the present invention.

[0022] The step numbers in the drawings are only used as reference signs for the steps and do not indicate the order of execution.

[0023] Figure 1 is a flowchart of an embodiment of the data clustering method of the present invention. As shown in Figure 1, the method includes:

[0024] Step 100: Obtain the data to be processed, which includes test data and non-test data.

[0025] In the data clustering method of an embodiment of the pr...


Abstract

The invention provides a data clustering method and device that address the technical problem of poor clustering results caused by the influence of initial conditions in existing question-set clustering processes. The data clustering method comprises the following steps: data to be processed is acquired, comprising test data and non-test data; first classification processing is performed on the test data to obtain a first classification result; second classification processing is performed on the test data using an initial preset value to obtain a second classification result; the second classification result is compared with the first classification result, with the first classification result taken as the standard; when the accuracy rate of the second classification result is greater than or equal to a threshold value, the initial preset value is set as the target preset value; when the accuracy rate is smaller than the threshold value, the initial preset value is adjusted repeatedly until the accuracy rate of the new second classification result is greater than or equal to the threshold value, and the adjusted value is taken as the target preset value; the second classification processing is then performed on the non-test data using the target preset value.
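The tuning loop summarized in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the source does not say what the two classification procedures are, what the preset value controls, or how accuracy is computed. Here the first classification is assumed to be a given set of reference labels, the second classification is a hypothetical 1-D gap-based clustering (`gap_cluster`) whose preset value is the maximum gap inside a cluster, accuracy is assumed to be pairwise agreement (the Rand index), and the adjustment rule is a fixed hypothetical `step`.

```python
from itertools import combinations


def gap_cluster(values, max_gap):
    """Hypothetical 'second classification': sort the values and start a
    new cluster whenever the gap to the previous value exceeds max_gap."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    current = 0
    for prev, idx in zip(order, order[1:]):
        if values[idx] - values[prev] > max_gap:
            current += 1
        labels[idx] = current
    return labels


def pair_accuracy(reference, predicted):
    """Assumed accuracy measure: fraction of item pairs on which the two
    labelings agree about 'same cluster' vs 'different cluster'."""
    pairs = list(combinations(range(len(reference)), 2))
    if not pairs:
        return 1.0
    agree = sum(
        (reference[i] == reference[j]) == (predicted[i] == predicted[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def tune_preset(test_values, reference_labels, initial, threshold, step,
                max_rounds=100):
    """Adjust the preset value until the second classification matches the
    first (reference) classification with accuracy >= threshold."""
    preset = initial
    for _ in range(max_rounds):
        acc = pair_accuracy(reference_labels, gap_cluster(test_values, preset))
        if acc >= threshold:
            return preset, acc  # preset becomes the target preset value
        preset += step          # assumed adjustment rule: fixed increment
    raise ValueError("no preset value reached the accuracy threshold")


# Test data with its first classification result (assumed to be given).
test_values = [10, 12, 11, 50, 53, 90, 94]
reference_labels = [0, 0, 0, 1, 1, 2, 2]

target, acc = tune_preset(test_values, reference_labels,
                          initial=0, threshold=1.0, step=1)

# Apply the second classification to the non-test data with the target value.
non_test_values = [20, 22, 23, 70, 75, 76]
non_test_labels = gap_cluster(non_test_values, target)
```

With these toy inputs the loop stops at a preset value of 4, the smallest gap that reproduces the reference grouping. A real system would substitute its own clustering procedure, accuracy measure, and adjustment strategy for the stand-ins used here.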

Description

Technical field

[0001] The invention relates to a data processing method and device, and in particular to a corpus data processing method and device.

Background technique

[0002] In automatic question answering within language processing, questions expressed in natural language must first be identified, after which the correspondence between questions and answers is established and similar questions are grouped into question sets. Aggregating question sets is therefore a basic technique and an important step in determining the "question-answer" business logic.

[0003] In aggregating question sets, the prior art uses automatic clustering, which groups similar question sentences into different question sets. The clustering process requires determining the number and initial positions of the cluster centers so that they reflect the dissimilarity between classes. Then the iterative process of cluster...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/285
Inventor: 谢瑜, 张昊, 朱频频
Owner: SHANGHAI XIAOI ROBOT TECH CO LTD