Method and device for obtaining similar object set and providing similar object set

Object-set and object-similarity technology, applied in instrumentation, computing, and electrical digital data processing, addresses the problem of high computational complexity.

Active Publication Date: 2015-03-18
ALIBABA GRP HLDG LTD

AI Technical Summary

Problems solved by technology

For a single object, N×C traversals are required, so the computational complexity is high; in high-dimensional spaces in particular, the performance loss may be prohibitive.



Examples


Embodiment Construction

[0102] The technical solutions in the embodiments of the application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the application. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments in this application fall within the protection scope of this application.

[0103] To aid understanding of the embodiments of the present application, it should first be explained that, in LSH based on the extended Jaccard distance, a bitmap of length N×C is split out according to the weight value of each attribute in the object. If the similarity between two objects were then compared strictly according to the Jaccard distance, the intersection and union would have to be computed between every pair of objects, which consumes a large amount of computing resources. Theref...
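The bitmap reading in [0103] can be illustrated with a small sketch. It assumes, beyond what the excerpt states, that each of the N attributes owns C bits, that weights are integers between 0 and C, and that an attribute with weight w sets w of its C bits. Under those assumptions, the plain Jaccard similarity of two bitmaps equals the ratio of summed minimum weights to summed maximum weights. The function names are hypothetical, and whether this ratio matches the patent's exact definition of the extended Jaccard distance is not confirmed by the excerpt.

```python
def expand_to_bitmap(weights, attributes, c):
    """Expand {attribute: integer weight} into a bitmap of length len(attributes) * c.

    Assumption: an attribute with weight w sets the first w of its c bits.
    """
    bits = []
    for attr in attributes:
        w = weights.get(attr, 0)
        if not 0 <= w <= c:
            raise ValueError(f"weight for {attr!r} must be in 0..{c}")
        bits.extend([1] * w + [0] * (c - w))
    return bits


def bitmap_jaccard(x, y):
    """Jaccard similarity of two equal-length bitmaps (one pass over all N*C bits)."""
    inter = sum(a & b for a, b in zip(x, y))
    union = sum(a | b for a, b in zip(x, y))
    return inter / union if union else 1.0


def weighted_jaccard(wa, wb):
    """Same quantity computed directly from the weights, without expansion."""
    keys = set(wa) | set(wb)
    num = sum(min(wa.get(k, 0), wb.get(k, 0)) for k in keys)
    den = sum(max(wa.get(k, 0), wb.get(k, 0)) for k in keys)
    return num / den if den else 1.0


attributes = ["category", "color", "style"]   # N = 3 (illustrative)
C = 3                                          # bits per attribute (illustrative)
a = {"category": 3, "color": 1}
b = {"category": 1, "color": 1, "style": 2}

bitmap_a = expand_to_bitmap(a, attributes, C)
bitmap_b = expand_to_bitmap(b, attributes, C)
similarity_from_bitmaps = bitmap_jaccard(bitmap_a, bitmap_b)
similarity_from_weights = weighted_jaccard(a, b)
```

The bitmap route touches all N×C positions per comparison, which is the cost the text describes; the direct form only touches the attributes that are present.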



Abstract

The invention discloses a method and device for obtaining a similar object set and providing a similar object set. The method comprises: obtaining an input file comprising M objects, N attributes, and the attribute value corresponding to each attribute; inputting each attribute into a first level of a pre-created minimum hash (minhash) function to obtain the first-level minhash return value of each attribute; obtaining the second-level minhash return value of each attribute according to the attribute, the weight value corresponding to the attribute in the current object, and a second level of the pre-created minhash function; calculating the combined minhash value of each attribute in each object; determining the minimum of the combined minhash values corresponding to the attributes of the same object as the minhash value of that object; executing the above operations K times for each object to obtain K minhash values for each object; and inputting the K minhash values of each object into a locality-sensitive hashing (LSH) computing framework. The method and device can improve operating efficiency and improve the validity and accuracy of the similar object information.
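The steps in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not say how the two levels are built or combined, so here the first level hashes the attribute alone, the second level takes the minimum over one hash per unit of integer weight (seeded by the first-level value), and the combined value is the pair of the two. The helper `_h` and all function names are hypothetical. The per-unit loop in the second level is a simple stand-in and does not reproduce whatever optimization the patent uses to avoid traversing the weights.

```python
import hashlib


def _h(*parts):
    """Deterministic 64-bit hash of the given parts (illustrative helper)."""
    data = "\x1f".join(str(p) for p in parts).encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def object_minhash(weights, round_id):
    """One minhash value for an object given {attribute: non-negative integer weight}.

    Returns None for an object with no positive weights.
    """
    best = None
    for attr, w in weights.items():
        if w <= 0:
            continue
        # First level: depends on the attribute only (and the round).
        level1 = _h("L1", round_id, attr)
        # Second level (assumed form): depends on the attribute's weight in this object.
        level2 = min(_h("L2", level1, slot) for slot in range(w))
        # Combined value (assumed form): ordered by level2, tagged with level1.
        combined = (level2, level1)
        if best is None or combined < best:
            best = combined
    return best


def signature(weights, k):
    """Repeat the procedure K times to obtain K minhash values for one object."""
    return [object_minhash(weights, r) for r in range(k)]


def estimate_similarity(sig_a, sig_b):
    """Fraction of the K positions on which two signatures agree."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)
```

With this construction, two objects agree at a given position with probability equal to the ratio of summed minimum weights to summed maximum weights, so the agreement rate over K positions estimates that ratio. The K values per object are what would then be passed to an LSH framework to group likely-similar objects.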

Description

Technical field

[0001] The present application relates to the technical field of object similarity calculation, and in particular to a method and a device for obtaining a similar object set and providing similar object information.

Background technique

[0002] In the Internet industry, many applications face the following core problem: given a collection of objects T = {t_1, t_2, ..., t_M}, for any element t_i in the set, find all elements of T whose distance from t_i is less than a certain threshold. The distance between two objects is generally calculated from the attribute information of the objects. For example, for an object such as a product, the attributes can include category, color, style, and so on. Rich attribute information generally needs to be represented by a high-dimensional vector.

[0003] There are many definitions of distance measures. The commonly used ones are Jaccard distance, extended Jaccard d...
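As a point of reference for the problem stated in [0002], the sketch below uses the plain Jaccard distance, 1 − |A ∩ B| / |A ∪ B|, on attribute sets and finds neighbours by comparing every pair. The object IDs, attribute strings, and threshold are made up for illustration; this brute-force form is the baseline whose cost motivates hashing-based approaches, not the method of the application.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two sets; defined as 0.0 when both are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def neighbours_within(objects, threshold):
    """For each object ID, list the other IDs whose distance is below the threshold.

    Compares every pair, so the cost grows quadratically with the number of objects.
    """
    result = {obj_id: [] for obj_id in objects}
    ids = list(objects)
    for i, first in enumerate(ids):
        for second in ids[i + 1:]:
            if jaccard_distance(objects[first], objects[second]) < threshold:
                result[first].append(second)
                result[second].append(first)
    return result


# Illustrative data: each object is a set of attribute values.
objects = {
    "t1": {"category:dress", "color:red", "style:casual"},
    "t2": {"category:dress", "color:red", "style:formal"},
    "t3": {"category:phone", "color:black"},
}
near = neighbours_within(objects, threshold=0.6)
```

Here t1 and t2 share two of four distinct attribute values, giving a distance of 0.5, so they are neighbours under the 0.6 threshold; t3 shares nothing with either.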


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/137
Inventor: 陈俊波, 蔡维佳, 陈春明
Owner: ALIBABA GRP HLDG LTD