Method and system for clustering network files

A network file clustering technology, applied in text database clustering/classification, special data processing applications, unstructured text data retrieval, etc. It addresses the problems that prior clustering methods cannot produce both clustering results at the same time, are inefficient, and cannot automatically correlate the clustering results. Its effects are reduced interference, improved efficiency, and improved accuracy.

Status: Inactive
Publication Date: 2009-03-18
NEC (CHINA) CO LTD
Cites: 1. Cited by: 13.

AI Technical Summary

Problems solved by technology

Existing techniques cannot obtain both clustering results at the same time. Even when each clustering result is obtained successfully, the two results cannot be automatically correlated. On the whole, therefore, the clustering methods of the prior art are inefficient.



Detailed Description of Embodiments

[0031] Exemplary embodiments according to the present invention are described below with reference to the accompanying drawings. It should be appreciated that the described embodiments are for illustrative purposes only and that the invention is not limited to the specific embodiments described.

[0032] FIG. 1 is a block diagram showing a network file clustering system 100 according to a first embodiment of the present invention. As shown in the figure, the system 100 includes an input device 101, a collection device 102, an extraction device 103, an output device 104 and a network file repository 105. The system 100 uses the input device 101 to obtain a plurality of network files from the network file repository 105 and, after a series of processing steps, outputs the clustering results of the network files and the hierarchical relationships between the clusters through the output device 104. The network file repository 105 can store a collection of network files obtained from the n...
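The arrangement of devices in paragraph [0032] can be pictured as a small pipeline. The following is only an illustrative sketch, not the patent's implementation: the names `NetworkFile`, `ClusteringSystem`, `demo_collect` and `demo_extract` are hypothetical, the stage internals are placeholders, and the example URLs are invented.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class NetworkFile:
    """One network file: its URL and the URLs it links to (assumed representation)."""
    url: str
    links: List[str] = field(default_factory=list)


@dataclass
class ClusteringSystem:
    """Hypothetical wiring of the devices named in paragraph [0032]."""
    repository: Dict[str, NetworkFile]            # stands in for repository 105
    collect: Callable[[List[NetworkFile]], dict]  # stands in for collection device 102
    extract: Callable[[dict], dict]               # stands in for extraction device 103

    def read_input(self) -> List[NetworkFile]:
        # Input device 101: fetch the files from the repository in a stable order.
        return [self.repository[url] for url in sorted(self.repository)]

    def run(self) -> dict:
        # The returned value is what output device 104 would emit:
        # the clusters plus the hierarchical relationships between them.
        files = self.read_input()
        collected = self.collect(files)
        return self.extract(collected)


def demo_collect(files: List[NetworkFile]) -> dict:
    # Placeholder stage: records URLs and links only.
    return {"urls": [f.url for f in files],
            "links": {f.url: f.links for f in files}}


def demo_extract(collected: dict) -> dict:
    # Placeholder stage: puts every file into a single cluster.
    return {"clusters": [collected["urls"]], "hierarchy": {}}


repo = {
    "http://example.com/index.html": NetworkFile(
        "http://example.com/index.html", ["http://example.com/a.html"]),
    "http://example.com/a.html": NetworkFile("http://example.com/a.html"),
}
system = ClusteringSystem(repo, demo_collect, demo_extract)
result = system.run()
```

Keeping the collection and extraction stages as pluggable callables mirrors the separation of devices 102 and 103 in the figure; real stages would replace the placeholders.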


Abstract

The invention provides a method and system for network file clustering. The method includes inputting a plurality of network files, collecting the link relationships and directory structure of the network files, extracting a hierarchical structure of the network files according to the link relationships and the directory structure, and outputting one or more clusters of the network files based on the hierarchical structure. In some embodiments, the hierarchical relationships among the clusters can be output at the same time. Compared with the prior art, the method can greatly increase the accuracy and efficiency of network file clustering.
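To make the steps in the abstract concrete, here is a minimal sketch under stated assumptions. The abstract does not say how the hierarchical structure is derived, so the rule used here is an assumption chosen for illustration: files sharing a URL directory form one cluster, and a cluster is treated as the child of another when its directory is nested under the other's and at least one link points from the parent cluster into it. The function name `cluster_by_structure` and the sample URLs are hypothetical.

```python
from collections import defaultdict
from posixpath import dirname
from urllib.parse import urlparse


def cluster_by_structure(files):
    """files maps each URL to the list of URLs it links to.

    Returns (clusters, hierarchy): clusters maps a directory to its files,
    hierarchy maps a child directory to its parent directory.
    The grouping rule is an illustrative assumption, not the patented method.
    """
    # Collect the directory structure: the directory part of each URL path.
    directory = {url: dirname(urlparse(url).path) or "/" for url in files}

    # One cluster per directory.
    clusters = defaultdict(list)
    for url in sorted(files):
        clusters[directory[url]].append(url)

    # Derive the cluster hierarchy from links that descend into a nested directory.
    hierarchy = {}
    for src, targets in files.items():
        for dst in targets:
            if dst not in directory:
                continue  # link leaves the input set
            parent, child = directory[src], directory[dst]
            nested = (child + "/").startswith(parent.rstrip("/") + "/")
            if parent != child and nested:
                # Prefer the deepest linking ancestor as the parent.
                if child not in hierarchy or len(parent) > len(hierarchy[child]):
                    hierarchy[child] = parent
    return dict(clusters), hierarchy


sample = {
    "http://example.com/index.html": [
        "http://example.com/docs/a.html",
        "http://example.com/docs/b.html",
    ],
    "http://example.com/docs/a.html": ["http://example.com/docs/b.html"],
    "http://example.com/docs/b.html": [],
}
clusters, hierarchy = cluster_by_structure(sample)
```

With this sample, the two files under `/docs` fall into one cluster, the index page into another, and the `/docs` cluster is recorded as a child of the root cluster, so the clusters and their hierarchical relationships come out of a single pass.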

Description

Technical field

[0001] The present invention relates to Web information extraction and mining technology and, more specifically, to a method and system for network file clustering.

Background

[0002] Today, the World Wide Web (WWW) has become a popular and important medium for distributing and obtaining information. Web information extraction and mining technology can help people make the fullest use of the Web and its information. In fact, Web information extraction and mining has become a very popular research field, and application software and products based on these technologies are becoming more and more common in the market.

[0003] Document clustering is a common information mining technique for discovering similarities and relationships between documents. The purpose of file clustering is to organize files into several meaningful groups, so that files in the same group have high similarity or a close relationship, while files belonging to differe...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F17/30707, G06F17/30882, G06F17/30705, G06F16/353, G06F16/9558, G06F16/35
Inventors: 赵彧, 李建强
Owner: NEC (CHINA) CO LTD