
Data mining device based on Deep Web deep dynamic data and method thereof

A data mining technology for dynamic data, applied in electrical digital data processing, special data processing applications, instruments, etc., that expands data sources and information resources, is simple and practical to operate, and yields high data quality.

Active Publication Date: 2010-09-22
TONGFANG KNOWLEDGE NETWORK TECH CO LTD (BEIJING)

AI Technical Summary

Problems solved by technology

[0004] Other relevant existing technologies are rare, and the existing literature contains almost no similar technical solutions. Moreover, most existing solutions target collection systems for ordinary (shallow) web page data, whose collection mode is completely different from that of deep web page data. A diagram of such a shallow web page data collection system is shown in Figure 1.



Examples


Embodiment 1

[0025] This embodiment provides a data mining device based on Deep Web (dark web) deep dynamic web page data. The device includes at least one commercial server with a basic hardware configuration of 4 CPUs, 8 GB of memory, and 1 TB of disk space, pre-installed with the Windows 2003 / 2000 Server operating system and an ASP.NET application server. The server hosts three virtual operating systems, which can be expanded, for deploying the distributed collection system at the operating system level; with the distributed collection function at the server level, it can also be extended to multiple commercial servers as required. The device also includes at least three data storage servers, which form a database storage system centered on data storage and integration and are pre-installed with a relational database system that supports mass storage and full-text indexing, such as Microsoft's SQL Server. At least one data index server mainly stores index information for the collected data, with the purpose of accelerating data integration, retrieval speed...

Embodiment 2

[0030] This embodiment provides a data mining method based on Deep Web deep dynamic data. Referring to Figure 4, the method includes the following steps:

[0031] Step 101: import the feature word dictionary for collection;

[0032] Through the collection and release management platform, the user enters the specific retrieval conditions for data mining, or the thesaurus to be collected, in the collection and simulation thesaurus management system. The thesaurus can also be created automatically by the collector, or imported and exported automatically through the dictionary table.
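
Importing and exporting a thesaurus through a dictionary table could look roughly like the following sketch. It assumes the dictionary table is a CSV file with a single `word` column; that layout and the function names `import_feature_words` and `export_feature_words` are illustrative assumptions and are not taken from the patent.

```python
import csv
import io


def import_feature_words(dictionary_table: str) -> list[str]:
    """Read feature words from a CSV dictionary table with a 'word' column.

    Blank entries are dropped and duplicates removed, keeping first-seen order.
    (The CSV layout is an assumption made for this sketch.)
    """
    reader = csv.DictReader(io.StringIO(dictionary_table))
    seen: set[str] = set()
    words: list[str] = []
    for row in reader:
        word = (row.get("word") or "").strip()
        if word and word not in seen:
            seen.add(word)
            words.append(word)
    return words


def export_feature_words(words: list[str]) -> str:
    """Write feature words back out as a CSV dictionary table."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["word"])
    for word in words:
        writer.writerow([word])
    return buffer.getvalue()


# Hypothetical dictionary table containing a padded entry, a blank line, and a duplicate.
table = "word\ndata mining\n deep web \n\ndata mining\n"
thesaurus = import_feature_words(table)
print(thesaurus)
print(export_feature_words(thesaurus))
```

Normalizing the words on import keeps the thesaurus free of duplicates, so a later collection step would not issue the same retrieval condition twice.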

[0033] Step 102: create a data mining collection task;

[0034] Through the collection and release management platform, in the collection task scheduling management system, users create data mining collection tasks according to preset requirements using the system's navigation function, as shown in Table 1. This process is relatively flexible, and data sources can be selected individually, format, se...
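
A collection task that pairs a data source with the imported feature words might be modeled as in the sketch below. Table 1 is not reproduced here, so the `CollectionTask` class, its field names, and the `https://example.com/search` endpoint are all hypothetical; the sketch only illustrates the general idea of turning each feature word into one request against a data source's query interface.

```python
from dataclasses import dataclass, field
from urllib.parse import urlencode


@dataclass
class CollectionTask:
    """A hypothetical collection task; the field names are illustrative only."""

    name: str
    query_url: str    # query interface of the selected data source (assumed)
    query_param: str  # name of the form field that carries the search term (assumed)
    feature_words: list[str] = field(default_factory=list)

    def build_queries(self) -> list[str]:
        """Return one query URL per feature word."""
        return [
            f"{self.query_url}?{urlencode({self.query_param: word})}"
            for word in self.feature_words
        ]


# Placeholder endpoint and parameter name, not a real data source.
task = CollectionTask(
    name="demo-task",
    query_url="https://example.com/search",
    query_param="q",
    feature_words=["data mining", "deep web"],
)
for url in task.build_queries():
    print(url)
```

Keeping the task as plain data separates what to collect from how it is fetched, which fits a design in which a scheduler hands tasks to separate acquisition servers.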


Abstract

The invention discloses a data mining device based on Deep Web deep dynamic data and a method thereof. The device comprises a commercial server, a data storage server, a data index server and a file server. The systems running on the device comprise an acquisition simulation theme thesaurus management system, an acquisition task scheduling management system, an acquisition server and an acquisition storage scheduling system. The invention provides a means of acquiring dynamic data in large quantities, with high data quality and strong real-time performance, in a form that is easy to analyze in depth, and it makes up for the limited quantity and quality of data obtainable through conventional search engines. The invention is simple and practical to operate, offers rich customization functions, and has good expandability and robustness. A user can customize, acquire and rebuild a management database according to specific or highly specialized requirements, which greatly improves data utilization efficiency and expands data sources and information resources.

Description

Technical field

[0001] The present invention relates to a data mining device and method, in particular to a data mining device and method based on Deep Web (dark net) Internet deep dynamic data.

Background technique

[0002] In the field of enterprise competitive intelligence, users need to find useful or unfavorable information across a wide range of Internet fields, and it is difficult to achieve this with ordinary search engines. First, it is difficult to obtain complete data. Second, search engines can only obtain static web page data; they cannot obtain dynamic data, nor data that is reachable only through query interfaces such as search engines, let alone internal enterprise data or purchased commercial data. These data are Deep Web data. Moreover, static web page data accounts for only a very small part of all Web data, which is far from meeting users' needs.

[0003] For the field of academic research, users hope that the wider the rang...


Application Information

IPC(8): G06F17/30
Inventors: 张振海, 雷华平
Owner: TONGFANG KNOWLEDGE NETWORK TECH CO LTD (BEIJING)