A method for capturing network data and a network data capturing scheduling device

A network data capture technique based on web crawlers, applied in the field of big data, that addresses the problem of low efficiency of network data capture.

Inactive Publication Date: 2019-01-15
ZHONGKE DINGFU BEIJING TECH DEV

AI Technical Summary

Problems solved by technology

[0004] In view of this, the purpose of this application is to provide a method and a device for capturing network data that solve the problem of low efficiency of network data capture in the prior art.




Embodiment Construction

[0039] To make the purpose, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described below clearly and completely, in conjunction with the drawings in the embodiments. The described embodiments are only a part of the embodiments of this application, not all of them. The components of the embodiments generally described and illustrated in the figures herein may be arranged and designed in a variety of different configurations. Accordingly, the following detailed description of the embodiments provided in the accompanying drawings is not intended to limit the scope of the claimed application, but merely represents selected embodiments of the application. Based on the embodiments of the present application, all other embodiments obtained by those skilled in the art without...



Abstract

The present application provides a method for capturing network data and a network data capturing scheduling device. The method comprises the following steps: when the capture strategy contained in a stored task is triggered, classifying the website information contained in the task by website type, and obtaining the web crawler program identifier mapped to each classified website type according to a preset mapping relationship between website types and web crawler program identifiers; for the web crawler program identifier mapped to each classified website type, determining, among the stored web crawler programs requesting network data capture, the web crawler program corresponding to that identifier; and assigning the website information corresponding to the classified website type to the determined web crawler program for network data capture, wherein web crawler programs corresponding to the same web crawler program identifier are installed on different network data capture servers. The method can effectively improve the efficiency of network data capture.
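The scheduling steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `TYPE_TO_CRAWLER_ID`, `classify` and `schedule`, the URL-substring classifier, the example identifiers and servers, and the round-robin choice among crawler instances are all assumptions made for illustration. The abstract only states that website information is assigned to a determined crawler program and that programs sharing an identifier run on different servers.

```python
from collections import defaultdict
from itertools import cycle

# Hypothetical preset mapping: website type -> web crawler program identifier.
TYPE_TO_CRAWLER_ID = {"news": "crawler-news", "forum": "crawler-forum"}


def classify(url: str) -> str:
    """Toy classifier (assumption): derive the website type from the URL text."""
    return "forum" if "forum" in url else "news"


def schedule(task_urls, registered_crawlers):
    """Assign each URL to a crawler instance whose identifier matches its type.

    registered_crawlers is a list of (crawler_id, server) pairs that have
    requested capture work. Returns {(crawler_id, server): [urls]}.
    """
    # Step 1: classify the website information in the task by website type.
    by_type = defaultdict(list)
    for url in task_urls:
        by_type[classify(url)].append(url)

    assignments = defaultdict(list)
    for site_type, urls in by_type.items():
        # Step 2: look up the crawler identifier mapped to this website type.
        crawler_id = TYPE_TO_CRAWLER_ID[site_type]
        # Step 3: find the requesting crawler instances with that identifier.
        instances = [c for c in registered_crawlers if c[0] == crawler_id]
        if not instances:
            continue  # no matching crawler has requested work; leave unassigned
        # Step 4: spread URLs across instances (round-robin is an assumption).
        next_instance = cycle(instances)
        for url in urls:
            assignments[next(next_instance)].append(url)
    return dict(assignments)


crawlers = [
    ("crawler-news", "server-a"),
    ("crawler-news", "server-b"),
    ("crawler-forum", "server-a"),
]
urls = [
    "http://news.example.com/1",
    "http://news.example.com/2",
    "http://forum.example.com/x",
]
result = schedule(urls, crawlers)
```

In this sketch the two news URLs are split between the two servers that host the same crawler identifier, which is the property the abstract credits for the efficiency gain.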

Description

Technical field

[0001] The present application relates to the technical field of big data, in particular, to a method for capturing network data and a network data capturing and scheduling device.

Background technique

[0002] Data is everywhere. With the development of Internet technology, information and knowledge are growing explosively, and the amount of data contained in each website on the network is also increasing. This makes it possible to use the web crawler program to grab the network data of various websites and obtain massive data for data mining, which promotes the rapid development of big data platforms such as artificial intelligence, Internet of Things, social networking, and search that require massive data support. The big data platforms use their respective network data capture servers to capture corresponding network data.

[0003] Due to the different types of websites, the network data types of different websites are also different. Therefore, for ea...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F16/953
Inventor: 杨文提, 张中辉, 张剑波, 张瑞飞
Owner: ZHONGKE DINGFU BEIJING TECH DEV