Automatic network data acquisition method

An automated network data collection technology, applied in the field of network data, that addresses problems such as the inability to obtain data from third-party platforms, low retrieval frequency, and high labor costs, and achieves fast iterative data collection with improved accuracy.

Pending Publication Date: 2022-04-12
陕西数图行信息科技有限公司

AI Technical Summary

Problems solved by technology

[0003] At present, when acquiring data from 10 domestic and foreign industry sites, retrieving relevant content quickly and updating it to one's own site through traditional manual collection and sorting incurs high labor costs, and retrieval is slow. Retrieval frequency is also low, third-party platform data cannot be updated synchronously, and manually collected data may contain errors.


Examples


Detailed Description of the Embodiments

[0036] To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. The components of the embodiments, as generally described and illustrated in the figures herein, may be arranged and designed in a variety of different configurations.

[0037] As shown in Figure 1, this embodiment provides an automatic network data collection method, comprising:

[0038] S1. Collect the network data published by third-party platforms of domestic and foreign industry sites to obtain the original webpage.

[0039] S2. Extract data from the original webpage to obtain the parsed webpage.

[...
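Steps S1 and S2 can be illustrated with a minimal Python sketch. The source does not specify how pages are fetched or which fields are extracted, so everything here is an assumption made for illustration: the function names `fetch_original_page` and `parse_page`, the class `TitleLinkParser`, and the choice to extract the page title and link list. Only the Python standard library is used.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


def fetch_original_page(url: str, timeout: float = 10.0) -> str:
    """S1 (assumed form): download one page and return its raw HTML."""
    with urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


class TitleLinkParser(HTMLParser):
    """S2 (assumed form): collect the page title and every link with an href."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "a" and self._href is not None:
            text = "".join(self._text).strip()
            self.links.append({"url": self._href, "text": text})
            self._href = None


def parse_page(html: str) -> dict:
    """Turn an original webpage (raw HTML) into a parsed record."""
    parser = TitleLinkParser()
    parser.feed(html)
    parser.close()
    return {"title": parser.title.strip(), "links": parser.links}
```

In this sketch, `parse_page(fetch_original_page(url))` would chain the two steps for a single page. A real collector would also need scheduling, retries, and per-site extraction rules, none of which are described in the visible text.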


Abstract

The invention relates to the technical field of network data and discloses an automatic network data acquisition method comprising the following steps: S1, acquiring network data to obtain an original webpage; S2, performing data extraction on the original webpage to obtain a parsed webpage; S3, applying null removal, error removal, deduplication, normalization, and missing-value supplementation to the parsed webpage to obtain processed data; S4, storing the processed data; and S5, processing the stored data. The method can collect data published by third-party platforms around the clock without interruption, supports minute-level retrieval and synchronization of third-party platform data, and can update the incremental data of multiple sites within seconds, all without manual supervision. Keyword retrieval configuration further improves the efficiency of data updates. The method filters out irrelevant content while retrieving automatically, which improves accuracy and achieves unsupervised, omission-free, rapidly iterating data acquisition.
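The cleaning operations named in S3, together with the keyword-based filtering, can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the record fields (`title`, `url`, `source`), the rule that a valid URL must start with `http://` or `https://`, deduplication by URL, and the function name `clean_records` are all hypothetical choices, since the abstract does not define them.

```python
def clean_records(records, keywords=(), default_source="unknown"):
    """Apply S3-style cleaning to parsed records (all rules are assumptions)."""
    wanted = [k.lower() for k in keywords]
    seen_urls = set()
    cleaned = []
    for rec in records:
        # Null removal: skip empty records and records without a title.
        if not rec or not (rec.get("title") or "").strip():
            continue
        # Error removal: skip records whose URL is not http(s) (assumed rule).
        url = (rec.get("url") or "").strip()
        if not url.lower().startswith(("http://", "https://")):
            continue
        # Normalization: collapse whitespace in titles, drop trailing slashes.
        title = " ".join(rec["title"].split())
        url = url.rstrip("/")
        # Keyword filtering: keep only titles containing a configured keyword.
        if wanted and not any(k in title.lower() for k in wanted):
            continue
        # Deduplication: keep the first record seen for each normalized URL.
        if url in seen_urls:
            continue
        seen_urls.add(url)
        # Missing-value supplementation: fill an absent source with a default.
        source = rec.get("source") or default_source
        cleaned.append({"title": title, "url": url, "source": source})
    return cleaned
```

Calling `clean_records(parsed, keywords=["patent"])` would return only the well-formed, unique records whose titles mention the keyword. The output of such a function corresponds to the "processed data" that S4 then stores.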

Description

Technical field

[0001] The invention relates to the technical field of network data, in particular to an automatic network data collection method.

Background

[0002] Network data acquisition refers to the process of using Internet search engine technology to capture data in a targeted, industry-specific, and accurate manner, classify it according to defined rules and screening criteria, and store it as a database file.

[0003] At present, when acquiring data from 10 domestic and foreign industry sites, retrieving relevant content quickly and updating it to one's own site through traditional manual collection and sorting incurs high labor costs, and retrieval is slow. Retrieval frequency is also low, third-party platform data cannot be updated synchronously, and manually collected data may contain errors.

Contents of the invention

[0004] The purpose of the present invention is to provide an aut...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/951, G06F16/955, G06F16/27, G06F16/2455, G06F16/23, G06F9/48
Inventors: 武亚洲, 王治胜, 童曦
Owner: 陕西数图行信息科技有限公司