Web page information processing method, system, electronic device and storage medium

A web page information processing method, applicable to systems, electronic devices and storage media, that addresses the problem that parsing modules and processing modules cannot be flexibly configured and modified, thereby saving parsing and processing resources and improving processing efficiency.

Active Publication Date: 2020-12-01
江苏运满满信息科技有限公司

AI Technical Summary

Problems solved by technology

[0006] In view of this, the present application provides a web page information processing method, system, electronic device and storage medium to solve two problems in the prior art: the parsing module and the processing module cannot be flexibly configured, and the source code must be modified every time a new crawling requirement is added.



Examples


Detailed Description of Embodiments

[0037] Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments, however, can be embodied in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this application will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The same reference numerals in the drawings denote the same or similar structures, and thus their repeated descriptions will be omitted.

[0038] The present application provides a web page information processing method, together with a corresponding system, device and storage medium, in which the configuration can be modified dynamically, parsing and processing are flexible, and resources are saved. The following uses the Java language as an example to describe the web page information processing method of the present application. Those skilled in the art can also use other programmi...



Abstract

The invention provides a web page information processing method and system, an electronic device and a storage medium. It relates to the field of data processing and is used for crawling and processing web page resources. The method comprises the following steps: acquiring a URL address to be crawled, and crawling web page information according to that URL address; calling the parsing module bound to the URL address in a database, parsing the crawled web page information, and extracting its data content; and calling the processing module bound to the URL address in the database, and processing the parsed data content. The URL address is bound to its parsing module and processing module through a mapping. When a crawling task for a given URL address is executed, the URL address is passed along through the process, so that the corresponding parsing module and processing module can be looked up in the database by URL address and called to handle the crawled web page information. The bindings between a URL address and its parsing and processing modules can be reconfigured dynamically.
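The URL-to-module binding described in the abstract can be sketched in Java, the example language named in the description. This is only an illustrative sketch under stated assumptions, not the patented implementation: the class `UrlBoundCrawler`, its methods, the module names and the example URL are all hypothetical, in-memory maps stand in for the database, and the downloader is a stub rather than a real HTTP client.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;
import java.util.function.Function;

/** Hypothetical sketch: modules are looked up by URL at crawl time. */
public class UrlBoundCrawler {

    /** Names of the parsing and processing modules bound to one URL. */
    private static final class Binding {
        final String parserName;
        final String processorName;

        Binding(String parserName, String processorName) {
            this.parserName = parserName;
            this.processorName = processorName;
        }
    }

    // Assumption: these maps stand in for the database tables.
    private final Map<String, Binding> bindings = new ConcurrentHashMap<>();
    private final Map<String, Function<String, String>> parsers = new ConcurrentHashMap<>();
    private final Map<String, Consumer<String>> processors = new ConcurrentHashMap<>();
    private final Function<String, String> downloader;

    public UrlBoundCrawler(Function<String, String> downloader) {
        this.downloader = downloader;
    }

    public void registerParser(String name, Function<String, String> parser) {
        parsers.put(name, parser);
    }

    public void registerProcessor(String name, Consumer<String> processor) {
        processors.put(name, processor);
    }

    /** Creates or replaces the binding for a URL; takes effect on the next crawl. */
    public void bind(String url, String parserName, String processorName) {
        bindings.put(url, new Binding(parserName, processorName));
    }

    /** Crawl, then parse and process with whatever modules are currently bound. */
    public void crawl(String url) {
        Binding binding = bindings.get(url);
        if (binding == null) {
            throw new IllegalStateException("No modules bound to " + url);
        }
        Function<String, String> parser = parsers.get(binding.parserName);
        Consumer<String> processor = processors.get(binding.processorName);
        if (parser == null || processor == null) {
            throw new IllegalStateException("Unknown module bound to " + url);
        }
        String page = downloader.apply(url);   // step 1: crawl the page
        String data = parser.apply(page);      // step 2: parse with the bound module
        processor.accept(data);                // step 3: process with the bound module
    }

    public static void main(String[] args) {
        List<String> sink = new ArrayList<>();
        // Stub downloader: returns fixed HTML instead of fetching the URL.
        UrlBoundCrawler crawler = new UrlBoundCrawler(url -> "<title>Hello</title>");
        crawler.registerParser("titleParser",
                html -> html.replaceAll(".*<title>(.*)</title>.*", "$1"));
        crawler.registerProcessor("collect", sink::add);
        crawler.registerProcessor("collectUpper", s -> sink.add(s.toUpperCase()));

        String url = "https://example.com/a";
        crawler.bind(url, "titleParser", "collect");
        crawler.crawl(url);
        // Rebinding changes the behaviour without touching crawl().
        crawler.bind(url, "titleParser", "collectUpper");
        crawler.crawl(url);
        System.out.println(sink); // prints [Hello, HELLO]
    }
}
```

In this sketch, adding a new crawling requirement means registering modules and writing a binding row, while `crawl` itself stays unchanged. That is the kind of flexibility the abstract attributes to keeping the bindings in a database.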

Description

Technical Field

[0001] The present application relates to the technical field of data processing, and in particular to a web page information processing method, system, electronic device and storage medium.

Background

[0002] Crawler technology automatically crawls specific web page resources according to certain rules. Because the web pages of each website differ, crawling a different web page resource requires custom development for that resource.

[0003] In the existing crawler system, shown in Figure 1, a dispatch center controls multiple crawling lines. Each crawling line crawls one web page resource and comprises three modules: a web page downloading module, a web page parsing module and a data processing module. The crawling lines are independent of each other. When a requirement to acquire a further web page resource is added, a program for web...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/955
Inventor: 曹功源
Owner: 江苏运满满信息科技有限公司