MHTML file analysis processing method and device, electronic equipment and medium

A technology for parsing and processing files, applied in the field of data processing. It addresses the problem that text content and image information cannot be extracted directly from MHTML files, which hinders appraisers' batch processing and analysis of data.

Pending Publication Date: 2022-05-10
北京网神洞鉴科技有限公司

AI Technical Summary

Problems solved by technology

[0003] However, customers often need to summarize and analyze the remotely preserved data afterwards, and the characteristics of MHTML mean that the appraiser cannot directly extract the text content and picture information from an MHTML file by reading the file content, which poses an obstacle to the appraiser's batch processing and analysis of the data.

Detailed Description of the Embodiments

[0038] To make the purpose, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some of the embodiments of the present invention, not all of them. All other embodiments obtained by persons of ordinary skill in the art on the basis of these embodiments, without creative effort, fall within the protection scope of the present invention.

[0039] Figure 1 is a schematic flow chart of the method for parsing and processing MHTML files provided by the present invention. As shown in Figure 1, the method includes:

[0040] S110. Acquire the MHTML file to be parsed;

[0041] S120. Obtain the content type of the target paragraph based on the content type tag of the target paragraph, and obtain the encryption method of the...

Abstract

The invention provides an MHTML file parsing and processing method and device, electronic equipment and a medium. The method comprises: obtaining the content type of a target paragraph based on the content type label of the target paragraph, and obtaining the encryption mode of the target paragraph based on the encryption type label of the target paragraph; determining a data reading mode based on the content type, and reading data from the target paragraph using that data reading mode; calling a preset library function based on the encryption mode, and decoding the read data with the library function to obtain the parsed result of the target paragraph; and obtaining the parsed result of the MHTML file based on the parsed results of all paragraphs in the MHTML file. The method, device, electronic equipment and medium provided by the embodiments of the invention enable automatic batch processing of MHTML files and obtain the text content and the picture content separately.
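The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that the "encryption type label" corresponds to the MIME `Content-Transfer-Encoding` header, that the "content type label" corresponds to `Content-Type`, and that the "preset library functions" can be stood in for by the standard-library `base64` and `quopri` decoders. The names `decode_part` and `parse_mhtml` are hypothetical.

```python
import base64
import quopri
from email import message_from_bytes


def decode_part(raw: bytes, encoding: str) -> bytes:
    """Choose a standard-library decoder from the part's transfer encoding."""
    encoding = encoding.strip().lower()
    if encoding == "base64":
        return base64.b64decode(raw)
    if encoding == "quoted-printable":
        return quopri.decodestring(raw)
    # 7bit, 8bit and binary payloads are stored without extra encoding.
    return raw


def parse_mhtml(data: bytes) -> dict:
    """Split an MHTML document into decoded text parts and image parts."""
    message = message_from_bytes(data)
    result = {"text": [], "images": []}
    for part in message.walk():
        if part.is_multipart():
            continue  # container parts hold no payload of their own
        content_type = part.get_content_type()
        encoding = part.get("Content-Transfer-Encoding", "7bit")
        location = part.get("Content-Location")
        # The undecoded payload is a str; recover the original bytes.
        raw = part.get_payload(decode=False).encode("ascii", "surrogateescape")
        decoded = decode_part(raw, encoding)
        if content_type.startswith("text/"):
            # Assumption: fall back to UTF-8 when no charset is declared.
            charset = part.get_content_charset() or "utf-8"
            content = decoded.decode(charset, errors="replace")
            result["text"].append({"location": location, "content": content})
        elif content_type.startswith("image/"):
            # Images are kept as bytes so they can be written to disk as-is.
            result["images"].append({"location": location, "data": decoded})
    return result
```

Calling `parse_mhtml(open(path, "rb").read())` on each file in a directory would give the batch behaviour described above, with text and picture content returned separately. In practice `part.get_payload(decode=True)` performs the same transfer decoding in one call; the explicit dispatch is kept here only to mirror the steps in the abstract.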

Description

Technical field

[0001] The invention relates to the technical field of data processing, and in particular to a method, device, electronic equipment and medium for parsing and processing MHTML files.

Background

[0002] In electronic data appraisal, customers often need the pages of a given website to be preserved. When the customer needs only the text content of a web page, the appraiser usually saves the page locally in HTML format. If the customer needs the complete web page content preserved, the appraiser generally saves the page manually as a single file in MHTML format.

[0003] However, customers often need to summarize and analyze the remotely preserved data afterwards, and the characteristics of MHTML mean that the appraiser cannot directly extract the text content and picture information from the MHTML file by reading the file content, so that...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/958, G06F21/60, G06F21/62
CPC: G06F16/986, G06F21/602, G06F21/6218
Inventor: 石文良, 张庆纲, 姚超, 徐志强
Owner: 北京网神洞鉴科技有限公司