Marker file parsing method and device

A markup file parsing technology, applied in the field of data parsing, that addresses the low success rate of parsing HTML web pages and improves the parsing success rate.

Active Publication Date: 2014-02-12
BEIJING QIHOO TECH CO LTD

AI Technical Summary

Problems solved by technology

[0014] The technical problem to be solved by this application is to provide a method and device for parsing markup files, so as to effectively solve the prior-art problem of a low success rate when parsing HTML web pages.

Embodiment Construction

[0072] In order to make the above objectives, features and advantages of the application more obvious and understandable, the application will be further described in detail below in conjunction with the drawings and specific implementations.

[0073] At present, using markup languages to describe or store data has become the most important way of presenting and storing data. Examples include HTML, HTML5, eXtensible HyperText Markup Language (XHTML), and Extensible Markup Language (XML). One of the most important features of these markup languages is that they use a set of markup tags to organize or store data. The markup files described in this application refer to files that organize data with markup tags.

[0074] Referring to Figure 1, which shows a schematic flow chart of a method for parsing a markup file in this application, the method is specifically as follows:

[0075] Step 101: Obtain label objects in the markup file to generate a label set....
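
As an illustration of Step 101 only, the following is a minimal Python sketch of collecting tag objects into a tag set. It assumes the markup file is HTML and uses the standard library's html.parser module; the names TagCollector and build_tag_set, and the dict layout of a "tag object", are hypothetical and are not taken from the application.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collects every start tag as a small dict (a hypothetical "tag object")."""

    def __init__(self):
        super().__init__()
        self.tag_set = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs; store it as a dict.
        self.tag_set.append({"tag": tag, "attrs": dict(attrs)})


def build_tag_set(markup):
    """Parse a markup string and return the list of collected tag objects."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tag_set
```

Calling build_tag_set on the text of a page returns the tags in document order, each with its attributes, which is the form a later grouping step could consume.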

Abstract

The invention provides a markup file parsing method and device for solving the prior-art problem of a low success rate when parsing markup files. The method includes: acquiring tag objects in the markup file to generate a tag set; grouping the tag objects according to their common attributes; acquiring one or more tag groups from the grouping results; matching the attributes of the tag objects in the one or more tag groups against a preset markup file parsing mapping table; and acquiring the data for file parsing from the matched tag groups. Grouping the tag objects by their common attributes establishes a correlation among the originally unordered tag objects in the markup file, which greatly facilitates the subsequent matching analysis and effectively improves the success rate of parsing the markup file.
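
The grouping and matching steps summarized above can be sketched in Python as follows. This is a sketch under stated assumptions, not the claimed implementation: the tag-object layout, the choice of the HTML form attribute as the common attribute, the contents of MAPPING_TABLE, and the function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical mapping table: field role -> attribute values that identify it.
MAPPING_TABLE = {
    "username": {"type": {"text", "email"}},
    "password": {"type": {"password"}},
}


def group_tags(tag_set, key):
    """Group tag objects by the value of one shared attribute."""
    groups = defaultdict(list)
    for tag in tag_set:
        if key in tag["attrs"]:
            groups[tag["attrs"][key]].append(tag)
    return dict(groups)


def match_group(group, mapping_table):
    """Return {role: tag} if every role in the table is found, else None."""
    matched = {}
    for role, required in mapping_table.items():
        for tag in group:
            if all(tag["attrs"].get(name) in allowed
                   for name, allowed in required.items()):
                matched[role] = tag
                break
        else:
            # No tag in this group satisfies the role, so the group fails.
            return None
    return matched


def parse(tag_set, key, mapping_table):
    """Group the tag set, keep groups that match, and extract field names."""
    results = {}
    for group_id, group in group_tags(tag_set, key).items():
        matched = match_group(group, mapping_table)
        if matched is not None:
            results[group_id] = {
                role: tag["attrs"].get("name") for role, tag in matched.items()
            }
    return results
```

For a tag set holding a text input and a password input that share form="login", plus a lone text input with form="search", parse(tag_set, "form", MAPPING_TABLE) keeps only the "login" group, because the "search" group has no tag matching the password role. Matching whole groups rather than isolated tags is what lets unrelated inputs on the same page be ruled out.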

Description

[0001] The present application is a divisional application of the Chinese invention patent application filed on March 30, 2012, with application number 201210091311.4 and the title "A method and device for analyzing marked documents".

Technical field

[0002] This application relates to the technical field of data parsing, and in particular to a method and device for parsing markup files.

Background technique

[0003] At present, Internet technology has deeply affected people's lives; applications such as e-mail, forums, and web games have become an indispensable part of people's daily work and entertainment. However, most of these Internet applications require users to register and log in before they can be used, so users need to memorize a large number of user names and passwords. For account security, users usually need to set a more complicated combination of numbers, letters, and special symbols, which further increases the difficulty of rememberi...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/957, G06F16/9577
Inventor: 杭程, 李超, 万勇, 任寰
Owner: BEIJING QIHOO TECH CO LTD