Three-stage XML (Extensible Markup Language) twig matching algorithm based on version trees

A matching algorithm and version-tree technique, applied in the field of information retrieval, that addresses the low time performance of existing XML twig matching algorithms and the need to post-process returned results, while keeping the algorithm's input size small and its scalability good.

Inactive Publication Date: 2011-10-12
姚美玲

AI Technical Summary

Problems solved by technology

[0006] The technical problem to be solved by the present invention: existing XML twig matching algorithms have relatively poor time performance, and the results they return need post-processing. To address this, a three-stage XML twig matching algorithm based on the version tree is proposed.

Examples

Embodiment Construction

[0026] To make the purpose, implementation, and advantages of the present invention clearer, the invention is described in further detail below.

[0027] Step 1 is specifically performed as follows:

[0028] Each query node contains the following attributes: tag name, parent query node, collection of child query nodes, and relationship with the parent query node (ancestor-descendant or parent-child). Each query twig contains the following components: a root node and a collection of leaf nodes. A query path expression is parsed as a string, from left to right. Depending on the token encountered during parsing, there are three cases: (1) "//" and "/" indicate that the query node to be parsed next stands in an ancestor-descendant or a parent-child relationship, respectively; (2) a tag name gives the name of the current query node; (3) "[" and "]" represent the beginn...
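The parsing rules above can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation. The names `QueryNode`, `parse_twig`, and `leaves` are hypothetical. Because the source text is truncated at case (3), the sketch assumes that "[" opens and "]" closes a branch predicate attached to the current query node, and that a step written without a leading axis inside a predicate is parent-child.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class QueryNode:
    tag: str                      # tag name of the query node
    axis: str                     # relationship with the parent: "AD" or "PC"
    parent: Optional["QueryNode"] = field(default=None, repr=False, compare=False)
    children: List["QueryNode"] = field(default_factory=list)


def parse_twig(expr: str) -> Optional[QueryNode]:
    """Parse a path expression left to right into a tree of query nodes."""
    expr = expr.replace(" ", "")  # tolerate spacing such as "/ /"
    root = None
    current = None                # node the next step attaches to
    branch_points = []            # nodes at which a "[" was opened (assumption)
    axis = "PC"
    i = 0
    while i < len(expr):
        if expr.startswith("//", i):      # case (1): ancestor-descendant
            axis, i = "AD", i + 2
        elif expr[i] == "/":              # case (1): parent-child
            axis, i = "PC", i + 1
        elif expr[i] == "[":              # case (3): assumed start of a branch
            branch_points.append(current)
            axis, i = "PC", i + 1
        elif expr[i] == "]":              # case (3): assumed end of a branch
            current = branch_points.pop()
            i += 1
        else:                             # case (2): a tag name
            j = i
            while j < len(expr) and expr[j] not in "/[]":
                j += 1
            node = QueryNode(expr[i:j], axis, current)
            if current is None:
                root = node
            else:
                current.children.append(node)
            current, axis, i = node, "PC", j
    return root


def leaves(node: QueryNode) -> List[QueryNode]:
    """Collect the leaf query nodes of the twig rooted at `node`."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]
```

For example, `parse_twig("//a/b[//c]/d")` yields a root `a` with one child `b`, and `b` has two children: `c` (ancestor-descendant) and `d` (parent-child). The twig is then described by its root `a` and its leaf collection `c`, `d`.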

Abstract

The invention relates to Twig3Version, a three-stage XML (Extensible Markup Language) twig matching algorithm based on version trees. Its execution is divided into three stages. Stage 1, structural matching: TwigStack, an existing XML twig matching algorithm that runs on the original document, is used to match the structure of the query twig against a compressed index structure (the version tree), yielding all version-tree sub-trees that satisfy the structural constraints of the query twig. Stage 2, version filtering: in the matched sub-trees returned by stage 1, version numbers that do not satisfy the structural constraints of the query twig are filtered out. Stage 3, merge join: the final twig matches are obtained by merging the XML document elements that correspond to the version numbers returned by stage 2. The algorithm combines the advantages of the compact version tree structure, the TwigStack algorithm, and the TJFast algorithm: structural matching runs on the compact version tree, and an efficient, simple version filtering module runs on the reduced intermediate result, so query performance is greatly improved. As XML is used ever more widely as a data format, extracting useful information from it has become an unavoidable problem. The method helps users quickly extract the information they are interested in from a large number of XML data sources.
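The three-stage flow can be illustrated with a toy Python sketch. It rests on assumptions that the abstract does not confirm: each version-tree node is assumed to carry a set of version numbers, a structural match is assumed to hold for a version only if every matched node carries that version, and document elements are assumed to be retrievable by (tag, version). Stage 1 uses a naive recursive matcher as a stand-in, not TwigStack. All names (`VNode`, `structural_match`, `version_filter`, `merge_join`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Set, Tuple


@dataclass
class VNode:
    tag: str
    versions: Set[int]            # assumed per-node version numbers
    children: List["VNode"] = field(default_factory=list)


def descendants(node: VNode) -> Iterator[VNode]:
    for child in node.children:
        yield child
        yield from descendants(child)


# A query twig is a nested tuple: (tag, [(axis, sub_twig), ...]),
# where axis is "PC" (parent-child) or "AD" (ancestor-descendant).
def match_at(node: VNode, twig) -> List[List[VNode]]:
    tag, branches = twig
    if node.tag != tag:
        return []
    partial = [[node]]
    for axis, sub in branches:
        candidates = node.children if axis == "PC" else list(descendants(node))
        sub_matches = [m for c in candidates for m in match_at(c, sub)]
        partial = [p + m for p in partial for m in sub_matches]
    return partial


def structural_match(root: VNode, twig) -> List[List[VNode]]:
    """Stage 1 stand-in: naive structural matching on the version tree."""
    return [m for n in [root, *descendants(root)] for m in match_at(n, twig)]


def version_filter(match: List[VNode]) -> Set[int]:
    """Stage 2 (assumed rule): keep versions carried by every matched node."""
    return set.intersection(*(n.versions for n in match))


def merge_join(match: List[VNode], versions: Set[int],
               elements: Dict[Tuple[str, int], str]) -> List[Dict[str, str]]:
    """Stage 3 (assumed lookup): one result per surviving version number."""
    return [{n.tag: elements[(n.tag, v)] for n in match} for v in sorted(versions)]


# Tiny example: node "c" exists only in version 2, so version 1 is filtered out.
tree = VNode("a", {1, 2}, [VNode("b", {1, 2}), VNode("c", {2})])
twig = ("a", [("PC", ("b", [])), ("PC", ("c", []))])
elements = {("a", 2): "a@2", ("b", 2): "b@2", ("c", 2): "c@2"}

results = []
for m in structural_match(tree, twig):
    results.extend(merge_join(m, version_filter(m), elements))
```

In this example the single structural match on the tree survives only for version 2, so `results` holds one twig match built from the version-2 elements. The point of the design described in the abstract is that the expensive structural step runs once on the compact tree, and per-version work is deferred to the cheaper filtering and merging steps.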

Description

Technical field

[0001] The invention provides a version-tree-based three-stage XML twig matching algorithm, used mainly in the field of information retrieval to help users quickly extract the information they are interested in from XML data sources.

Background

[0002] With the rapid development of network applications, a large amount of data conforming to the XML specification (called XML data) now exists. In particular, the further development of e-commerce, Web services, digital libraries, and other application fields has made XML the current mainstream data format. How to conveniently extract the information users are interested in from a large amount of XML data has become an important topic in XML data management research.

[0003] In most XML query languages (such as XPath and XQuery), the structure of XML documents is represented as a tree model, and the values of XML elements are ...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
Inventor: 姚美玲
Owner: 姚美玲