Looking for breakthrough ideas for innovation challenges? Try Patsnap Eureka!

Correctness verification method and system of suffix array

A technology of correctness verification and suffix array, which is used in electrical digital data processing, special data processing applications, natural language data processing and other directions to achieve the effect of reducing time and space overhead

Active Publication Date: 2017-08-04
SYSU CMU SHUNDE INT JOINT RES INST +1
View PDF6 Cites 7 Cited by
  • Summary
  • Abstract
  • Description
  • Claims
  • Application Information

AI Technical Summary

Problems solved by technology

As the size of the data set continues to grow, the correctness verification time of the suffix array will even exceed the construction time, and the existing verification methods are no longer fully applicable

Method used

the structure of the environmentally friendly knitted fabric provided by the present invention; figure 2 Flow chart of the yarn wrapping machine for environmentally friendly knitted fabrics and storage devices; image 3 Is the parameter map of the yarn covering machine
View more

Image

Smart Image Click on the blue labels to locate them in the text.
Viewing Examples
Smart Image
  • Correctness verification method and system of suffix array
  • Correctness verification method and system of suffix array

Examples

Experimental program
Comparison scheme
Effect test

Embodiment 1

[0026] figure 1 It is a schematic flowchart of a correctness verification method of a suffix array in an embodiment. Such as figure 1 As shown, a correctness verification method of a suffix array, including:

[0027] S101: Scan T once from right to left, compare the size of the currently scanned character T[i] and the subsequent character T[i+1] according to the definition of the suffix type, and calculate the character T[i] and suf(T,i) of T ), recorded in t[i];

[0028] S102: Scan T from left to right to find out the positions where all LMS characters appear, so as to obtain the first character pointers of all LMS substrings, and use the array P1 to record;

[0029] S103: According to the arrays P1, B, and SA, sort the LMS substrings of T by using the inductive sorting method, and save the result in the array SA1;

[0030] The specific steps of sorting the LMS substring of T in the step S103 are as follows:

[0031]31) Initialize the value of all elements of SA to be -1...

Embodiment 2

[0048] figure 2 It is a schematic structural diagram of a system for verifying the correctness of a suffix array in an embodiment. Such as figure 2 As shown, a correctness verification system of a suffix array includes: a character string reading module 1, an L / S suffix recognition module 2, an LMS suffix recognition module 3, an LMS substring sorting module 4, an LMS substring naming module 5, String contraction module 6, L-type suffix sorting module 7 and S-type suffix sorting and LMS suffix verification module 8; string reading module 1, used to read strings; L / S suffix recognition module 2, used to identify characters The string suffix type is L-type or S-type; the LMS suffix recognition module 3 is used to identify the LMS suffix in the S-type suffix; the LMS substring sorting module 4 is used to sort the LMS substrings by inductive sorting; LMS substring naming Module 5, naming the LMS substring. Compare the adjacent LMS substrings in the ordered LMS substrings, if ...

the structure of the environmentally friendly knitted fabric provided by the present invention; figure 2 Flow chart of the yarn wrapping machine for environmentally friendly knitted fabrics and storage devices; image 3 Is the parameter map of the yarn covering machine
Login to View More

PUM

No PUM Login to View More

Abstract

The invention relates to a correctness verification method and system of a suffix array. The method includes the steps that T is scanned once from right to left, the size of a currently scanned character T[i] and the size of a subsequent character T[i+1] are compared according to the definition of suffix types, and the types of the characters T[i] and suf(T, i) of T are calculated and recorded in t[i]; T is scanned once from left to right, the positions of all LMS characteristics are found, and first-character pointers of all LMS sub-strings are obtained accordingly and recorded with an array P1; according to arrays P1, B and SA, the LMS sub-strings of T are sorted through an inducting and sorting method, and the result is saved in an array SA1; SA is scanned from left to right, and if SA[i] is of an LMS type, SA[i] is saved in SA1; whether characters in T1 are unique or not is judged, if yes, SA1 is directly calculated according to the name of T1, and an array C is updated through SA1; the suffix array SA of T is inductively calculated according to T1 and SA1, the correctness of SA is verified through the array C in the calculation process, and if SA is correct, the array C is updated through SA.

Description

technical field [0001] The present invention relates to the field of verification of suffix arrays, and more specifically, to a method and system for verifying the correctness of suffix arrays. Background technique [0002] Suffix array refers to a data structure that can realize the equivalent of suffix tree in a smaller space. It is a compact replacement of suffix tree and is widely used in many fields such as string processing, biological information retrieval, data compression, and pattern matching. Any given a string, the character substring composed of all characters from any position in it to the end is called the suffix of the string. Obviously, a string with a length of n contains n suffixes, sort these n suffixes in lexicographical order, and store their addresses in an integer array, which is called the suffix array of the string. [0003] The existing correctness verification method of the suffix array is to perform two rounds of integer sorting to verify the co...

Claims

the structure of the environmentally friendly knitted fabric provided by the present invention; figure 2 Flow chart of the yarn wrapping machine for environmentally friendly knitted fabrics and storage devices; image 3 Is the parameter map of the yarn covering machine
Login to View More

Application Information

Patent Timeline
no application Login to View More
Patent Type & Authority Applications(China)
IPC IPC(8): G06F17/22
CPCG06F40/194
Inventor 韩凌波农革徐文涛
Owner SYSU CMU SHUNDE INT JOINT RES INST
Who we serve
  • R&D Engineer
  • R&D Manager
  • IP Professional
Why Patsnap Eureka
  • Industry Leading Data Capabilities
  • Powerful AI technology
  • Patent DNA Extraction
Social media
Patsnap Eureka Blog
Learn More
PatSnap group products