A trie-based spatial keyword query method and device
A spatial keyword query method with applications in instruments, unstructured text data retrieval, computing, and related fields. It addresses constraints on retrieval efficiency, avoids the multi-path query problem, and keeps storage space overhead low.
Examples
Embodiment 1
[0043] Embodiment 1: As shown in Figures 1 to 7, a Trie-based spatial keyword query method includes: a data preprocessing step, in which every location point in data set D is encoded into a string geoStr of length n, each row of data set D is sorted lexicographically by the suffix ssuf of its geoStr string, and an ID number is generated for each row. Each row of data is called a record r, and a data set consisting of one or more records r is called a record set R. Here ssuf denotes the last n-m characters of geoStr, where m ≤ n and m is the number of characters in the prefix part of geoStr;
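The preprocessing step can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes geoStr has already been computed for each row, and the names Record and preprocess, the sample strings, and the choice to number IDs from 1 are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Record:
    id: int          # ID assigned after sorting (starting at 1 is an assumption)
    spre: str        # prefix: first m characters of geoStr
    ssuf: str        # suffix: last n - m characters of geoStr
    keywords: tuple  # keywords attached to this row


def preprocess(rows, m):
    """Build the record set R from (geoStr, keywords) rows.

    Rows are sorted lexicographically by the suffix ssuf, then numbered.
    """
    rows = list(rows)
    n = len(rows[0][0]) if rows else 0
    if any(len(geo_str) != n for geo_str, _ in rows):
        raise ValueError("every geoStr must have the same length n")
    if not 0 <= m <= n:
        raise ValueError("m must satisfy 0 <= m <= n")
    # sorted() is stable, so rows with equal suffixes keep their input order.
    ordered = sorted(rows, key=lambda row: row[0][m:])
    return [
        Record(i, geo_str[:m], geo_str[m:], tuple(keywords))
        for i, (geo_str, keywords) in enumerate(ordered, start=1)
    ]


# Illustrative rows only; these geoStr values are made up for the example.
sample_rows = [
    ("9g3rqzz", ["garden"]),
    ("9g3rqb0", ["historicalSite"]),
    ("9g3rw5k", ["garden", "museum"]),
]
record_set = preprocess(sample_rows, m=5)
```

With n = 7 and m = 5, each record carries a 5-character prefix for the Trie and a 2-character suffix that determines its position, and therefore its ID, in the sorted record set.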
[0044] a spatial keyword index building step, in which a Trie is constructed over the string prefixes spre, and each leaf node of the Trie points to an inverted index built from the keywords in the field. Each list element of the inverted index is a keyword together with its corresponding ID list, which yields the spatial keyword index struc...
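One way to picture this index is a Trie keyed on the characters of spre whose leaves each hold a keyword-to-ID-list mapping. The sketch below assumes all prefixes have the same length m, so every leaf sits at depth m; TrieNode, build_index, lookup, and the sample records are hypothetical names and data, not taken from the patent.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # next character -> TrieNode
        self.inverted = None  # leaves only: keyword -> list of record IDs


def build_index(records):
    """records: iterable of (record_id, spre, keywords) tuples."""
    root = TrieNode()
    for rec_id, spre, keywords in records:
        node = root
        for ch in spre:
            node = node.children.setdefault(ch, TrieNode())
        if node.inverted is None:
            node.inverted = {}
        for keyword in keywords:
            # IDs arrive in ascending order when records are fed in ID order.
            node.inverted.setdefault(keyword, []).append(rec_id)
    return root


def lookup(root, spre):
    """Return the inverted index stored under prefix spre, or {} if absent."""
    node = root
    for ch in spre:
        node = node.children.get(ch)
        if node is None:
            return {}
    return node.inverted or {}


# Illustrative records only.
index = build_index([
    (1, "9g3rw", ("garden", "museum")),
    (2, "9g3rq", ("historicalSite",)),
    (3, "9g3rq", ("garden",)),
])
leaf = lookup(index, "9g3rq")
```

Because records that share a prefix share a single leaf, looking up one geohash cell follows exactly one path through the Trie before any keyword list is read.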
Embodiment 2
[0106] Embodiment 2: Embodiment 1 gives the specific implementation process for the case d ≤ d1; this embodiment covers the case d > d1, using the data of Embodiment 1, as follows. Given the query location point (19.596412, -99.219501), a query distance range of 2000 meters, and the query keywords {historicalSite, garden}, the geohash precision table shows that the code length p must be the one whose corresponding distance error is not less than 2000 and is the minimum such value, so p is set to 5, and (19.596412, -99.219501) is encoded by the geohash algorithm into the 5-character string 9g3rq. The geohash codes of the eight regions surrounding 9g3rq are 9g3rw, 9g3rx, 9g3rr, 9g3rp, 9g3rn, 9g3rj, 9g3rm, and 9g3rt; 9g3rq and its eight surrounding regions together form the query domain. Because 2000 > 610, the Trie is searched by geohash code to select the inverted lists to be retrieved, and the selected inverted lists are then retrieved to obtain, respectively, the id list containing ...
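The construction of the query domain in this example can be reproduced with a short sketch. The geohash encoding itself is the standard algorithm; the error values in GEOHASH_ERROR_M are the commonly quoted approximate figures, assumed here because the patent's own precision table is not reproduced in this excerpt, and the function names are hypothetical. Cells at the poles are simply skipped.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

# Approximate error in metres per code length (assumed, commonly quoted values).
GEOHASH_ERROR_M = {1: 2_500_000, 2: 630_000, 3: 78_000, 4: 20_000,
                   5: 2_400, 6: 610, 7: 76, 8: 19}


def choose_precision(distance_m):
    """Code length whose error is the smallest value not less than distance_m."""
    fits = [p for p, err in GEOHASH_ERROR_M.items() if err >= distance_m]
    return max(fits) if fits else 1


def geohash_encode(lat, lon, precision):
    """Standard geohash: interleave longitude and latitude bisection bits."""
    bounds = {"lon": [-180.0, 180.0], "lat": [-90.0, 90.0]}
    point = {"lon": lon, "lat": lat}
    chars, value, bits, axis = [], 0, 0, "lon"
    while len(chars) < precision:
        lo, hi = bounds[axis]
        mid = (lo + hi) / 2
        bit = 1 if point[axis] >= mid else 0
        bounds[axis][1 - bit] = mid  # bit 1 raises the lower bound, bit 0 lowers the upper
        value, bits = (value << 1) | bit, bits + 1
        axis = "lat" if axis == "lon" else "lon"
        if bits == 5:
            chars.append(BASE32[value])
            value, bits = 0, 0
    return "".join(chars)


def geohash_bbox(code):
    """Return (lat_lo, lat_hi, lon_lo, lon_hi) of the cell named by code."""
    bounds = {"lon": [-180.0, 180.0], "lat": [-90.0, 90.0]}
    axis = "lon"
    for ch in code:
        idx = BASE32.index(ch)
        for shift in range(4, -1, -1):
            bit = (idx >> shift) & 1
            bounds[axis][1 - bit] = sum(bounds[axis]) / 2
            axis = "lat" if axis == "lon" else "lon"
    return (*bounds["lat"], *bounds["lon"])


def query_domain(lat, lon, distance_m):
    """Return the centre cell and the set of it plus its surrounding cells."""
    p = choose_precision(distance_m)
    center = geohash_encode(lat, lon, p)
    lat_lo, lat_hi, lon_lo, lon_hi = geohash_bbox(center)
    height, width = lat_hi - lat_lo, lon_hi - lon_lo
    c_lat, c_lon = (lat_lo + lat_hi) / 2, (lon_lo + lon_hi) / 2
    cells = set()
    for d_lat in (-1, 0, 1):
        n_lat = c_lat + d_lat * height
        if not -90.0 < n_lat < 90.0:
            continue  # no neighbour beyond a pole
        for d_lon in (-1, 0, 1):
            n_lon = (c_lon + d_lon * width + 180.0) % 360.0 - 180.0
            cells.add(geohash_encode(n_lat, n_lon, p))
    return center, cells


center, cells = query_domain(19.596412, -99.219501, 2000)
```

For the example point, center is 9g3rq and cells contains it together with the eight surrounding codes listed above. Each code in cells would then be used as a key into the Trie to select the inverted lists to retrieve.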