
Mail indexing and searching using hierarchical caches

A hierarchical cache and indexing technology, applied in the field of network data processing, that addresses the difficulty of quickly indexing mailboxes and the high cost of storing vast quantities of mail on the mailbox server.

Active Publication Date: 2016-07-19
ALIBABA GRP HLDG LTD

AI Technical Summary

Benefits of technology

[0060] The embodiment of the present application adopts an approach involving level 1 caches, level 2 cache files, and level 3 inverted index files. The inverted index records established for mail messages are saved first to the level 1 cache; when the level 1 cache reaches a first preset threshold value, all of the level 1 inverted index records in the level 1 cache are transferred to the level 2 cache file; and when the level 2 cache files reach a second preset threshold value, the level 2 inverted index records therein are transferred to the level 3 inverted index files. This approach avoids the excessive number of hard disk write operations that would otherwise result from the numerous inverted index records for mail. Inverted index records are therefore written faster while mail indices are being established, which both speeds up index construction and improves disk IO performance by reducing the number of write operations on the disk.
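The flow described in this paragraph can be sketched as a toy model. This is only an illustration under stated assumptions, not the patented implementation: the class name, the attribute names, and the choice to measure thresholds as record counts are all hypothetical, and plain Python containers stand in for the level 2 cache file and the level 3 inverted index files.

```python
from collections import defaultdict


class TieredIndex:
    """Toy model of the level 1 -> level 2 -> level 3 flow (all names hypothetical)."""

    def __init__(self, l1_limit, l2_limit):
        self.l1_limit = l1_limit     # stands in for the first preset threshold
        self.l2_limit = l2_limit     # stands in for the second preset threshold
        self.l1 = []                 # level 1 cache, held in memory
        self.l2 = []                 # stand-in for the level 2 cache file
        self.l3 = defaultdict(list)  # stand-in for level 3 files, keyed by keyword
        self.l3_writes = 0           # number of batched writes that reached level 3

    def add_mail(self, mail_id, keywords):
        # New inverted index records always land in level 1 first.
        for keyword in keywords:
            self.l1.append((keyword, mail_id))
        if len(self.l1) >= self.l1_limit:
            self._flush_l1()

    def _flush_l1(self):
        # Move every level 1 record to level 2 in one batch.
        self.l2.extend(self.l1)
        self.l1.clear()
        if len(self.l2) >= self.l2_limit:
            self._flush_l2()

    def _flush_l2(self):
        # Move every level 2 record to the level 3 entry for its keyword.
        for keyword, mail_id in self.l2:
            self.l3[keyword].append(mail_id)
        self.l2.clear()
        self.l3_writes += 1


index = TieredIndex(l1_limit=4, l2_limit=8)
for mail_id in range(1, 7):
    index.add_mail(mail_id, ["invoice", "march"])

print(len(index.l1), len(index.l2), index.l3_writes)  # prints: 0 4 1
```

In this run twelve records are indexed, yet level 3 is written only once, which is the batching effect the paragraph attributes to the tiered design.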
[0061] Furthermore, part of the mail index in the embodiment of the present application is saved in the level 1 cache, which is implemented using low-latency memory. Consequently, the hard disk does not store the entire mail index. If new mail is processed frequently, the in-memory buffer therefore prevents it from having an excessive impact on disk writing. Moreover, by speeding up access to the newest data, the approach can achieve the goal of real-time searching.
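One way to picture why keeping the newest records in memory supports real-time searching is a lookup that consults every tier, starting with the in-memory one. The sketch below is an assumption made for illustration: the excerpt does not spell out the search procedure, and the function name, the dictionary-per-tier layout, and the sample data are all hypothetical.

```python
def search(keyword, level1, level2, level3):
    """Collect mail ids for `keyword` from each tier, newest tier first, without duplicates."""
    seen = set()
    hits = []
    for tier in (level1, level2, level3):
        for mail_id in tier.get(keyword, []):
            if mail_id not in seen:
                seen.add(mail_id)
                hits.append(mail_id)
    return hits


# Hypothetical tier contents: keyword -> list of mail ids.
level1 = {"invoice": [7]}                    # newest records, still in memory
level2 = {"invoice": [5, 6]}                 # records in the level 2 cache file
level3 = {"invoice": [1, 5], "meeting": [2]} # records in level 3 index files

results = search("invoice", level1, level2, level3)
print(results)  # prints: [7, 5, 6, 1]
```

Mail 7 is found even though its record has not yet been written to disk, which is the behaviour the paragraph describes as real-time searching.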
[0063] In this example, a third preset threshold value is also established for level 3 inverted index files; when the size of a level 3 inverted index file reaches the third preset threshold value, it is split into multiple (e.g., two) inverted index sub-files, each of which must be no larger than the third preset threshold value. This guarantees that no level 3 inverted index file becomes excessively large and thus preserves the access speed for mail indices.
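The splitting rule can be sketched as follows. This is a minimal illustration assuming that file size is measured in records, that a file is split at its midpoint, and that `limit` is at least 1; the excerpt does not say how the split point is chosen, and the function name is hypothetical.

```python
def split_index_file(records, limit):
    """Split `records` into sub-files once its size reaches `limit`.

    Returns a list of sub-files (lists of records), none larger than `limit`.
    Assumes limit >= 1.
    """
    if len(records) < limit or len(records) < 2:
        return [records]  # below the threshold, or too small to split
    mid = len(records) // 2
    result = []
    for half in (records[:mid], records[mid:]):
        if len(half) > limit:
            # A half that is still too large is split again.
            result.extend(split_index_file(half, limit))
        else:
            result.append(half)
    return result


sub_files = split_index_file(list(range(10)), limit=10)
print([len(part) for part in sub_files])  # prints: [5, 5]
```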
[0058] In some embodiments, the writing speed is increased by employing an append mode to perform writing operations on the level 3 inverted index files. Thus, in some embodiments, when the level 2 inverted index records are written into the level 3 inverted index files, all the inverted index records in the level 2 cache file are written in append mode to the level 3 inverted index files determined according to the keywords. Append mode is a standard file-writing mode in which new data is added directly to the end of the file.
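Append-mode writing can be illustrated with Python's built-in `open` and mode `"a"`. The sketch assumes one file per keyword named `<keyword>.idx`, one mail id per line, and keywords that are safe to use as file names; none of these details come from the patent text, and the function name is hypothetical.

```python
import os
import tempfile
from collections import defaultdict


def append_records(index_dir, records):
    """Append (keyword, mail_id) records to the file chosen for each keyword."""
    by_keyword = defaultdict(list)
    for keyword, mail_id in records:
        by_keyword[keyword].append(mail_id)
    for keyword, mail_ids in by_keyword.items():
        # Hypothetical keyword -> file mapping: one ".idx" file per keyword.
        path = os.path.join(index_dir, f"{keyword}.idx")
        # Mode "a" creates the file if needed and writes at the end of it,
        # so existing records are never rewritten.
        with open(path, "a", encoding="utf-8") as f:
            for mail_id in mail_ids:
                f.write(f"{mail_id}\n")


with tempfile.TemporaryDirectory() as index_dir:
    append_records(index_dir, [("invoice", 1), ("meeting", 1)])
    append_records(index_dir, [("invoice", 2)])
    with open(os.path.join(index_dir, "invoice.idx"), encoding="utf-8") as f:
        invoice_postings = f.read().split()

print(invoice_postings)  # prints: ['1', '2']
```

The second call adds mail 2 after mail 1 without touching the earlier contents of the file.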

Problems solved by technology

Storing such large volumes of data requires large amounts of hard disk IO resources, making it difficult or even impossible to index mailboxes quickly.
Furthermore, the storage costs of vast quantities of mail are very high for mailbox servers.



Embodiment Construction

[0027] The invention can be implemented in numerous ways, including as a process; an apparatus; a system; a composition of matter; a computer program product embodied on a computer readable storage medium; and/or a processor, such as a processor configured to execute instructions stored on and/or provided by a memory coupled to the processor. In this specification, these implementations, or any other form that the invention may take, may be referred to as techniques. In general, the order of the steps of disclosed processes may be altered within the scope of the invention. Unless stated otherwise, a component such as a processor or a memory described as being configured to perform a task may be implemented as a general component that is temporarily configured to perform the task at a given time or a specific component that is manufactured to perform the task. As used herein, the term ‘processor’ refers to one or more devices, circuits, and/or processing cores configured to process da...



Abstract

Electronic mail message processing includes: obtaining a set of keywords associated with an electronic mail message; updating, based at least in part on the set of keywords, a set of inverted index records stored in a level 1 cache; determining whether the size of the set of inverted index records stored in the level 1 cache exceeds a first preset threshold value; in the event that the first preset threshold value is exceeded, transferring the set of inverted index records in the level 1 cache to a level 2 cache; determining whether the size of a level 2 cache file exceeds a second preset threshold value; and, in the event that the second preset threshold value is exceeded, transferring, according to a path file, inverted index records in the level 2 cache file to a level 3 cache storing a set of inverted index files.
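The first two steps of the abstract, obtaining keywords and updating the inverted index records in the level 1 cache, can be sketched as below. The abstract does not say how keywords are obtained, so the lowercase alphanumeric tokenizer is purely an assumption, as are the function names and the dictionary layout.

```python
import re
from collections import defaultdict


def extract_keywords(text):
    """Hypothetical keyword extraction: distinct lowercase alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def update_level1(level1, mail_id, text):
    """Add `mail_id` to the inverted index record of every keyword in `text`."""
    for keyword in extract_keywords(text):
        level1[keyword].append(mail_id)


level1 = defaultdict(list)  # keyword -> list of mail ids containing it
update_level1(level1, 1, "Invoice for March")
update_level1(level1, 2, "March meeting")

print(sorted(level1))    # prints: ['for', 'invoice', 'march', 'meeting']
print(level1["march"])   # prints: [1, 2]
```

Each record maps a keyword to the mail messages that contain it, which is what makes the index "inverted" relative to the mail-to-text direction.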

Description

CROSS REFERENCE TO OTHER APPLICATIONS

[0001] This application claims priority to People's Republic of China Patent Application No. 201210357269.6 entitled METHOD AND SYSTEM FOR ESTABLISHING MAIL INDICES AND METHOD AND SYSTEM FOR SEARCHING MAIL, filed Sep. 21, 2012, which is incorporated herein by reference for all purposes.

FIELD OF THE INVENTION

[0002] The present application relates to the field of network data processing. In particular, it relates to a method and system for establishing mail indices to perform mail searches.

BACKGROUND OF THE INVENTION

[0003] As Internet communications become increasingly widespread, and with more and more users communicating by mail (specifically, electronic mail or email), mailbox searches have become an important search technique among data searches. Mailbox searches are typically based on mailbox indices. That is, all of a user's mail will typically be searched using a mailbox index.

[0004] One existing method for establishing mail indices is as follows...


Application Information

Patent Type & Authority: Patents (United States)
IPC: IPC(8): G06F17/30
CPC: G06F17/30424, G06F17/30622, G06F17/30631
Inventor: SHE, ZHIYONG
Owner: ALIBABA GRP HLDG LTD