Eureka AIR delivers breakthrough ideas for toughest innovation challenges, trusted by R&D personnel around the world.

Segmenting postings list reader

Inactive Publication Date: 2011-02-17
GLOBALSPEC
View PDF52 Cites 18 Cited by
  • Summary
  • Abstract
  • Description
  • Claims
  • Application Information

AI Technical Summary

Benefits of technology

[0019]The present invention satisfies the above-noted need by providing a posting list reader that reads a posting list efficiently during inverted index searching by reducing the number of accesses to secondary storage as compared to a traditional buffered reading strategy that repeatedly uses a uniform input buffer size.
[0021]In accordance with the above, it is an object of the present invention to provide a segmenting posting list reader that can determine how many postings to read on each read request.

Problems solved by technology

A large inverted index may not fit into a computer's main memory, requiring secondary storage, typically disk storage, to help store the posting file, lexicon, or both.
Each separate access to disk may incur seek time on the order of several milliseconds if it is necessary to move the hard drive's read heads, which is very expensive in terms of runtime performance compared to accessing main memory.

Method used

the structure of the environmentally friendly knitted fabric provided by the present invention; figure 2 Flow chart of the yarn wrapping machine for environmentally friendly knitted fabrics and storage devices; image 3 Is the parameter map of the yarn covering machine
View more

Image

Smart Image Click on the blue labels to locate them in the text.
Viewing Examples
Smart Image
  • Segmenting postings list reader
  • Segmenting postings list reader
  • Segmenting postings list reader

Examples

Experimental program
Comparison scheme
Effect test

case 1

[0105]To implement makePreciseBufferFillSizeSelector, there are two cases to consider, where numBytesToRead is the input to makePreciseBufferFillSizeSelector, and maxBufferSize is the largest read system call that can be issued in bytes:[0106]Case 1: maxBufferSize>=numBytesToRead[0107]Case 2: maxBufferSize

[0108]A discussion of these cases follows.

Case 1: maxBufferSize>=numBytesToRead

[0109]Build a one-stage predetermined buffer fill size strategy as indicated below in Table I.

TABLE IStageFill SizeNumber of Times to Use1numBytesToRead1

[0110]The above strategy, when installed in an enhanced buffered reader, will read exactly numBytesToRead bytes of data using a single system call.

case 2

Case 2: maxBufferSize

[0111]In this case, build a predetermined buffer fill size strategy that generally has two stages, as indicated in Table II. However, the second stage is not necessary when the maxBufferSize evenly divides numBytesToRead.

TABLE IIStageFill SizeNumber of Times to Use1maxBufferSizenumBytesToRead / maxBufferSize2numBytesToRead %1maxBufferSize

[0112]The above strategy, when installed in an enhanced buffered reader, will read exactly numBytesToRead bytes of data with the minimum possible number of read system calls.

case 3

[0113]To implement makeApproximateBufferFillSizeSelector, there are two cases to consider, where approximateNumBytesToRead is input to makeApproximateBufferFillSizeSelector, and maxBufferSize is the largest read system call that can be issued in bytes:[0114]Case 3: maxBufferSize>=approximateNumBytesToRead[0115]Case 4: maxBufferSize<approximateNumBytesToRead

Also, recall that a supplementalReadSize is provided as input to makeApproximateBufferFillSizeSelector. A discussion of these cases follows.

Case 3: maxBufferSize>=approximateNumBytesToRead

[0116]Build a two-stage predetermined buffer fill size strategy as indicated below in Table III.

TABLE IIIStageFill SizeNumber of Times to Use1approximateNumBytesToRead12supplementalReadSizeRepeat as necessary

[0117]The above strategy, when installed in an enhanced buffered reader, will read approximateNumBytesToRead bytes of data using a single read system call and thereafter will perform as many additional system calls of the supplemental read si...

the structure of the environmentally friendly knitted fabric provided by the present invention; figure 2 Flow chart of the yarn wrapping machine for environmentally friendly knitted fabrics and storage devices; image 3 Is the parameter map of the yarn covering machine
Login to View More

PUM

No PUM Login to View More

Abstract

A size of a posting list is determined as part of searching an inverted index. The posting list is segmented for reading into a plurality of segments based on the size. For example, the segmenting may be performed if the size is larger than a predetermined size. Finally, each of the plurality of segments is read into memory.

Description

CROSS-REFERENCE TO RELATED APPLICATIONS[0001]This application claims priority under 35 U.S.C. §119 to the following U.S. Provisional Applications, which are herein incorporated by reference in their entirety:[0002]Provisional Patent Application Ser. No. 61 / 233,411, by Flatland et al., entitled “ESTIMATION OF POSTINGS LIST LENGTH IN A SEARCH SYSTEM USING AN APPROXIMATION TABLE,” filed on Aug. 12, 2009;[0003]Provisional Patent Application No. 61 / 233,420, by Flatland et al., entitled “EFFICIENT BUFFERED READING WITH A PLUG IN FOR INPUT BUFFER SIZE DETERMINATION,” filed on Aug. 12, 2009; and[0004]Provisional Patent Application Ser. No. 61 / 233,427, by Flatland et al., entitled “SEGMENTING POSTINGS LIST READER,” filed on Aug. 12, 2009.[0005]This application contains subject matter which is related to the subject matter of the following applications, each of which is assigned to the same assignee as this application and filed on the same day as this application. Each of the below listed ap...

Claims

the structure of the environmentally friendly knitted fabric provided by the present invention; figure 2 Flow chart of the yarn wrapping machine for environmentally friendly knitted fabrics and storage devices; image 3 Is the parameter map of the yarn covering machine
Login to View More

Application Information

Patent Timeline
no application Login to View More
IPC IPC(8): G06F17/30
CPCG06F17/30622G06F16/319
Inventor FLATLAND, STEINARDALTON, JEFF J.
Owner GLOBALSPEC
Who we serve
  • R&D Engineer
  • R&D Manager
  • IP Professional
Why Eureka
  • Industry Leading Data Capabilities
  • Powerful AI technology
  • Patent DNA Extraction
Social media
Eureka Blog
Learn More
PatSnap group products