The invention discloses a suffix array indexing method for a real-time data stream. The method comprises the following steps: a server receives the real-time data stream, extracts source data, and preprocesses the source data into documents;
each document is parsed and distributed by domain, and each domain receives its source data and starts an independent thread to index and store the data;
each domain consists of a plurality of segments; after receiving the source data, the domain object writes the source data directly into a segment, sets the segment source data update signal, and returns a response; once all domains of the document have returned a response, response information is returned to the client; a suffix array construction tool listens for the segment source data update signal in the background, automatically constructs the suffix array for the segment source data, and generates the segment suffix array; the segment source data, the segment suffix array, and the segment information are linked into a complete suffix array index, at which point the source data has been indexed successfully. The invention can index heterogeneous data in real time without word segmentation, and generates the index asynchronously to shorten response time. The invention is applicable to the field of data indexing.
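
The write-then-signal flow described above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the names `Segment`, `run_builder`, `build_suffix_array`, and `search` are hypothetical, the update signal is modeled with a `threading.Event`, and the suffix array is built by naive sorting rather than by any particular construction algorithm. The sketch assumes an append-only segment held in memory.

```python
import threading


def build_suffix_array(text):
    """Naive construction: sort suffix start offsets by the suffix text."""
    return sorted(range(len(text)), key=lambda i: text[i:])


class Segment:
    """Hypothetical segment: source data, an update signal, and the linked index."""

    def __init__(self):
        self.source = ""
        self.index = None                   # source + suffix array + info, once built
        self.lock = threading.Lock()
        self.updated = threading.Event()    # models the "source data update signal"
        self.indexed = threading.Event()    # set when the index matches the source

    def write(self, text):
        """Append source data, raise the update signal, and respond immediately."""
        with self.lock:
            self.source += text
            self.indexed.clear()
        self.updated.set()
        return "ok"                         # response does not wait for indexing


def run_builder(segment, stop):
    """Background construction tool: rebuild the index whenever the signal is set."""
    while not stop.is_set():
        if not segment.updated.wait(timeout=0.05):
            continue
        segment.updated.clear()
        with segment.lock:
            snapshot = segment.source
        suffix_array = build_suffix_array(snapshot)
        with segment.lock:
            # Link source data, suffix array, and segment information together.
            segment.index = {
                "source": snapshot,
                "suffix_array": suffix_array,
                "info": {"length": len(snapshot)},
            }
            if snapshot == segment.source:  # no write arrived during the build
                segment.indexed.set()


def search(segment, pattern):
    """Return sorted start offsets of pattern, by binary search on the suffix array."""
    with segment.lock:
        index = segment.index
    if index is None or not pattern:
        return []
    text, sa = index["source"], index["suffix_array"]

    def bound(upper):
        lo, hi = 0, len(sa)
        while lo < hi:
            mid = (lo + hi) // 2
            prefix = text[sa[mid]:sa[mid] + len(pattern)]
            if prefix < pattern or (upper and prefix == pattern):
                lo = mid + 1
            else:
                hi = mid
        return lo

    return sorted(sa[bound(False):bound(True)])
```

In this sketch `write` returns as soon as the data is stored and the signal is set, which corresponds to the asynchronous response in the abstract; a caller that needs the index can wait on `indexed`. Because `search` matches raw substrings against the stored snapshot, no word segmentation is involved.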