41 results about "Multi-document summarization" patented technology

Multi-document summarization is an automatic procedure that extracts information from multiple texts written about the same topic. The resulting summary lets users, such as professional information consumers, quickly familiarize themselves with the information contained in a large cluster of documents. In this way, multi-document summarization systems complement news aggregators by taking the next step in coping with information overload.

Document summarization

Systems, methods, and other embodiments associated with automatically summarizing a document are described. One method embodiment includes computing term scores for members of a set of terms in a document to be summarized and computing sentence scores for sentences in a set of sentences in the document. The method embodiment also includes computing a set of entries for a term-sentence matrix that relates terms to sentences. The method embodiment also includes computing a dominant topic for the document and simultaneously ranking the set of terms and the set of sentences based on the dominant topic. The method embodiment provides a summarization item(s) selected from the set of terms and / or the set of sentences.
Owner:ORACLE INT CORP
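
The Document summarization abstract above does not say how the dominant topic is computed. One common reading, assumed here, is that the dominant topic corresponds to the leading singular vectors of the term-sentence matrix, which can be approximated by a mutual-reinforcement power iteration. The sketch below follows that assumption; the function names and the raw-count weighting are illustrative choices, not taken from the patent.

```python
import math
import re


def tokenize(text):
    """Lowercase a sentence and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def normalize(vector):
    """Scale a vector to unit Euclidean length (leave all-zero vectors as-is)."""
    norm = math.sqrt(sum(x * x for x in vector)) or 1.0
    return [x / norm for x in vector]


def rank_terms_and_sentences(sentences, iterations=50):
    """Rank terms and sentences together against one dominant topic.

    Assumption: the dominant topic is approximated by the leading singular
    vectors of the term-sentence count matrix, found by power iteration.
    """
    docs = [tokenize(s) for s in sentences]
    vocab = sorted({t for d in docs for t in d})
    row = {t: i for i, t in enumerate(vocab)}

    # Term-sentence matrix: rows are terms, columns are sentences.
    matrix = [[0.0] * len(sentences) for _ in vocab]
    for j, doc in enumerate(docs):
        for term in doc:
            matrix[row[term]][j] += 1.0

    term_scores = [1.0] * len(vocab)
    sent_scores = [1.0] * len(sentences)
    for _ in range(iterations):
        # A sentence scores highly when it contains high-scoring terms ...
        sent_scores = normalize([
            sum(matrix[i][j] * term_scores[i] for i in range(len(vocab)))
            for j in range(len(sentences))
        ])
        # ... and a term scores highly when it occurs in high-scoring sentences.
        term_scores = normalize([
            sum(matrix[i][j] * sent_scores[j] for j in range(len(sentences)))
            for i in range(len(vocab))
        ])
    return dict(zip(vocab, term_scores)), sent_scores
```

Summarization items could then be taken from the top of either ranking. A real system would likely weight terms (for example with TF-IDF) and drop stop words before building the matrix.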

Method and device for generating document summarization

Active | CN104503958A | Reduce build time | Improve the efficiency of generating summaries | Special data processing applications | Document summarization | Generation process
The invention provides a method and a device for generating a document summarization. The method comprises the following steps: obtaining a document; processing the document using preset characteristics to obtain summarization candidate sentences, wherein the preset characteristics comprise one or more of keywords, numbers, and sentences and subtitles lying within a preset range of distance from a title in the document; compressing the summarization candidate sentences; and post-processing the compressed candidate sentences to generate the document summarization. The summarization generated by the method and the device is concise and accurate and contains no redundant information. The generation process is simple and needs no manual intervention, so the time for generating the document summarization is greatly shortened and generation efficiency is improved.
Owner:BEIJING BAIDU NETCOM SCI & TECH CO LTD

System, method, and user interface for a search engine based on multi-document summarization

A method for searching multiple documents on a computer system includes steps for sending a query to a system core where the query is passed to a search component for searching the documents. The system core in turn receives results from the search component indicating related documents to the query and passes to a summarization component a specified number of the results. The summarization component processes related documents corresponding to the specified number of results to produce a multi-document summary. The system core receives the summary from the summarization component. The multi-document summary is received from the system core.
Owner:SOUBBOTIN DMITRI
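
The message flow in the search-engine abstract above (query to system core, core to search component, top results to summarization component, summary back through the core) can be sketched as follows. All class and method names are hypothetical, and both components are deliberately naive placeholders (word-overlap search, lead-sentence summary); the patent abstract does not specify either algorithm.

```python
import re


def words(text):
    """Return the set of lowercase alphabetic tokens in a text."""
    return set(re.findall(r"[a-z]+", text.lower()))


class SearchComponent:
    """Placeholder search: rank documents by word overlap with the query."""

    def __init__(self, documents):
        self.documents = documents

    def search(self, query):
        query_words = words(query)
        scored = [(len(query_words & words(doc)), i)
                  for i, doc in enumerate(self.documents)]
        # Keep documents sharing at least one word; best match first.
        hits = sorted((s for s in scored if s[0] > 0),
                      key=lambda s: (-s[0], s[1]))
        return [self.documents[i] for _, i in hits]


class SummarizationComponent:
    """Placeholder summarizer: join the first sentence of each document."""

    def summarize(self, documents):
        leads = [re.split(r"(?<=[.!?])\s+", doc.strip())[0]
                 for doc in documents]
        return " ".join(leads)


class SystemCore:
    """Routes a query to search, then a fixed number of results to summarization."""

    def __init__(self, search, summarizer, top_n=2):
        self.search = search
        self.summarizer = summarizer
        self.top_n = top_n

    def handle_query(self, query):
        results = self.search.search(query)
        return self.summarizer.summarize(results[:self.top_n])
```

For example, `SystemCore(SearchComponent(docs), SummarizationComponent()).handle_query("solar power")` returns one summary string built from the two best-matching documents in `docs`.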

Method and device generating multi-file summary

An embodiment of the invention discloses a method and a device for generating a multi-file summary, so that the generated summary achieves high coverage of the important information in the files while reducing redundancy. The method comprises the following steps: decomposing the sentence set of the multiple files into a phrase pool, and obtaining the features and relations of each phrase in the pool; selecting from the phrase pool, according to the features and relations, a phrase set that satisfies a preset constraint condition as the summary phrase set; and combining the selected summary phrase set into summary sentences according to a preset combination mode, thus forming the multi-file summary.
Owner:HUAWEI TECH CO LTD

Method and device for generating multi-document summarization

Active | CN108733682A | Improve performance | Improve measurement capabilities | Special data processing applications | Semantic vector | Verb phrase
The embodiment of the invention discloses a method and a device for generating a multi-document summarization, relates to the field of data processing, and addresses the poor quality of summaries generated by existing automatic multi-document summarization technology. The method comprises: dividing multiple documents into n sentences; generating an input bag-of-words vector; performing unsupervised training on each sentence represented by the input bag-of-words vector to obtain an encoding hidden layer vector and a latent semantic vector for each sentence; collecting m latent semantic vectors; obtaining m decoding hidden layer vectors and m output bag-of-words vectors from the m latent semantic vectors; updating the m decoding hidden layer vectors and the m output bag-of-words vectors; estimating the importance of each sentence; obtaining the importance and redundancy of the verb phrases and of the noun phrases of each sentence; and generating the multi-document summarization according to the importance and redundancy of all noun phrases and all verb phrases. The embodiment is used in the process of generating a multi-document summarization.
Owner:HUAWEI TECH CO LTD

Multiple file summarization method facing subject or inquiry based on cluster arrangement

To overcome the defects of the prior art, the method fully considers both the relations between sentences and the relation between each sentence and the user query, so that the generated abstract contains both the main information of the files and an explanation of the topic or an answer to the query. A difference penalty algorithm is applied to ensure the novelty of the abstract. The invention can meet individual user requirements.
Owner:PEKING UNIV
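
The abstract above does not define its difference penalty algorithm. As a stand-in, the sketch below uses a greedy selection in the style of Maximal Marginal Relevance, a well-known technique that is swapped in here and is not claimed to be the patent's method: each candidate is scored by its relevance to the query minus a penalty for similarity to sentences already chosen. The Jaccard similarity, the `penalty` weight, and all names are assumptions.

```python
import re


def words(text):
    """Return the set of lowercase alphabetic tokens in a text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two word sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_sentences(sentences, query, k=2, penalty=0.7):
    """Greedily pick k sentences that match the query but differ from each other."""
    query_words = words(query)
    bags = [words(s) for s in sentences]
    chosen = []
    while len(chosen) < min(k, len(sentences)):
        best, best_score = None, None
        for i, bag in enumerate(bags):
            if i in chosen:
                continue
            relevance = jaccard(bag, query_words)
            # Penalize overlap with the most similar already-selected sentence.
            redundancy = max((jaccard(bag, bags[j]) for j in chosen),
                             default=0.0)
            score = relevance - penalty * redundancy
            if best_score is None or score > best_score:
                best, best_score = i, score
        chosen.append(best)
    return [sentences[i] for i in chosen]
```

Setting `penalty=0` reduces the selection to pure query relevance, which makes the effect of the novelty term easy to compare.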

Multi-document abstract sentence generating method

Inactive | CN104778157A | Taking into account the amount of information | Take into account the length | Special data processing applications | Feature vector | Natural language processing
The invention discloses a multi-document abstract sentence generating method, which comprises the following steps: S1, taking a sentence feature vector space as input, clustering the sentences according to the similarity of their feature vectors, and recording each resulting cluster as a sub-theme; S2, determining the importance of each sub-theme according to its coverage of the document set and the number of sentences it contains, and ranking the sub-themes by importance; S3, scoring and ranking the sentences under each sub-theme; S4, extracting the highest-scoring sentence in each sub-theme as an abstract sentence, replacing demonstrative pronouns used as subjects in those sentences, ordering the abstract sentences by the importance scores of their sub-themes, and finally generating and outputting the abstract.
Owner:SOUTH CHINA UNIV OF TECH +2
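
Steps S1 to S4 of the abstract above can be sketched roughly as follows. The patent does not name a clustering algorithm, feature vector, or scoring formula, so the choices here are assumptions made for illustration: word sets as features, Jaccard similarity, a greedy single-pass clustering, and centrality within a cluster as the sentence score. The pronoun-replacement part of S4 is omitted.

```python
import re


def words(text):
    """Return the set of lowercase alphabetic tokens in a text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two word sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_sentences(sentences, threshold=0.3):
    """S1: greedy clustering; each cluster of (doc_id, sentence) is a sub-theme."""
    clusters = []
    for doc_id, sentence in sentences:
        for cluster in clusters:
            # Compare against the cluster's first (seed) sentence.
            if jaccard(words(sentence), words(cluster[0][1])) >= threshold:
                cluster.append((doc_id, sentence))
                break
        else:
            clusters.append([(doc_id, sentence)])
    return clusters


def pick_representative(cluster):
    """S3: score sentences by mean similarity to the rest of their cluster."""
    bags = [words(s) for _, s in cluster]

    def centrality(i):
        others = [jaccard(bags[i], bags[j])
                  for j in range(len(bags)) if j != i]
        return sum(others) / len(others) if others else 0.0

    best = max(range(len(cluster)), key=centrality)
    return cluster[best][1]


def summarize(sentences, threshold=0.3, max_sentences=2):
    """S2 and S4: rank sub-themes, then emit one sentence per top sub-theme."""
    clusters = cluster_sentences(sentences, threshold)
    # S2: importance = (documents covered, number of sentences).
    clusters.sort(key=lambda c: (len({d for d, _ in c}), len(c)),
                  reverse=True)
    return [pick_representative(c) for c in clusters[:max_sentences]]
```

The input is a list of `(doc_id, sentence)` pairs so that document-set coverage in S2 can be counted per sub-theme.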

Method for modeling dynamic multi-document abstracts

The invention relates to a method for modeling dynamic multi-document abstracts. It aims to solve the problem that, in traditional multi-document abstract methods, it is difficult to gain a global view of the contents, distribution and associations of the various information aspects under the current subject, so that a large number of abstract fragments come from the same subject and the comprehensiveness of the abstract is seriously affected. The method comprises the following steps: preprocessing a document collection; building a feature extraction module; building an information filtering module; building a sentence weighting module; building an abstract generation module to generate a best abstract; and outputting the best abstract through an output module to finish the modeling of the dynamic multi-document abstract. With this method, the dynamically evolved abstract has relatively high information novelty and captures the evolution of historical information, so that the performance of the dynamic abstract is improved and the resulting abstract is more comprehensive. The method is applied in the field of abstract extraction.
Owner:HARBIN INST OF TECH

Evolutionary summarization generation method for internet news events

The invention relates to an evolutionary summarization generation method for internet news events. The method includes the steps of: inputting a set of related news documents; representing the documents as topic feature vectors using an LDA (latent Dirichlet allocation) topic model; clustering the documents represented as topic feature vectors; calculating local scores of the documents in each topic; calculating global scores of the documents in each topic; calculating final scores of the documents in each topic; extracting high-scoring document titles from each topic, in chronological order, to serve as a summary; and outputting the summary. The dimension of the topic feature vectors is a first preset value, and each cluster represents one topic. The extracted summary evolves dynamically, is coherent and is highly readable. Experimental results indicate that the system is greatly improved in terms of redundancy, coherence and dynamic evolvability compared with a traditional multi-document summarization system.
Owner:SUZHOU UNIV

Method for automatically generating unsupervised science and technology intelligence abstract based on multi-sentence compression

The invention relates to an unsupervised method for automatically generating science and technology intelligence abstracts based on multi-sentence compression, and belongs to the technical field of natural language generation. For multi-document text generation in the field of science and technology intelligence, source data are first acquired by a topic crawler based on an LDA topic similar-word library extension method. All text paragraphs are then ranked by a text information value evaluation model with three indexes: authority, timeliness and content relevance. Paragraphs with higher scores are selected as the original text for generating the final science and technology intelligence. Finally, an unsupervised multi-document abstract method based on spectral clustering and multi-sentence compression is adopted to generate the abstract automatically. The method addresses the high requirements that science and technology intelligence generation places on data timeliness and authority during data screening, and it addresses the problem that traditional neural-network-based multi-document generation methods cannot be applied because the field lacks a data set.
Owner:BEIJING INSTITUTE OF TECHNOLOGY