145 results about "Automatic summarization" patented technology

Automatic summarization is the process of shortening a text document with software, in order to create a summary with the major points of the original document. Technologies that can make a coherent summary take into account variables such as length, writing style and syntax.

Optimizing database queries using query execution plans derived from automatic summary table determining cost based queries

A method, apparatus, and article of manufacture for optimizing database queries using automatic summary tables. Query execution plans derived from an automatic summary table can be used to generate results for the query if a comparison of the query requirements with an automatic summary table definition determines that the automatic summary table overlaps the query, and if an optimization process determines that using the summary table will lower the cost of the query. The optimization process involves enumerating a plurality of query execution plans for the query, wherein the query execution plans enumerated include those that access combinations of query and summary tables. Each such query execution plan is assigned a cost representing an estimation of its execution characteristics, and the least costly query execution plan is selected for the query.
Owner:GOOGLE LLC
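
The selection step in the entry above (enumerate candidate plans, assign each a cost, keep the cheapest) can be illustrated with a minimal sketch. The `Plan` structure, the table names and the cost figures are hypothetical and are not taken from the patent; a real optimizer would derive its costs from statistics rather than hard-coded numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """A candidate query execution plan (hypothetical structure)."""
    tables: tuple   # base tables and/or summary tables the plan reads
    cost: float     # estimated execution cost (made-up units)


def choose_plan(plans):
    """Return the enumerated plan with the lowest estimated cost."""
    if not plans:
        raise ValueError("no plans enumerated")
    return min(plans, key=lambda p: p.cost)


# Enumerated plans: base table only, summary table only, and a combination.
plans = [
    Plan(tables=("sales",), cost=120.0),
    Plan(tables=("sales_by_month_ast",), cost=15.0),
    Plan(tables=("sales", "sales_by_month_ast"), cost=40.0),
]
best = choose_plan(plans)
print(best.tables)
```

Under these assumed costs the summary-table plan wins; if the summary table did not lower the cost, the base-table plan would be selected instead.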

Method for extracting and processing network information and its system

The invention relates to a method for extracting and processing network information. Using artificial intelligence and natural language processing techniques, it automatically downloads the latest daily news and information from designated websites, performs content extraction, classification, automatic abstracting and full-text condensing, then stores the full text and indexes it for efficient full-text retrieval later.
Owner:陈文中

Automatic microblog text abstracting method based on unsupervised key bigram extraction

The invention discloses an automatic microblog text abstracting method based on unsupervised key bigram extraction. The method comprises the steps of preprocessing a microblog; standardizing bigrams; extracting key bigrams based on a hybrid of TF-IDF (term frequency-inverse document frequency), TextRank and LDA (latent Dirichlet allocation); ranking sentences based on intersection similarity and a mutual information strategy; extracting abstract sentences based on a similarity threshold; and generating the abstract by combining the abstract sentences. The method uses the bigram as the minimum vocabulary unit. Because a bigram carries richer text information than a single word, sentence ranking based on key bigrams is more robust to noise and more accurate than ranking based on keyword extraction. When abstract sentences are extracted, the similarity threshold is introduced to control redundancy, so that the abstract achieves a higher recall rate. The abstract generated by the method is accurate, concise and comprehensive; it improves the efficiency and quality with which a user acquires knowledge and saves the user's time.
Owner:INST OF AUTOMATION CHINESE ACAD OF SCI
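
The redundancy control described in the entry above can be sketched as a greedy pass over already-ranked sentences: a sentence is kept only if its bigram overlap with every previously kept sentence stays below a threshold. This is an illustration under stated assumptions, not the patented method. Jaccard similarity over whitespace-token bigrams stands in for the patent's intersection similarity, and the function names, the threshold of 0.5 and the sample sentences are hypothetical.

```python
def bigrams(text):
    """Set of adjacent word pairs in a whitespace-tokenized sentence."""
    tokens = text.lower().split()
    return set(zip(tokens, tokens[1:]))


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def select_sentences(ranked, threshold=0.5, limit=3):
    """Walk sentences in rank order, skipping near-duplicates of kept ones."""
    chosen = []
    for sentence in ranked:
        grams = bigrams(sentence)
        if all(jaccard(grams, bigrams(kept)) < threshold for kept in chosen):
            chosen.append(sentence)
        if len(chosen) == limit:
            break
    return chosen


# Sentences are assumed to be sorted by importance already.
ranked = [
    "the new phone has a great camera",
    "the new phone has a great screen",
    "battery life is short",
]
summary = select_sentences(ranked, threshold=0.5)
print(summary)
```

The second sentence shares five of seven distinct bigrams with the first, so it is dropped as redundant, while the third sentence shares none and is kept.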

Annotating programs for automatic summary generations

Audio / video programming content is made available to a receiver from a content provider, and meta data is made available to the receiver from a meta data provider. The meta data corresponds to the programming content, and identifies, for each of multiple portions of the programming content, an indicator of a likelihood that the portion is an exciting portion of the content. In one implementation, the meta data includes probabilities that segments of a baseball program are exciting, and is generated by analyzing the audio data of the baseball program for both excited speech and baseball hits. The meta data can then be used to generate a summary for the baseball program.
Owner:MICROSOFT TECH LICENSING LLC
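
As a rough illustration of how per-segment excitement probabilities could drive summary generation, the sketch below keeps the most probable segments and replays them in time order. The tuple layout `(start_seconds, end_seconds, probability)`, the function name and the sample values are assumptions made for this example; the entry above does not specify a meta data format.

```python
def summarize_segments(segments, k=2):
    """Pick the k segments most likely to be exciting, in playback order."""
    # Each segment is (start_seconds, end_seconds, probability).
    top = sorted(segments, key=lambda s: s[2], reverse=True)[:k]
    return sorted(top, key=lambda s: s[0])


# Hypothetical meta data for four 30-second segments of a program.
segments = [
    (0, 30, 0.1),
    (30, 60, 0.8),
    (60, 90, 0.4),
    (90, 120, 0.9),
]
highlights = summarize_segments(segments, k=2)
print(highlights)
```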

Funnel type data gathering, analyzing and pushing system and method for online public opinion

Inactive; CN104408157A; Implement topic tracking analysis; Realize network public opinion monitoring; Data processing applications; Web data indexing; Process module; The Internet
The invention discloses a funnel type data gathering, analyzing and pushing system and method for online public opinion. The system comprises an online public opinion gathering module, an online public opinion processing module and an online public opinion publishing module. These modules comprise a directed precise gathering sub-module, a non-directed gathering sub-module, a hot spot and sensitive topic identifying sub-module, a topic tracking sub-module, an automatic abstracting sub-module, a comprehensive analysis sub-module, a public opinion pre-warning sub-module and a multi-dimensional public opinion information display sub-module. The method uses a dedicated public opinion funnel algorithm and lexicons of three types of keywords (keywords related to the user, public opinion keywords, and positive and negative keywords) to analyze, judge and classify the gathered data and to give early warnings, so as to capture latent patterns of change. The system and method reduce the burden of manually polling public opinion events, track the development trend of public opinion events promptly and precisely, identify the latest, hottest and most sensitive topics on the Internet in the recent period, and detect the public opinion messages the user is concerned about and give an early warning as soon as possible.
Owner:SICHUAN ESLITE ELECTRONICS COMMERCE CO LTD

Automatic standardized filing method based on text semantic mining

The invention relates to an automatic standardized filing method based on text semantic mining. The method comprises the steps of crawling files from a website; carrying out information extraction, keyword extraction and automatic abstract generation on the crawled files and on local files by using text semantics; and finally storing the results in an information system. For information extraction, a rule set is established using a knowledge engineering method, and information points are automatically extracted from the file to form structured data. For keyword extraction, keywords are extracted according to the position and semantics of each word in the text to generate a keyword index. For automatic abstract generation, the content the abstract should contain is first specified, the corresponding information is extracted from the text, the similarity of sentences is calculated, and the passages containing the key information in the original file are extracted. With this method, business personnel do not need to read a large number of files, time and labor are saved, and querying and application become more convenient.
Owner:MERIT DATA CO LTD

Computer user interface for audio and/or video auto-summarization

A relativity controller is a scroll bar / window combination that provides a way to see data in relation to both the context of its wholeness and the salience of its contents. To accomplish this, the linear density or other appearance of the scroll bar (acting as a ruler or scale) varies with the density of the document salience (as indicated by different kinds of annotations or marks). It also provides a way to zoom between perspectives. This is usable on many different data types: including sound, video, graphics, calendars and word processors.
Owner:MONKEYMEDIA

Rule-based method for patent abstract automatic extraction and keyword indexing

The invention provides a rule-based method for patent abstract automatic extraction and keyword indexing, which mainly comprises the steps of: automatically marking key words, such as characteristic and technical words and phrases, in the full text of the patent literature according to a background knowledge base; determining the functions and mutual relationships of paragraphs in the article according to the types, number of occurrences, positional relations and the like of the characteristic words and phrases in the paragraphs; extracting the key paragraphs to form the extract; and finally, extracting key words from the extract to form the index items of the literature. The method consists of five modules: a knowledge base module, a characteristic word marking module, a paragraph analysis and evaluation module, an extract automatic writing module and an indexing module. The method can markedly improve the efficiency of the deep processing of patent data and reduce the cost of data processing, and the indexing result has a high retrieval value.
Owner:北京中献电子技术开发有限公司

Short text automatic abstracting method and system based on double encoders

The invention discloses a short text automatic abstracting method and system based on double encoders and belongs to the technical field of information processing. The method comprises the following steps: 1, preprocessing the data; 2, designing a double encoder with a bidirectional recurrent neural network; 3, adding an attention mechanism that fuses global and local semantics; 4, adding a decoder with empirical probability distribution, the decoder being designed as a double-layer unidirectional neural network; 5, adding word embedding features; 6, optimizing the word embedding dimensions; and 7, preprocessing and testing news corpus data from the Sogou laboratory, feeding the data into a Seq2Seq model with double encoders and accompanying empirical probability distribution for calculation, and carrying out experimental evaluation with the text abstract quality evaluation system Rouge. The invention optimizes the traditional encoding and decoding framework so that the model can fully understand text semantics, and the fluency and precision of text abstracts are improved.
Owner:CIVIL AVIATION UNIV OF CHINA

Solution data edit processing apparatus and method, and automatic summarization processing apparatus and method

The present invention achieves edit processing which enables a user to freely edit solution data that becomes supervised data when processing automatic summarization using machine learning, and achieves summarization processing specialized for the user using the solution data. A user's evaluation of the summary produced by automatic processing of a text is created, and case data using the text and the summary as a problem and the input evaluation as a solution is stored. A solution and feature-set pair is extracted from the stored cases, and learning result of what solution is apt to be produced at what feature is stored. Thereafter a summary-candidate is generated from the processing target text, a feature-set is extracted from the text and summary-candidate, a summary-candidate and estimated-solution pair is generated by estimating a feature-set referring to the stored learning result, and the summary-candidate pair is used as a summary.
Owner:NAT INST OF INFORMATION & COMM TECH

Machine learning-based Chinese automatic summarization method

The invention provides a machine learning-based Chinese automatic summarization method which comprises the following steps: inputting a text, and preprocessing the text; performing text structure division on the preprocessed text information, dividing the preprocessed text into a plurality of semantic paragraphs representing different themes, and calculating the importance degrees of the semantic paragraphs and the importance degrees of paragraphs; performing concept acquisition on the preprocessed text, converting all word expressions in the text into concept expressions, and calculating the importance degree of a concept, the frequency of the concept and the position of the concept; calculating the importance degrees of sentences according to structure information acquired by text division, the frequency of the concept, the position of the concept, the importance degree of the paragraphs and the importance degree of the semantic paragraphs; extracting the sentences with the importance degrees greater than preset values from all the semantic paragraphs; and arranging the sentences with the importance degrees greater than the preset values according to the original order, and outputting the sentences as a summarization result. The machine learning-based Chinese automatic summarization method can automatically generate a summary of a Chinese text.
Owner:BEIJING DINGTAI ZHIYUAN TECH CO LTD
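
The final two steps of the entry above (score each sentence, then output the sentences above a preset value in their original order) can be sketched as follows. The weighted sum, the weights, the three feature values per sentence and the preset of 0.5 are all hypothetical; the entry lists the inputs to the importance calculation but does not give the formula.

```python
def sentence_score(concept_freq, position_weight, paragraph_weight,
                   weights=(0.5, 0.3, 0.2)):
    """Combine per-sentence features into one importance score.

    A plain weighted sum with made-up weights, used only for illustration.
    """
    return (weights[0] * concept_freq
            + weights[1] * position_weight
            + weights[2] * paragraph_weight)


def extract_summary(sentences, features, preset=0.5):
    """Keep sentences scoring above the preset value, in original order."""
    return [s for s, f in zip(sentences, features)
            if sentence_score(*f) > preset]


sentences = ["Sentence A.", "Sentence B.", "Sentence C."]
# (concept frequency, position weight, paragraph importance), all in [0, 1].
features = [(0.9, 1.0, 0.8), (0.1, 0.2, 0.3), (0.7, 0.5, 0.9)]
summary = extract_summary(sentences, features, preset=0.5)
print(summary)
```

Because the list comprehension walks the sentences in document order, the output preserves the original ordering without a separate sorting step.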

Method and system for automatically abstracting pictures and texts from commodity-related network article

The invention provides a method and a system for automatically abstracting pictures and texts from a commodity-related network article. The method comprises the following steps of searching network articles on an Internet; screening the network article related with the commodity with particular topic from the searched network articles, extracting the corresponding commodity name, correlating the screened network article and the corresponding commodity name, and storing into a database of the commodity with particular topic; respectively obtaining pictures embedded into all network articles related with each commodity from the database of the commodity with particular topic, respectively screening the representative picture of each commodity from the pictures related with each commodity, and storing the representative picture of each commodity into the database of the commodity with particular topic. The system for automatically abstracting the pictures and texts has the advantages that by adopting the automatic abstracting technique, the different information sources are summarized, the commodity information, such as representative pictures and comment abstracts, of the commodity are provided, the intuitional data is provided for a user, and the query of the user is convenient.
Owner:FU TAI HUA IND SHENZHEN +1

Semantics-based sci-tech information processing method and system

The present invention discloses a semantics-based sci-tech information processing method and system, and belongs to the technical field of data processing. The method comprises the following steps: acquiring network data; according to a Chinese-English bilingual parallel corpus, translating the network data into Chinese / English by means of a decoding algorithm; generating an abstract according to the translated network data; performing classification according to the abstract, and generating a class tag; and storing the translated network data, the abstract and the class tag into a full-text retrieving database. According to the method and system disclosed by the present invention, by using technologies such as automatic search of sci-tech information, automatic abstracting of the sci-tech information and automatic classification of texts, sci-tech information related to scientific development, technical innovation and recent news can be automatically acquired by means of a public information channel from the Internet, so that acquisition accuracy is improved, the cross-language content understanding barrier is eliminated, the problem of information overload is solved, and the efficiency of reading and understanding information of the user is increased.
Owner:THE 28TH RES INST OF CHINA ELECTRONICS TECH GROUP CORP

Public opinion monitoring method and device, and computer readable storage medium

Inactive; CN107330613A; Improve quality; Reduce manual review workload; Web data indexing; Resources; Data source; Workload
The invention provides a public opinion monitoring method and device, and a computer readable storage medium. A data source is acquired through an acquisition layer, then the page analysis, Chinese word segmentation, positive and negative recognition, keyword extraction, automatic classification, automatic abstracting, data cleaning and other processing are performed by an analysis layer, and afterwards, the score results are displayed by a presentation layer. Through the positive and negative semantic recognition of the content of online reviews, combined with a clinic's own basic information, a score is given to the clinic, at the same time, the on-line audit aids are completed, the manual audit workload can be reduced to a certain extent, and at the same time the improvement of the company on-line clinic quality is facilitated.
Owner:PING AN TECH (SHENZHEN) CO LTD

Method for extracting content of text based on HTML characteristics

A method for extracting text content based on HTML features includes using tags to decompose the input HTML webpage into multiple modules; continuing to decompose each resulting module for as long as it can be decomposed further without a table occurring; assigning each module a position score according to its position in the layout; and calculating the link character length of each module and the text length within the hyperlinks of each module, so as to obtain an integrated score for each module according to a formula.
Owner:上海新纳广告传媒有限公司

Text processing and search system based on big data platform

The invention discloses a text processing and search system based on a big data platform. The system comprises a Hadoop-based text processing portion and a Hadoop-based distributed search portion. The text processing portion comprises a text extraction module and the like; the distributed search portion comprises a semantic annotation module and a search module based on distributed memory sharing. The system can process text data in different formats and encodings. It performs comprehensive text processing operations such as content extraction, word segmentation, index building, entity identification, keyword extraction, automatic abstracting, text clustering and automatic classification, so as to fully explore the information and value contained in the text data. Text processing results can be published via a service interface, which improves the interactivity and extensibility of the system, and a full-text search technology based on distributed memory sharing improves full-text search efficiency after the text is processed.
Owner:NO 32 RES INST OF CHINA ELECTRONICS TECH GRP

Automatic text abstracting method based on pre-trained language model

Pending; CN111723547A; Solve the shortcomings of not being able to obtain long-distance information; Overcome the shortcoming of insufficient long-distance access to information; Semantic analysis; Neural architectures; Theoretical computer science; Engineering
The invention relates to an automatic text abstracting method based on a pre-trained language model and belongs to the technical field of natural language processing. The method comprises the following steps: encoding the source text information by using the pre-trained language model BERT; and then automatically generating an abstract for the source text through an LSTM combined with an attention mechanism. In the automatic abstracting task for Chinese text, the generated Chinese abstract achieves good readability and high quality, and the model trains quickly. Because the pre-trained language model serves as the encoder, an abstract of relatively high quality can be generated even when little training data is available.
Owner:HOHAI UNIV

Mobile terminal task assessment method and system based on Internet of Things

The invention discloses a mobile terminal task assessment method and system based on the Internet of Things. In the method, a task is submitted at a mobile terminal after it is completed; the task information enters a background management unit to await evaluation; points or virtual gold coins are calculated according to a standard; and, once the task assessment is finished, the result waits for final approval by a leader. If the leader approves, the assessment information is sent to an assessment display module and a middle-level performance score management module for processing; if the leader does not approve, a standard is chosen again and the assessment is resubmitted. The assessment display module and the middle-level performance score management module automatically summarize the statistics and display the results in an assessment report and a middle-level performance report, and the points and virtual gold coins are sent to the personal accounts of the workers or users who completed the task. The method and system combine mobile Internet application technology with enterprise performance assessment management and realize real-time, objective, data-oriented and information-based assessment of work tasks on the mobile terminal.
Owner:广州时刻销销网络科技有限公司

Method for automatically abstracting Blog on basis of feature information

The invention discloses a method for automatically abstracting a Blog on the basis of feature information. The method includes steps of scoring sentences on the basis of the feature information; scoring attention of comments on the basis of latent semantics; and checking and merging abstract to obtain an abstract sentence set. The method has the advantages that the feature information of the Blog is sufficiently utilized, and focus of the attention in the comments is fused on the basis of the latent semantics, so that the reader-friendly abstract can be generated, and theme coverage and information redundancy are balanced by a process for checking the abstract; the problem of synonymous noise among comments and a text is solved by the aid of the relevance of the latent semantics; and the abstract generated by the method is reader-friendly and is high in accuracy.
Owner:SUZHOU UNIV

Graph model text abstract generation method based on word frequency and semantics

The invention discloses a graph model text abstract generation method based on word frequency and semantics. The method comprises the following steps: 1) performing word segmentation on the sentences in a text and performing part-of-speech tagging; 2) filtering the lexical items and retaining only those with specific parts of speech; 3) training word vectors by using a Word2Vec model and a BM25 algorithm to form a feature word vector set, then representing sentences and constructing a sentence-word text matrix; 4) constructing an undirected text graph model from the text matrix; and 5) iteratively computing sentence node weights with the TextRank algorithm until convergence, and selecting the top-K sentences to generate the text abstract. Experimental results show that the method is suitable for industrial production. Compared with traditional automatic text abstracting methods that consider a single word frequency feature of a text and methods based on text semantic features, the method obtains a higher Rouge value under the optimal combination of adjustment factors. This shows that the method effectively integrates the word frequency and semantic features of the text, and that the TextRank algorithm based on contextual information then improves the accuracy of abstract generation.
Owner:LIAONING UNIVERSITY
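
Steps 4 and 5 of the entry above (build an undirected sentence graph, iterate TextRank weights to convergence, take the top-K sentences) can be sketched in plain Python. This is a simplified illustration, not the patented pipeline: word-overlap (Jaccard) similarity replaces the Word2Vec and BM25 sentence representation, and the damping factor of 0.85, the tolerance and the sample sentences are assumptions.

```python
def similarity(a, b):
    """Word-overlap similarity, a stand-in for the patent's vector model."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def textrank(sentences, d=0.85, tol=1e-6, max_iter=100):
    """Iterate weighted TextRank scores on an undirected sentence graph."""
    n = len(sentences)
    if n == 0:
        return []
    # Symmetric edge weights; no self-loops.
    w = [[similarity(sentences[i], sentences[j]) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    out = [sum(row) for row in w]  # total edge weight leaving each node
    scores = [1.0] * n
    for _ in range(max_iter):
        new = []
        for i in range(n):
            rank = sum(w[j][i] / out[j] * scores[j]
                       for j in range(n) if out[j] > 0)
            new.append((1 - d) + d * rank)
        converged = max(abs(x - y) for x, y in zip(new, scores)) < tol
        scores = new
        if converged:
            break
    return scores


def top_k(sentences, k=2):
    """Return the k highest-scoring sentences in their original order."""
    scores = textrank(sentences)
    best = sorted(range(len(sentences)),
                  key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(best)]


sentences = [
    "cats chase mice",
    "cats chase birds",
    "dogs chase cats",
    "stocks fell today",
]
print(top_k(sentences, k=2))
```

The last sentence shares no words with the others, so it has no edges and keeps only the baseline score of 1 - d, which keeps it out of the selected sentences.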

Matching and compensation tests for optimizing correlated subqueries within query using automatic summary tables

A method, apparatus, and article of manufacture for optimizing database queries using an automatic summary table. A query is analyzed using matching and compensation tests between the query, at least one correlated subquery within the query, and the automatic summary table to determine whether expressions occurring in the query, but not in the automatic summary table, can be derived using the automatic summary table. If so, the query is rewritten so that the automatic summary table is used.
Owner:GOOGLE LLC

Text structure analysis-based Web document abstract generation method

The invention discloses a text structure analysis-based Web document abstract generation method. The method comprises the steps of taking a URL (uniform resource locator) as input, extracting the main body of the webpage by integrating visual features and text features, partitioning the main body into a plurality of semantic paragraphs, and abstracting each semantic paragraph, so that the generated abstract has a higher coverage rate. The method generates a text abstract of better quality from a webpage, addressing the conditions that webpage structure is complex, the main body is hard to identify, and Chinese automatic abstracting is still at an exploratory stage.
Owner:EAST CHINA NORMAL UNIVERSITY

Data mining-oriented text processing system and method

The present invention provides a data mining-oriented text processing system. The system comprises: a text extraction module, a text segmentation module, an index establishing module, an entity identification module, a keyword extraction module, an automatic summarization module, an automatic classification module and a service interface module. The text segmentation module performs code conversion, conversion between simplified and traditional Chinese, and a part-of-speech tagging operation on a text extracted by the text extraction module. The index establishing module, the entity identification module, the keyword extraction module, the automatic summarization module and the automatic classification module are used for obtaining an index file, an entity word, a keyword, an abstract and a classification result of the text content. The service interface module is used for publishing output results of the index establishing module, the entity identification module, the keyword extraction module, the automatic summarization module and the automatic classification module in the form of a service to other systems for calling. The present invention also provides a data mining-oriented text processing method. The method is capable of providing a more complete text processing capability.
Owner:NO 32 RES INST OF CHINA ELECTRONICS TECH GRP

Data summarization

A database management system provides the capability to perform cluster analysis and provides improved performance in model building and data mining, good integration with the various databases throughout the enterprise, and flexible specification and adjustment of the models being built, but which provides data mining functionality that is accessible to users having limited data mining expertise and which provides reductions in development times and costs for data mining projects. A database management system for in-database clustering comprises a first data table and a second data table, each data table including a plurality of rows of data, means for building a clustering model using the first data table using a portion of the first data table, wherein the portion of the first data table is selected by partitioning, density summarization, or active sampling of the first data table, and means for applying the clustering model using the second data table to generate apply output data.
Owner:ORACLE INT CORP

Generating system of three-dimensional sheet metal welding technology

The invention discloses a generating system of a three-dimensional sheet metal welding technology. The generating system comprises a design technology synergy module, a technological compilation module and a three-dimensional sheet metal welding technology workshop application module, wherein the design technology synergy module is used for achieving a three-dimensional design model based on MBD, and achieving data sharing and data distribution as well as creation of a sheet metal welding working procedure model on a united data source; the technological compilation module is used for achieving structuring of technology elements of the sheet metal welding technology, working procedures, working steps and the like and an automatic summarization to generate two-dimensional technology procedure file; the three-dimensional sheet metal welding technology workshop application module is used for exhibiting the three-dimensional technological structure of the sheet metal welding technology and the two-dimensional technology procedure file. The generating system of the three-dimensional sheet metal welding technology has the advantages that synergy of a design department and a technological compilation department is achieved, so that technological compilation can obtain an accurate and sole data source from a design division, and after formation of structured data, strong convenience for retrieval, retrospect and extraction of design data and technology data is provided.
Owner:BEIJING POWER MACHINERY INST

Multilingual word segmentation method based on dictionaries and grammar analysis

The invention discloses a multilingual word segmentation method based on dictionaries and grammar analysis. Efficient and accurate word segmentation of mixed texts in Chinese, Japanese, Korean, Cantonese and the like can be realized, the lexicon can be flexibly expanded with words for different time periods and different professions, lexicon information is updated effectively, and efficient and accurate multilingual text word segmentation is realized. A word segmentation sub-device for Chinese, Japanese, Korean, Cantonese and other language families, a Chinese quantum word segmentation device and a western language word segmentation device are embedded to realize accurate word segmentation of each language text. A text to be segmented is split by a built-in language segment coded identification mechanism, each resulting text segment corresponds to a language family, and word segmentation is carried out by using the corresponding word segmentation sub-device. The word segmentation of western inflectional languages and the smart mode word segmentation of Chinese, Japanese, Korean and Cantonese can be realized by grammar analysis, and texts containing Arabic numeral information can be processed. The method can also segment texts that mix a plurality of languages, thereby removing the limitation that a word segmentation tool can only handle a single language or a few individual languages, and ensuring the security, accuracy, efficiency, flexibility and universality of text word segmentation. The method has wide application prospects in text word segmentation fields such as large-scale text classification, text information extraction and automatic abstracting.
Owner:BEIJING SCISTOR TECH +1