110 results about "Sentence extraction" patented technology

Sentence extraction is a technique used for automatic summarization of a text. In this shallow approach, statistical heuristics are used to identify the most salient sentences of a text. Sentence extraction is a low-cost approach compared to more knowledge-intensive deeper approaches which require additional knowledge bases such as ontologies or linguistic knowledge. In short "sentence extraction" works as a filter which allows only important sentences to pass.
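
As a rough illustration of the statistical heuristics mentioned above, the sketch below scores each sentence by the average corpus frequency of its words and keeps the top `k`. The function name `extract_sentences`, the naive regex sentence splitter, and the choice of average word frequency as the heuristic are all assumptions made for this example; none of them comes from the listed patents.

```python
import re
from collections import Counter

def extract_sentences(text, k=2):
    """Return the k highest-scoring sentences, kept in original order."""
    # Naive split after ., ! or ? followed by whitespace (an assumption).
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    def tokenize(sentence):
        return re.findall(r"[a-z']+", sentence.lower())

    # Word frequencies over the whole text serve as the salience signal.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average frequency, so long sentences are not favoured automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]
```

Because the function acts purely as a filter, the returned sentences are verbatim copies of the input; nothing is rewritten or generated.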

Graph-based ranking algorithms for text processing

The present invention provides a method of processing at least one natural language text using a graph. The method includes determining a plurality of text units based upon the natural language text, associating the plurality of text units with a plurality of graph nodes, and determining at least one connecting relation between at least two of the plurality of text units. The method also includes associating the at least one connecting relation with at least one graph edge connecting at least two of the plurality of graph nodes and determining a plurality of rankings associated with the plurality of graph nodes based upon the at least one graph edge. The method can also include a graphical visualization of at least one important text unit in a natural language text or collection of texts. Methods for word sense disambiguation, keyword extraction, and sentence extraction are also provided.
Owner:NORTH TEXAS UNIV OF
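
The abstract above describes ranking graph nodes that stand for text units. A minimal sketch of that idea follows, assuming that the connecting relation is the number of shared words and that the ranking is a PageRank-style iteration; the abstract does not fix either choice, and `rank_text_units` is a hypothetical name.

```python
import re

def rank_text_units(units, damping=0.85, iterations=50):
    """Score text units with a PageRank-style iteration over a similarity graph."""
    tokens = [set(re.findall(r"[a-z]+", u.lower())) for u in units]
    n = len(units)
    # Edge weight = number of shared words (an assumed connecting relation).
    weight = [[len(tokens[i] & tokens[j]) if i != j else 0 for j in range(n)]
              for i in range(n)]
    out_sum = [sum(row) for row in weight]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbour j passes on its score in proportion to the edge weight.
            incoming = sum(scores[j] * weight[j][i] / out_sum[j]
                           for j in range(n) if out_sum[j] > 0)
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores
```

For example, with the units `"apple banana"`, `"banana cherry"` and `"cherry date"`, the middle unit shares a word with both neighbours and therefore ends up with the highest score.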

Interactive speech recognition system and method

Active | CN101923854A | Fix recognition errors | Candidate is accurate | Speech recognition | Speech identification | Acoustic model
The invention discloses an interactive speech recognition system which comprises an acoustic model, a language model selection module, a speech and sentence extraction module, a speech recognition module, a word candidate generation and error correction module and an interaction module, wherein the acoustic model and the language model selection module are used for selecting an acoustic model which is the most similar to an object to be recognized in the pronunciation characteristic for the object to be recognized and a language model which is the most similar to the object to be recognized in the field for the whole recognition process according to the information of the object to be recognized; the speech and sentence extraction module is used for segmenting the whole section of a speech signal into a plurality of speeches and sentences, extracting the segmented speeches and sentences and sending to the speech recognition module; the speech recognition module is used for recognizing the speeches and the sentences extracted by the speech sentence extraction module and outputting an intermediate recognition result; the word candidate generation and error correction module is used for processing the intermediate recognition result to generate a candidate assembly and correcting recognition errors according to selected candidates or input correct data to obtain a final recognition result; and the interaction module is used for sending data input by a user to the acoustic model and the language model selection module and feeding back the recognition result of the word candidate generation and error correction module to the user.
Owner:INST OF COMPUTING TECH CHINESE ACAD OF SCI

Selective, contextual review for documents

A method, apparatus and computer-usable medium for the selective, contextual retrieval and presentation of information used in review processes. Search criteria is entered for information to be retrieved for review and information source documents are then searched, maintaining references to the contextual levels encountered (e.g., chapter, section, sub-section, etc.). Once the search is completed, a hierarchical list of sentence extracts matching the search criteria and indicating their location in their respective source document is presented to the reviewer. The user can then select any level of the hierarchical view, which then displays an expanded view of the related section of the source document. The retrieved information can be exported to a plurality of formats, which are annotatable by the reviewer. Users are thereby provided with information relevant to the subject under review, structured and presented in the context of its usage within its associated source document.
Owner:IBM CORP

Aspect-level sentiment analysis method and device based on graph convolutional neural network

Active | CN112528672A | Make up for the inaccurate defect of extracting syntactic features | Improve accuracy | Semantic analysis | Neural architectures | Feature extraction | Algorithm
The embodiment of the invention provides an aspect-level sentiment analysis method and device based on a graph convolutional neural network. The method comprises the steps of acquiring sentences to be subjected to aspect sentiment analysis and aspect words in the sentences to be subjected to aspect sentiment analysis; preprocessing the sentences and the aspect words to be subjected to aspect sentiment analysis to obtain input vector sequences and syntactic weighted graphs corresponding to the sentences to be subjected to aspect sentiment analysis; and inputting the input vector sequence and the syntax weighted graph into a pre-trained dual-graph convolutional neural network to obtain an emotion analysis result corresponding to the aspect word. According to the embodiment of the invention, the dual-graph convolutional neural network not only pays attention to syntactic features of the sentences, but also pays attention to semantic features of the sentences and extracts semantic related features corresponding to the sentences, so that the defect that syntactic feature extraction of sentences insensitive to syntactics is inaccurate is overcome, and the accuracy of emotion analysis results is improved.
Owner:BEIJING UNIV OF POSTS & TELECOMM

Tendency analysis method of public opinion based on word2vec

The invention provides a tendency analysis method of a public opinion based on word2vec. The method comprises a vector training phase, a key sentence extraction phase and a tendency discrimination phase; by extracting news key sentences, the discriminant feature space is reduced, contents with relatively large relevance to the original theme are retained, useless information is eliminated and the accuracy of tendency analysis of the public opinion is improved; a deep learning model word2vec is introduced into the tendency analysis of the public opinion, and used for comparing the semantic similarity between words through word vectors, so that words with the same emotional tendency but not in an emotion dictionary can be well identified, thus even if the emotion dictionary is not complete, a better analysis effect can be obtained, and meanwhile, weighted calculation is performed on the emotional tendency of the key sentences by fusing a grammatical rule; and combined with contextual information, the limitation of simply using the semantic similarity is compensated, and the tendency is analyzed integrally from the sentences, so that the emotional tendency and emotional intensity of news texts at the text level are accurately discriminated.
Owner:TONGJI UNIV

Cascading-type composition generating method

The invention relates to a cascading-type composition generating method. The technical purpose is to make up for the insufficiencies in the prior art that only composition scoring is studied, that there are no studies on a composition generating method yet, and that it is hard to analyze a title of a composition through an existing subject analysis technology. One or more topic words are utilized to represent a main idea of a to-be-generated composition; after the topic words are obtained, composition generation is broken up into topic word expansion, sentence extraction and text organization; after the topic words are expanded, a sentence extracting module is utilized to look for sentences relevant to the topic words, and finally a text organization module is utilized to rank the extracted sentences, so that the sentences become a coherent whole. By means of the cascading-type composition generating method, phrases can also be mined from an extracted sentence set to supplement existing topic words. The cascading-type composition generating method is applicable to automatically generating compositions.
Owner:语仓科技(北京)有限公司

Artificial intelligence-based title rewriting processing method, device and readable medium

The invention provides an artificial intelligence-based title rewriting processing method, device and readable medium. The method comprises steps of acquiring a feature expression of each sentence of an article, extracting supporting sentences of the article according to the feature expression of each sentence and a pre-trained supporting sentence extraction model, generating a candidate title corresponding to the supporting sentences of the article according to the supporting sentences of the article and a pre-trained title generation model, and determining whether to use the candidate title to rewrite an original title of the article according to the original title of the article, the candidate title and a pre-trained click rate pre-estimation model. The feature expression of the sentence comprises sentence information features and similarity features between the sentences and the original title of the article. According to the technical scheme, title quality after the rewriting process can be improved as long as the title of the article is rewritten, and the recall rate of the article with the rewritten title can be improved, so practical title rewrite demand can be met.
Owner:BEIJING BAIDU NETCOM SCI & TECH CO LTD

Method and apparatus for searching similar sentences

An apparatus for searching similar sentences that has a translation sentence database includes an input unit to which a sentence is input; first language processing unit configured to perform language processing on sentences input through the input unit; and first language similarity calculating unit configured to refer to previously translated sentences to extract similar sentences for the first language sentence. Further, the apparatus includes translating unit configured to translate a sentence into a second language sentence; second language processing unit configured to perform language processing on a second language sentence; second language similarity calculating unit configured to refer to the previously translated sentences to extract similar sentences for the second language sentence; and a re-ranking unit configured to combine similar sentence extracting results of the first language with those of the second language to re-rank sentence outputs.
Owner:ELECTRONICS & TELECOMM RES INST

Automatic metaphor rhetoric sentence analysis and judgment method based on part of speech, syntax and dictionary

The invention discloses an automatic metaphor rhetoric sentence analysis and judgment method based on part of speech, syntax and a dictionary. A randomly input sentence is taken as a processing object, and the method comprises the following steps: (1) segmenting words and carrying out part-of-speech tagging; (2) deleting a modifier based on syntactic analysis; (3) deleting redundant ingredients based on a simple subordinate clause; (4) deleting redundant ingredients based on a metaphoric word; (5) reducing a range of a candidate body and a candidate metaphorical object by virtue of dependency relation; (6) screening the candidate body and the candidate metaphorical object according to the dependency relation formed by corresponding words of root nodes; (7) extracting the candidate body and the candidate metaphorical object according to a simple metaphoric sentence extraction rule; and (8) realizing automatic analysis and judgement on a metaphor rhetoric sentence based on automatic judgement of a metaphor rhetoric technique. The method disclosed by the invention is high in automation degree and high in judgement accuracy rate and can be widely applied to automatic metaphor rhetoric analysis and judgment systems of the fields of natural language deep understanding, machine translation, computer-assisted instruction and the like.
Owner:南城县工业与科技创新投资发展集团有限公司

Traditional Chinese medicine medical case naming identification method and system based on multi-feature template correction

The invention discloses a traditional Chinese medicine medical case naming identification method and system based on multi-feature template correction. The method comprises the following steps: carrying out sentence extraction on a traditional Chinese medicine medical case text; classifying extracted sentences; carrying out word segmentation processing on each category of sentences; carrying out character feature, part of speech feature, left demonstrative word feature, right demonstrative word feature and term feature annotation on each word obtained by word segmentation in sequence; constructing training corpora; formulating a feature template; independently inputting the obtained corpora and the feature template into a conditional random field model, and training the conditional random field model to obtain the trained conditional random field model; constructing corpora to be predicted for a traditional Chinese medicine medical case to be predicted; taking constructed identification corpora as an input, inputting into the trained conditional random field model, and outputting a traditional Chinese medicine medical case category and a character position; and finally, according to the traditional Chinese medicine medical case category and the character position, identifying the four diagnostic methods, the pattern of syndrome and the therapeutic method of the traditional Chinese medicine.
Owner:山东管理学院

Chinese web page text deduplication system and method

The invention discloses a Chinese web page text deduplication system and a Chinese web page text deduplication method. The deduplication system comprises an index server and a search server, wherein the index server comprises a web page text preprocessing module, a combined characteristic sentence extraction module and a digital signature calculation module; and the search server comprises a web page text capture module and a Hash query module. The deduplication method comprises the following steps of: normalizing a web page text; extracting a combined characteristic sentence of the text; calculating a digital signature of the combined characteristic sentence; and comparing the digital signature with the existing digital signature in a Hash table, and judging whether the digital signature is duplicated or not. By the deduplication system and the deduplication method, a search engine can quickly and accurately determine and remove a large number of Chinese web pages with duplicated contents in the Internet; and when the search engine captures a new web page, the digital signature of the web page is calculated and compared with the digital signature of the web page, which has been stored by the search engine, whether the web page is duplicated or not is judged, and the web page is not stored if the web page is duplicated, so that the waste of a storage space is avoided, and the search accuracy of the search engine is improved simultaneously.
Owner:SHENGLE INFORMATION TECH SHANGHAI
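
The deduplication steps in the abstract above (normalize, extract a characteristic sentence, compute a signature, look it up in a hash table) can be sketched as follows. The abstract does not say how the combined characteristic sentence is chosen or which digest is used, so taking the longest sentences and hashing them with SHA-1 are assumptions, and the names `signature` and `DedupIndex` are hypothetical.

```python
import hashlib
import re

def signature(text, n_sentences=2):
    """Digest of the n longest sentences of the normalized text."""
    normalized = re.sub(r"\s+", "", text)  # normalization: drop all whitespace
    sentences = [s for s in re.split(r"[。！？.!?]", normalized) if s]
    # Assumption: the longest sentences serve as the "characteristic" ones.
    longest = sorted(sentences, key=len, reverse=True)[:n_sentences]
    return hashlib.sha1("".join(longest).encode("utf-8")).hexdigest()

class DedupIndex:
    """Set-backed stand-in for the hash table of stored signatures."""

    def __init__(self):
        self.seen = set()

    def add_if_new(self, text):
        """Store the page signature; return False when it is a duplicate."""
        sig = signature(text)
        if sig in self.seen:
            return False
        self.seen.add(sig)
        return True
```

A crawler would call `add_if_new` on each fetched page and skip storage whenever it returns `False`. Pages that differ only in whitespace produce the same signature under this sketch.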

System for entity and evidence-guided relation prediction and method of using the same

A system and method for multitask prediction. The system includes a computing device. The computing device has a processor and a storage device storing computer executable code. The computer executable code is configured to: provide a head entity and a document containing the head entity; process the head entity and the document by a language model to obtain head extraction corresponding to the head entity, tail extractions corresponding to tail entities in the document, and sentence extraction corresponding to sentences in the document; predict a head-tail relation between the head extraction and the tail extractions using a first bilinear layer; combine the sentence extraction and a relation vector corresponding to the predicted head-tail relation using a second bilinear layer to obtain a sentence-relation combination; and predict an evidence sentence supporting the head-tail relation using a third bilinear layer based on the sentence-relation combination and attention extracted from the language model.
Owner:BEIJING WODONG TIANJUN INFORMATION TECH CO LTD +1

A text abstract generation method based on a K-means model and a neural network model

The invention discloses a text abstract generation method based on a K-means model and a neural network model, and the method comprises the steps of preprocessing an original text, obtaining single sentences and words through segmentation, inputting the sentences and words into a doc2vec model, and carrying out the training, so as to obtain sentence vectors; determining the number of clustering centers of the original text, inputting the sentence vector into an unsupervised K-means model, and training to obtain a clustering center vector; calculating the Euclidean distance between the clustering center vector and the sentence vector, and extracting a sentence corresponding to the sentence vector closest to the clustering center to serve as a reference abstract; and inputting the original text, the reference abstract and the words into a generative neural network model to generate a text abstract. The method has the beneficial effects that the unsupervised model and the supervised neural network model are combined, so that the generated text abstract can be semantically coherent and is convenient for a user to understand.
Owner:桂林远望智能通信科技有限公司
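
The extraction step in the abstract above picks, for each cluster center, the sentence whose vector is closest in Euclidean distance. The sketch below shows only that step and assumes the sentence vectors and cluster centers have already been computed elsewhere (for instance by doc2vec and K-means, as the abstract says); `nearest_sentences` is a hypothetical name.

```python
import math

def nearest_sentences(sentence_vectors, centers):
    """For each cluster center, return the index of the closest sentence vector."""
    picks = []
    for center in centers:
        # math.dist computes the Euclidean distance between two points.
        best = min(range(len(sentence_vectors)),
                   key=lambda i: math.dist(sentence_vectors[i], center))
        picks.append(best)
    return picks
```

The returned indices identify the sentences that would form the reference abstract passed on to the generative model.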

Event sentence extraction method for financial field

The invention relates to an event sentence extraction method for a financial field. The method comprises the following steps of 1) performing company name identification by utilizing internet search and listed company name information; 2) by comprehensively considering characteristics in four aspects of statement positions, company name information, field verb information, and statement and title similarity, constructing weight expression; and 3) extracting financial event sentences from a sentence set. The invention provides an internet information-based company name identification method; few rules are used, identification is not limited by training corpora, preparations can be made fully for event sentence extraction and event element identification, and the problems of frequent use of abbreviations and serious colloquial phenomenon during company name identification are solved; and sentences are subjected to comprehensive weight calculation in the four aspects of the company name information, the field verb information, the statement and title similarity and the statement positions, and finally the financial event sentences are selected out, so that the financial event sentences can be efficiently identified and extracted.
Owner:CAPITAL NORMAL UNIVERSITY +1
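
The weighting over four features described in the abstract above (statement position, company name, field verb, and title similarity) might look like the sketch below. The weight values, the threshold, the Jaccard similarity, and all function names are assumptions for illustration; the abstract does not give the actual weight expression.

```python
import re

def _words(text):
    return set(re.findall(r"\w+", text.lower()))

def event_sentence_score(sentence, index, total, title, companies, domain_verbs,
                         weights=(0.2, 0.3, 0.3, 0.2)):
    """Weighted sum of four features; the weights are hypothetical."""
    words = _words(sentence)
    position = 1.0 - index / total  # earlier sentences score higher
    has_company = 1.0 if any(c.lower() in sentence.lower() for c in companies) else 0.0
    has_verb = 1.0 if words & {v.lower() for v in domain_verbs} else 0.0
    title_words = _words(title)
    union = words | title_words
    # Jaccard overlap as an assumed sentence-title similarity.
    similarity = len(words & title_words) / len(union) if union else 0.0
    w_pos, w_comp, w_verb, w_title = weights
    return (w_pos * position + w_comp * has_company
            + w_verb * has_verb + w_title * similarity)

def extract_event_sentences(sentences, title, companies, domain_verbs, threshold=0.5):
    """Keep the sentences whose combined weight reaches the threshold."""
    total = len(sentences)
    return [s for i, s in enumerate(sentences)
            if event_sentence_score(s, i, total, title, companies, domain_verbs) >= threshold]
```

In practice the weights and threshold would need tuning on labelled financial news; the values here only make the example concrete.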

Online corpus alignment method and system

The invention discloses an online corpus alignment method and system. The method comprises the steps of analyzing a bilingual inter-translated file to obtain a result file; performing paragraph adjustment on the result file to enable paragraphs between an original text and a translated text to correspond; automatically performing sentence segmentation on the original text and the translated text through a preset sentence segmentation rule to obtain original text sentences and translated text sentences, and performing calculation according to a preset arrangement rule to obtain arrangement combinations of the original text sentences and the translated text sentences; and calculating sentence similarity corresponding to each arrangement combination of the original text sentences and the translated text sentences, and selecting the arrangement combination with the maximum similarity as a final sentence-sentence alignment result. According to the method and the system, the accuracy of alignment can be improved.
Owner:上海一者信息科技有限公司

Multi-feature fusion Chinese-Vietnamese news viewpoint sentence extraction method

The invention relates to a multi-feature fusion Chinese-Vietnamese news viewpoint sentence extraction method, and belongs to the technical field of natural language processing. Firstly, a cross-language representation learning method is adopted to construct a Chinese-Vietnamese bilingual word embedding model; then the feature weights of the topic, emotion and position of each sentence are calculated, and the feature weight information is fused into a coding layer and an attention mechanism to obtain a representation of the sentence in the aspects of topic, emotion, position and the like. Finally, viewpoint sentence classification is carried out according to the obtained sentence representation. Aiming at the problem that Chinese and Vietnamese annotation resources are unbalanced, a Chinese-Vietnamese bilingual word embedding model is constructed; the weights of the topic, position and sentiment features of the sentences are then calculated respectively, the sentence weights are fused into the word vectors and the attention mechanism respectively, and sentence semantic information is combined with the sentiment, topic and position features, so that the accuracy of extracting Chinese-Vietnamese news viewpoint sentences can be effectively improved.
Owner:KUNMING UNIV OF SCI & TECH

Method and system for processing text semantics by utilizing image processing technology and semantic vector space

The invention belongs to the technical field of text semantic information processing, and in particular relates to a method and a system for processing text semantics by utilizing an image processing technology and semantic vector space. The system comprises a text input and preprocessing module, a semantic vector construction module, a semantic information processing module and a semantic processing result display module, wherein the semantic information processing module is specifically used for semantic turning sentence extraction, semantic noise sentence detection, semantic range tracking and semantic scene segmentation. According to the method and the system, a text unit is mapped to a pixel in an image, and a semantic vector which describes the text unit is taken as pixel grayscale of the image, so that various technologies and methods in an image processing field can be introduced to process a text flexibly and intuitively without the influence of the diversification of word forms; meanwhile, the semantic vector is constructed by instructing a Word2Vec method, so that the lightweight of algorithms is ensured to meet the requirements on real-time application.
Owner:SHANGHAI JILIAN NETWORK TECH CO LTD

Text-summarization generating method based on neural network

The invention provides a text-summarization generating method based on a neural network. The text-summarization generating method based on the neural network includes the steps that an input document is subjected to word segmentation and vectorization expression, and word vectors are obtained; all obtained word vectors of all sentences are input into a first layer of a first recurrent neural network in sequence, and state vectors of the sentences after current word vectors of the sentences are input are obtained, wherein the state vectors of the corresponding sentences after the last word vectors of all the sentences are input represent sentence vectors of the sentences; all the sentence vectors are input into a second layer of the first recurrent neural network in sequence, and corresponding document state vectors after all the sentences are input into the document are obtained, wherein the corresponding document state vector after the last sentence is input is a state vector of the whole document; expression of the input document is decoded through a second recurrent neural network, and summarization is generated. According to the text-summarization generating method based on the neural network, the cost problem when summarization is manually generated is solved, and meanwhile the information fragmentation problem and the information ambiguity problem which are caused by a sentence extraction method are solved.
Owner:北京牡丹电子集团有限责任公司数字科技中心

Key sentence extraction method and device for text paragraph

The invention provides a key sentence extraction method and device for a text paragraph, and relates to the technical field of data processing. The key sentence extraction method comprises the following steps: carrying out word segmentation on each line of text of each text paragraph in a corpus to obtain a first word segmentation result; selecting effective vocabularies from the first word segmentation result; selecting a key paragraph from each text paragraph according to the composition structure of the text paragraph; classifying the key paragraphs according to the effective vocabularies to obtain a plurality of classification categories; determining a target keyword of each classification category and the weight of each target keyword; and extracting a key sentence of each key paragraph according to the target keyword and the weight of the target keyword. According to the technical scheme, for the customer service industry, classification of customer service question and answer data and extraction of user demands and purposes are achieved, the period of customer service knowledge accumulation is greatly shortened, and the cost of customer service knowledge accumulation is reduced, and meanwhile a complete problem set is provided for subsequent intelligent customer service.
Owner:HANGZHOU WEIMING XINKE TECH CO LTD +1

Information processing apparatus, information processing method, and program

An information processing apparatus includes a category classifying unit configured to classify a document into one or more categories, a word extracting unit configured to extract one or more words from the document, a word score calculating unit configured to calculate a word score for each of the one or more words extracted from the document on the basis of an appearance frequency of the word in each of the one or more categories, the word score serving as an index of interest of the word, a sentence-for-computation extracting unit configured to extract one or more sentences from the document, and a sentence score calculating unit configured to calculate a sentence score for each of the extracted one or more sentences on the basis of the word score calculated by the word score calculating unit, the sentence score serving as an index of interest of the sentence.
Owner:SONY CORP

A cloze-style reading comprehension analysis model and method based on reinforcement learning

The invention discloses a cloze-style reading comprehension analysis model and method based on reinforcement learning. The model comprises a coding layer, which is used for vectorizing words of an original text, coding the words, taking a hidden vector of the last word of each sentence, outputting the hidden vector as a sentence vector, coding the text into a sequence of sentence vectors, and transmitting the sequence to a sentence extraction layer; a sentence extraction layer which is used for selecting sentence vectors, taking obtained sentences as current given text segments and encoding the current given text segments; a classification layer which takes each vacancy to be filled as a problem, takes the obtained text segment codes and the word vectors of the four candidate words as input, and calculates an output probability through a multi-feature classification network; a prediction layer which is used for normalizing the probability value obtained by the upper layer and the probability value of the language model to obtain the probabilities of the four final options; and an output layer which is used for calculating the cross entropy between the probability obtained by the previous layer and the actual probability, optimizing the classification network and updating the parameters of the network by taking the loss value as a delay reward.
Owner:SUN YAT SEN UNIV

Document summary extracting method based on data reconstruction

The invention discloses a document summary extracting method based on data reconstruction. The document summary extracting method comprises the steps of: obtaining a document from a document databank to be used as an objective document, wherein the summary of the objective document is to be extracted; aiming at each objective document, extracting all sentences of the document to be used as a standby sentence library of the summary of the document; counting the weight information of all keywords in all documents, and expressing each sentence in the standby sentence library into a vector; selecting optimal summary sentences which both contain the main idea of the document and contain the less redundant information from the standby sentence library according to a data reconstruction algorithm; and extracting the selected sentences to form the summary of the objective document. The method has the advantages that a user, particularly the disabled users with visual disturbance, can be helped to understand the main content of the original document rapidly in a mode that the summary contains fewer words.
Owner:ZHEJIANG UNIV

Method for intelligently analyzing Chinese character emotional tendency through computer

The invention discloses a method for intelligently analyzing a Chinese character emotional tendency through a computer. The method is characterized by comprising reading Chinese character paragraph files, segmenting the Chinese character paragraph files and performing word segmentation, part-of-speech tagging and syntactic dependency relation marking on the segments to form extensible markup language (XML) files; reading the XML files, traversing the sentences to extract syntactic dependency relation pairs, and assigning values to the extracted words based on a dictionary, wherein a word in a positive word dictionary is assigned 1, and a word in a negative word dictionary is assigned -1; degree adverbs are divided into 5 grades according to degree and assigned 1.8, 1.5, 1.2, 0.9 and 0.5 respectively; and negative adverbs are divided into two grades, -1 and -1.5, according to negative degree; traversing the dictionary according to the formula: emotional score = negative words * adverb sum * adjectives, and obtaining the emotional score of the Chinese character paragraph files; and judging the emotional tendency of the Chinese character paragraph files according to the emotional score.
Owner:SUZHOU LIANGJIANG TECH
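
The scoring rule in the abstract above multiplies a negation value, a degree-adverb value and a word polarity. A minimal sketch follows, using the numeric grades quoted in the abstract. The English stand-in dictionaries, the assignment of particular words to grades, and the neutral factor of 1.0 when no negation or degree adverb is present are all assumptions for this example.

```python
# Hypothetical mini-dictionaries; English stand-ins for the Chinese lexicons.
POLARITY = {"good": 1, "happy": 1, "bad": -1, "sad": -1}
DEGREE = {"extremely": 1.8, "very": 1.5, "rather": 1.2, "fairly": 0.9, "slightly": 0.5}
NEGATION = {"not": -1.0, "never": -1.5}

def pair_score(negation, degree, word):
    """Score one (negation, degree adverb, sentiment word) triple."""
    neg = NEGATION.get(negation, 1.0)  # assumed neutral factor when absent
    deg = DEGREE.get(degree, 1.0)      # assumed neutral factor when absent
    return neg * deg * POLARITY.get(word, 0)

def paragraph_score(triples):
    """Sum the triple scores; the sign gives the emotional tendency."""
    return sum(pair_score(*t) for t in triples)
```

For instance, the triples `(None, "very", "good")` and `("not", None, "bad")` contribute 1.5 and 1.0 respectively, so the paragraph leans positive.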

Automatic composition classification method for primary school based on TextRank and convolution neural network

Active | CN109062958A | Avoid vector dimensions that are too high | Avoid problems with sparsity | Neural architectures | Special data processing applications | Informatization | Data set
The invention belongs to the field of educational informatization and provides an automatic composition classification method for primary school based on TextRank and a convolution neural network. Firstly, the TextRank-based key sentence extraction model is used to extract key sentences from various compositions to remove redundant semantic information, and then the convolution neural network is used to extract fixed-length text feature vectors for training classifiers and predicting text categories. The method of the invention uses the TextRank algorithm to eliminate redundant information in the data set in advance, and reduces interference information of long text compared with other deep learning methods; the feature selection of the method of the invention is automatically completed, and the efficiency is improved compared with the traditional machine learning method.
Owner:HUAZHONG NORMAL UNIV

Machine translation method and apparatus

A machine translation method includes translating a source sentence using a first model, determining a back-translation probability of a translation result of the source sentence being back-translated into the source sentence using a second model, applying the back-translation probability to context information extracted from the source sentence in the first model, and retranslating the source sentence using the first model and the context information to which the back-translation probability is applied.
Owner:SAMSUNG ELECTRONICS CO LTD

Method applied in screenplay for analyzing emotion curves

The invention discloses a method applied in a screenplay for analyzing emotion curves. The method comprises the steps of 1, constructing emotion dictionaries including a positive dictionary, a negative dictionary and a neutral dictionary; 2, conducting scene division and sentence extraction on the screenplay according to scenes to obtain scene statements and character statements; 3, conducting word segmentation on the scene statements and the character statements to obtain scene words and character words; 4, conducting lexical division, word-frequency statistics and weight determination on the scene words and the character words; 5, inputting word frequencies and weight values into a calculation formula to obtain a scene emotion index and a character emotion index; 6, drawing the scene emotion curve and the character emotion curve of the screenplay. The method applied in the screenplay for analyzing the emotion curves has no requirement for working experience and provides convenience for calculation.
Owner:逄泽文玥