1573 results about "Text processing" patented technology

In computing, the term text processing refers to the theory and practice of automating the creation or manipulation of electronic text. Text usually refers to all the alphanumeric characters specified on the keyboard of the person engaging the practice, but in general text means the abstraction layer immediately above the standard character encoding of the target text. The term processing refers to automated (or mechanized) processing, as opposed to the same manipulation done manually.

Front-end architecture for a multi-lingual text-to-speech system

A text processing system for processing multi-lingual text for a speech synthesizer includes a first language dependent module for performing at least one of text and prosody analysis on a portion of input text comprising a first language. A second language dependent module performs at least one of text and prosody analysis on a second portion of input text comprising a second language. A third module is adapted to receive outputs from the first and second language dependent modules and performs prosodic and phonetic context abstraction over the outputs based on multi-lingual text.
Owner:MICROSOFT TECH LICENSING LLC

Graph-based ranking algorithms for text processing

The present invention provides a method of processing at least one natural language text using a graph. The method includes determining a plurality of text units based upon the natural language text, associating the plurality of text units with a plurality of graph nodes, and determining at least one connecting relation between at least two of the plurality of text units. The method also includes associating the at least one connecting relation with at least one graph edge connecting at least two of the plurality of graph nodes and determining a plurality of rankings associated with the plurality of graph nodes based upon the at least one graph edge. The method can also include a graphical visualization of at least one important text unit in a natural language text or collection of texts. Methods for word sense disambiguation, keyword extraction, and sentence extraction are also provided.
Owner:NORTH TEXAS UNIV OF
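
The ranking step described in this abstract can be illustrated with a small PageRank-style iteration over a co-occurrence graph. This is only a sketch under stated assumptions, not the patented method: the function name `rank_text_units`, the co-occurrence window used as the connecting relation, and the damping factor of 0.85 are all choices made here for illustration.

```python
from collections import defaultdict


def rank_text_units(units, window=2, damping=0.85, iterations=50):
    """Rank text units with a PageRank-style iteration (illustrative only)."""
    # Graph nodes are the distinct text units. An undirected edge connects
    # two different units that occur within `window` positions of each other.
    neighbors = defaultdict(set)
    for i, unit in enumerate(units):
        for other in units[i + 1 : i + window + 1]:
            if other != unit:
                neighbors[unit].add(other)
                neighbors[other].add(unit)

    nodes = sorted(set(units))
    scores = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        new_scores = {}
        for node in nodes:
            # Each neighbor shares its score equally among its own neighbors.
            incoming = sum(scores[n] / len(neighbors[n]) for n in neighbors[node])
            new_scores[node] = (1 - damping) + damping * incoming
        scores = new_scores
    # Highest-ranked units first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


words = "graph ranking text graph node text graph edge".split()
ranking = rank_text_units(words)
```

With this toy input, "graph" and "text" have the most connections and therefore receive the two highest scores, which is the behavior a keyword extraction use of such a ranking would rely on.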

System and method for use in text analysis of documents and records

Methods and systems are provided that enable text in various sections of data records to be separately catalogued, indexed, or vectorized for analysis in a text visualization and mining system. A text processing system receives a plurality of data records, where each data record has one or a plurality of attribute fields associated with the records. The attribute fields containing textual information are identified. The specific textual content of each attribute field is identified. An index is generated that associates the textual content contained in each attribute field with the attribute field containing the textual content. The index is operable for use in text processing. The plurality of data records may be located in a data table and the textual information may be contained within cells of the data table. In another aspect, a plurality of data records is received, where at least some of the data records contain text terms. A first method is applied to weight text terms of the data records in a first manner to aid in distinguishing records from each other in response to selection of the first method. A second method is applied to weight text terms of the data records in a second manner to aid in distinguishing records from each other in response to selection of the second method. A vector is generated to distinguish each of the data records based on the text terms weighted by either the first or second method.
Owner:BATTELLE MEMORIAL INST
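
The idea of weighting text terms by one of two selectable methods and then building a vector per record can be sketched as follows. The abstract does not name the two methods, so term frequency and TF-IDF are assumptions chosen here purely for illustration, as is the function name `weight_records`.

```python
import math
from collections import Counter


def weight_records(records, method="tf"):
    """Return one {term: weight} vector per record using the selected method."""
    tokenized = [record.lower().split() for record in records]
    # Document frequency: the number of records that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        if method == "tf":
            # First (assumed) method: relative term frequency within the record.
            vector = {t: c / len(tokens) for t, c in counts.items()}
        elif method == "tfidf":
            # Second (assumed) method: term frequency scaled by inverse
            # document frequency, so terms shared by all records weigh zero.
            vector = {
                t: (c / len(tokens)) * math.log(len(records) / doc_freq[t])
                for t, c in counts.items()
            }
        else:
            raise ValueError(f"unknown weighting method: {method}")
        vectors.append(vector)
    return vectors


records = ["engine report", "pump report"]
tf_vectors = weight_records(records, method="tf")
tfidf_vectors = weight_records(records, method="tfidf")
```

In this example "report" appears in every record, so the second method gives it a weight of zero while "engine" and "pump" keep positive weights. That is one way a selectable weighting can help distinguish records from each other.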

Semantically-driven extraction of relations between named entities

A system and method of developing rules for text processing enable retrieval of instances of named entities in a predetermined semantic relation (such as the DATE and PLACE of an EVENT) by extracting patterns from text strings in which attested examples of named entities satisfying the semantic relation occur. The patterns are generalized to form rules which can be added to the existing rules of a syntactic parser and subsequently applied to text to find candidate instances of other named entities in the predetermined semantic relation.
Owner:XEROX CORP

Advanced databasing system for chemical, molecular and cellular biology

The present invention relates to systems and methods for biomedical drug research, addressing major molecular, cell biological and biochemical information management issues within drug discovery and basic biomedical science. The invention allows scientists to enter biological, chemical, and/or molecular data into a central database, analyze the data entries according to entry attributes, and graphically view the results. A group of web-enabled researchers can enter, share and analyze molecular and cellular data and information from the resources using standardized vocabularies and ontologies. This application describes in detail components of the databasing system, including but not limited to annotation modules, reference managers, advanced search algorithms, ontology browsers, molecular network builders, and text processing scripts. Ultimately, the information gathered, viewed, and analyzed by this relational databasing system is relevant to work ranging from basic research to advanced research in applied technologies within the pharmaceutical development and biotech fields.
Owner:COGNIA CORP

Text-based query expansion and sort method in image retrieval

Inactive | CN101901249A | Guaranteed a high degree of commonality | Improve accuracy | Special data processing applications | Data set | Image retrieval
The invention belongs to the field of multimedia information retrieval and relates to a method for realizing thesaurus-based query expansion and sort in image retrieval. The method comprises a WordNet-based English word semantic similarity metric algorithm, a HowNet-based Chinese word semantic similarity metric algorithm, an expansion rule-based query expansion word selection and optimization algorithm and a retrieval result evaluation and optimization algorithm. In the method, an image search engine is improved by the relevant text processing method and the relevant semantic network dictionary; and the retrieval result is sorted through semantic expansion, user interaction and improved similarity measurement. Compared with the traditional method, the method has the advantages of high accuracy rate, high integrality and low space-time cost. The method has very important significance for performing high-efficiency image retrieval according to image high-layer semantic information and on the basis of a large-scale image data set, and has wide application value in the field of cross-linguistic and cross-media retrieval.
Owner:FUDAN UNIV

Text processing method and device based on ambiguity entity words

The invention provides a text processing method and device based on ambiguous entity words. The method comprises: obtaining the context of a text to be disambiguated and at least two candidate entities that the text may represent; generating a semantic vector of the context through a trained word vector model; generating a first entity vector for each of the at least two candidate entities through a trained unsupervised neural network model; calculating the similarity between the context and each candidate entity; and determining the target entity that the text to be disambiguated represents in the context. Because the unsupervised neural network model has learned both the text semantics of entities and the relationships between entities, the first entity vector generated for each candidate entity includes the text semantics of that entity and its relations to other entities, so the entity information of the text to be disambiguated is described completely. The similarity is then calculated against the context semantic vector and the target entity is determined, which improves the accuracy of disambiguation.
Owner:BEIJING BAIDU NETCOM SCI & TECH CO LTD
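
The final selection step, comparing a context vector with each candidate entity vector and keeping the most similar one, can be sketched as below. The abstract does not say which similarity measure is used, so cosine similarity is an assumption, and the vectors are hand-written toy values standing in for the outputs of the trained models. The names `cosine_similarity` and `pick_entity` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pick_entity(context_vector, candidate_vectors):
    """Return the candidate entity whose vector is most similar to the context."""
    return max(
        candidate_vectors,
        key=lambda entity: cosine_similarity(context_vector, candidate_vectors[entity]),
    )


# Toy vectors: in the described method these would come from a word vector
# model (context) and an unsupervised neural network model (entities).
context = [0.9, 0.1, 0.0]
candidates = {
    "Apple (company)": [1.0, 0.2, 0.0],
    "apple (fruit)": [0.0, 0.3, 1.0],
}
target = pick_entity(context, candidates)
```

Here `target` is "Apple (company)", because its vector points in nearly the same direction as the context vector.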

System and method for text structuring and text generation

The present invention provides a computer-implemented system and method of text processing. The system and method include analyzing selected text units of a digitally coded parsed text file to determine text entities, determine the interconnections between the text entities, test the validity of the text entities, and determine a quantitative measure of the significance of each text entity. A multigranular relational text structure is constructed which incorporates the text entities. Output text is generated from the relational text structure using entity grouping rules. The text file is parsed using a system of natural dividers. The text units are selected from the parsed text file using windowing and scanning. The output text generated conforms to user constraints which can include the volume of output text to be generated, keywords to be reflected in the output text, and the level of generalization of the output text.
Owner:COGNISPHERE

Speech synthetic method based on rhythm character

Inactive | CN101000765A | Effective syncopation | Natural sound | Speech synthesis | Syllable | Structure analysis
A method for synthesizing speech based on rhythm characteristics includes: a text processing program formed by a text standardizing step, a rhythm structure analysis step and a language treatment step; a synthetic element selecting program formed by an element confirming step, a matching step, a pasting-up step, and an optimizing and screening step; and a voice synthesis processing program formed by a base frequency outline generating step for the phrase unit, a base frequency outline generating step for the syllable unit and an intonation superposing step.
Owner:HEILONGJIANG UNIV

Method and apparatus for processing text and character data

An apparatus and method for processing text or character data are disclosed. A text processing system receives a character input string and determines whether to apply character processing. A non-English language such as Italian can be entered into a processing system such as a computer using a standard English based keyboard such that additional keys for providing accents or other grammatical and punctuation symbols or characters not existing in English are not required. In one mode, text is automatically accented or punctuated without requiring user intervention. In another mode, a user is provided with a list of accent or punctuation choices so that the user may select the optimum accent or punctuation. Text processing of an input may be activated by a text sequence including a possible vowel accent or apostrophe error, and may continue as an input method editor loop in response to repeated actuations of the key associated with the first activation event. When an activator event input is detected, a rules based system is utilized to select a correctly accented and punctuated character. A list of alternative accents and punctuations is optionally displayed, and a user may toggle through the list using the activator event to select a desired character. The display provides information for a level of certainty of a selected character or word.
Owner:CLOANTO CORP

Electronic health record system and method for patient encounter transcription and documentation

A patient encounter documentation and analytics system includes a mobile computing platform and a server-based host platform. A mobile application in tandem with a wireless microphone collects voice signals during a patient-caregiver encounter, transforms the voice signals into audio data files, and uploads the audio data files to the server. A speech recognition software module digitally transcribes the audio data file into text. A text processing module extracts and organizes relevant clinical data based on keyword, key phrase and question / answer analysis. Relevance of words and phrases may be determined in view of, e.g., their presence, frequency and context. A diagnostic decision support module enables the healthcare provider to review the determined clinical information and provide a diagnosis associated with the encounter. A documentation skeleton module extracts diagnosis-specific text components from the transcribed text file and assembles an electronic medical document based on the diagnosis and the diagnosis-specific text components.
Owner:VOICEHIT

Text segmentation with multiple granularity levels

Text processing includes: segmenting received text based on a lexicon of smallest semantic units to obtain medium-grained segmentation results; merging the medium-grained segmentation results to obtain coarse-grained segmentation results, the coarse-grained segmentation results having coarser granularity than the medium-grained segmentation results; looking up in the lexicon of smallest semantic units respective search elements that correspond to segments in the medium-grained segmentation results; and forming fine-grained segmentation results based on the respective search elements, the fine-grained segmentation results having finer granularity than the medium-grained segmentation results.
Owner:ALIBABA GRP HLDG LTD
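
The three granularity levels can be sketched with a toy lexicon. Everything concrete here is an assumption made for illustration: the abstract does not specify the matching algorithm (forward maximum matching is used below), the merging rule (a hypothetical set of known phrases), or the lexicon contents.

```python
# Hypothetical lexicon of smallest semantic units; each entry maps to the
# search elements it contains.
LEXICON = {
    "machine": ["machine"],
    "learning": ["learning"],
    "textbook": ["text", "book"],
}
# Hypothetical pairs of adjacent segments that may be merged into one.
PHRASES = {("machine", "learning")}


def segment_medium(text, lexicon):
    """Forward maximum matching; unknown characters become single segments."""
    max_len = max(len(entry) for entry in lexicon)
    segments, pos = [], 0
    while pos < len(text):
        for size in range(min(max_len, len(text) - pos), 0, -1):
            piece = text[pos : pos + size]
            if piece in lexicon or size == 1:
                segments.append(piece)
                pos += size
                break
    return segments


def merge_coarse(segments, phrases):
    """Merge adjacent medium-grained segments that form a known phrase."""
    merged, i = [], 0
    while i < len(segments):
        pair = tuple(segments[i : i + 2])
        if pair in phrases:
            merged.append("".join(pair))
            i += 2
        else:
            merged.append(segments[i])
            i += 1
    return merged


def split_fine(segments, lexicon):
    """Replace each medium-grained segment with its search elements."""
    fine = []
    for segment in segments:
        fine.extend(lexicon.get(segment, [segment]))
    return fine


medium = segment_medium("machinelearningtextbook", LEXICON)
coarse = merge_coarse(medium, PHRASES)
fine = split_fine(medium, LEXICON)
```

For this input, `medium` is `['machine', 'learning', 'textbook']`, `coarse` is `['machinelearning', 'textbook']`, and `fine` is `['machine', 'learning', 'text', 'book']`, so both the coarse and the fine results are derived from the same medium-grained pass.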

Mobile telephone having a rotator input device

A mobile telephone has a display (240) and a rotator input device (250) comprising a rotatable element and capable of generating commands for browsing and selecting objects on the display. It also has a wireless telecommunication interface to a mobile telecommunications network. A processing device is coupled to the display, the rotator input device and the wireless telecommunication interface. A text-handling software application is executable by the processing device. The processing device is configured, in a first operating mode, to provide first user input by way of the rotator input device (250), said first user input including a number sequence representative of a desired telephone number which is to be reached over the mobile telecommunications network, and to use said first user input when establishing a telephone call connection through the wireless telecommunication interface. Moreover, the processing device is configured, in a second operating mode, to provide second user input by way of the rotator input device, said second user input including a character sequence representative of a desired text, and to forward said second user input to the text-handling software application. No numeric or alphanumeric character keyboard is involved in either of the first and second operating modes.
Owner:NOKIA CORP

Evaluation method and apparatus based on text analysis, and storage medium

Aspects of the disclosure provide an information processing apparatus that includes interface circuitry and processing circuitry. The interface circuitry is configured to obtain a text authored by a person. The processing circuitry is configured to analyze the text to obtain measurements of language features of the person, input the measurements of the language features into an evaluation model that is trained to predict a score as a function of the language features, determine a specific score for the person based on the evaluation model and output the specific score of the person for predicting a behavior of the person.
Owner:TENCENT TECH (SHENZHEN) CO LTD

Method and apparatus for processing text and character data

Methods and systems for processing text or character data are disclosed. A text processing system receives a character input string and determines whether to apply character processing. A non-English language such as Italian can be entered into a processing system such as a computer using a standard English based keyboard such that additional keys for providing accents or other grammatical and punctuation symbols or characters not existing in English are unnecessary. In one mode, text is automatically accented or punctuated without requiring user intervention. In another mode, a user is provided with a list of accent or punctuation choices so that the user may select the optimum accent or punctuation. Text processing of an input may be activated by a predefined activator key pressed in a predetermined sequence, or may be activated in the event a predetermined sequence of characters is received.
Owner:CLOANTO CORP

Method for text error correction after voice recognition based on domain identification

The invention belongs to the field of voice recognition text processing and discloses a method for text error correction after voice recognition based on domain identification. It aims to solve the problem that processing methods in the prior art need a lot of manual intervention, are low in error correction efficiency and cannot correct proper names. The method comprises the following steps: (a) error detection and analysis are conducted on texts obtained after voice recognition, and the domain to which the text sentences belong is preliminarily determined; (b) sentences to undergo error correction are segmented according to predefined syntax rules and are divided into redundancy portions and core portions; (c) a search engine is utilized to perform fuzzy character string matching and determine candidate specific word bank sets for the core portions of the sentences; (d) similarity scores are calculated according to edit distances, and error correction is conducted on the redundancy portions and the core portions; (e) the corrected redundancy portions and core portions are fused, and the error correction results are output.
Owner:SICHUAN CHANGHONG ELECTRIC CO LTD
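
Step (d), scoring candidates by edit distance, can be sketched with a standard Levenshtein distance. The normalization into a similarity score, the threshold of 0.6, and the function names are assumptions made for this example; the abstract does not give the scoring formula.

```python
def edit_distance(a, b):
    """Levenshtein distance computed row by row with dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def similarity(a, b):
    """Map edit distance to a score in [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def correct(phrase, candidates, threshold=0.6):
    """Replace `phrase` with its closest candidate if it is similar enough."""
    if not candidates:
        return phrase
    best = max(candidates, key=lambda candidate: similarity(phrase, candidate))
    return best if similarity(phrase, best) >= threshold else phrase


word_bank = ["changhong tv", "channel list"]  # hypothetical domain word bank
fixed = correct("changhung tv", word_bank)
unchanged = correct("weather", ["changhong tv"])
```

Here `fixed` is "changhong tv" (one substitution away from the input), while `unchanged` stays "weather" because no candidate reaches the threshold. A threshold of this kind is one simple way to avoid overcorrecting text that has no close match in the word bank.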

Caption producing system and caption producing method

The invention relates to a caption producing system and a caption producing method based on a video-audio file. The system comprises a video-audio file input unit, a video-audio file processing unit, a text input unit, a text processing unit and a caption processing unit, wherein the video-audio file input unit is used for storing or receiving video-audio files provided by the external; the video-audio file processing unit is used for receiving the video-audio files from the video-audio file input unit and extracting video-audio information from the video-audio files; the text input unit is used for storing or receiving a caption text provided by the external; the text processing unit is used for receiving the caption text from the text input unit, and adjusting the attribute of the caption text so as to generate text attribute information related with caption; and the caption processing unit is used for generating a caption project file based on the video-audio information, the caption text and the text attribute information. The caption producing system and the caption producing method can remarkably improve the flexibility and convenience of caption production and modification and language transformation processes, and realize full compatibility of various film and television work formats and related caption formats simultaneously.
Owner:株式会社康巴思

Speech synthetic text processing method based on rhythm structure

A method for processing voice synthetic text based on rhythm structure includes: comparing the input text with a preset special symbol table to output a legal pronunciation character string; analyzing the legal pronunciation character string according to a word segmentation rule and a rhythm structure analysis rule to output a labeled character string with rhythm structure information; and comparing the labeled character string with a preset rhythm rule and phonetic table word by word to output a phonetic code string labeled with rhythm information.
Owner:HEILONGJIANG UNIV

Method and device for automatically labeling text

The invention discloses a method and a device for automatically labeling a text. The method for automatically labeling the text comprises the following steps: identifying vocabularies in the text; labeling identified vocabularies expressing attribute values into formats corresponding to the types to which the attribute values belong in a knowledge base; labeling identified notional words into notional knowledge in the knowledge base; on the basis of a result of labeling the notional words, labeling identified pronouns into the contents referred to by the pronouns; and on the basis of results of labeling the notional words and the pronouns, labeling identified attribute names into corresponding attribute names in the knowledge base. In the method disclosed by the embodiment of the invention, text is automatically labeled according to the notional knowledge in the knowledge base and is deeply integrated with that knowledge, so that massive structured information in the knowledge base is introduced into conventional text processing applications and reasoning and expansion between the text and the notional knowledge become possible, thereby opening up very wide application prospects.
Owner:BEIJING BAIDU NETCOM SCI & TECH CO LTD

Text processing method and device based on voice identification

The embodiment of the invention provides a text processing method and a device based on voice identification. The method comprises a step of obtaining a first text which is obtained by the voice identification of voice data, a step of punctuating the first text, and obtaining one or more text segments, and a step of adding punctuations to the one or more text segments, and forming a second text through combination. According to the embodiment of the invention, the automatic adding of the punctuations is realized, the manual positioning and punctuation adding of a user are avoided, and the convenience of voice input is improved greatly.
Owner:BEIJING QIHOO TECH CO LTD +1

Text processing method, model training method and device

The invention relates to the field of artificial intelligence, and provides a text processing method and device and a model training method and device, and the method comprises the steps: obtaining target knowledge data which comprises a first noun entity, a second noun entity and a contact between the first noun entity and the second noun entity; processing the target knowledge data to obtain a target knowledge vector; processing a to-be-processed text to obtain a target text vector, the to-be-processed text comprising the first noun entity; fusing the target text vector and the target knowledge vector according to a target fusion model to obtain a fused target text vector and a fused target knowledge vector; and processing the fused target text vector and / or the fused target knowledge vector according to a target processing model to obtain a processing result corresponding to the target task. According to the technical scheme, the accuracy of the target task processing result by the target processing model can be improved.
Owner:HUAWEI TECH CO LTD +1

Text processing system and methods for automated topic discovery, content tagging, categorization, and search

Inactive | US9483532B1 | Easy to find | Efficient and accurate and scalable | Web data indexing | Semantic analysis | Semantic property | Part of speech
A computer system and methods are disclosed for automatically discovering topics and building a hierarchical topic structure, and for tagging and categorizing contents in a document or other natural language contents. The disclosed methods include steps for obtaining terms that best represent the topics in a text content, and building a hierarchical representation of topics of different levels or topic-comment relationships, and folder-subfolder structures. The methods further include obtaining, identifying, and selecting terms representing different degrees of informational importance based on the grammatical roles, parts of speech, and semantic attributes associated with the terms, using the terms to represent topics in the document, to automatically tag the document, to rank search results, and to build a category structure based on the selected terms.
Owner:LINFO IP LLC

System, method, and program for processing text using object coreference technology

Active | US20110295594A1 | Easy to understand | Automatic and comprehensive and accurate and efficient analysis and processing | Semantic analysis | Office automation | Computer science | Coreference
A system, method and program product for text processing using object coreference technology. In particular, the invention provides a text processing method which includes: acquiring text to be processed; extracting subject words and entity words corresponding to the subject words from the text; grouping the subject words; determining entity words that reference a same concerned object according to the grouped subject words; and generating a processing policy for entity words that reference a same concerned object. The invention also includes a system with means for carrying out the method. The invention generally realizes automatic and more comprehensive, accurate and efficient analysis and processing of text data. The invention can be used to mine a large amount of comment data about some entity, and it can also be used to suggest a place in an article where an embedded advertisement can be inserted.
Owner:IBM CORP

N-Gram participle model-based reverse neural network junk mail filter device

The invention relates to the technical field of text processing, in particular to an N-Gram participle model-based reverse neural network junk mail filter device. Customized word characteristic items are added to mail particles by using N-Gram technology, and judgment and filter of junk mails are implemented by combining a reverse neural network. The device is implemented by the following steps of: firstly, processing the mails by using a Markov chain and an N-Gram technique, extracting mail sample characteristics, and obtaining a sample mail word-document space by weight calculation and characteristic selection; secondly, matching a mail sample by using the customized word characteristic items to generate a customized characteristic-document space, and combining the document characteristics generated by the two methods to generate a new mail vector space; thirdly, constructing a reverse neural network model, generating characteristic vectors corresponding to network neurons according to the characteristic items of a mail training sample space, and training the network model by using the mail training sample vector space to obtain a trained mail classifier; and finally, generating a test sample vector space by the mail test sample according to the generated characteristic vectors corresponding to the network neurons, and testing the mail type judgment accuracy of the trained mail classifier. The embodiment of the invention can judge the junk mails so as to filter the junk mails.
Owner:UNIV OF ELECTRONICS SCI & TECH OF CHINA

Method for automatically tagging animation scenes for matching through comprehensively utilizing overall color feature and local invariant features

The invention discloses a method for automatically tagging animation scenes for matching through comprehensively utilizing an overall color feature and local invariant features, which aims to improve the tagging accuracy and tagging speed of animation scenes through comprehensively utilizing overall color features and color-invariant-based local invariant features. The technical scheme is as follows: preprocessing a target image (namely, an image to be tagged), calculating an overall color similarity between the target image and images in an animation scene image library, and carrying out color feature filtering on the obtained result; after color feature filtering, extracting a matching image result and the colored scale invariant feature transform (CSIFT) feature of the target image, and calculating an overall color similarity and local color similarities between the matching image result and the CSIFT feature; fusing the overall color similarity and the local color similarities so as to obtain a final total similarity; and carrying out text processing and combination on the tagging information of the images in the matching result so as to obtain the final tagging information of the target image. By using the method provided by the invention, the matching accuracy and matching speed of an animation scene can be improved.
Owner:NAT UNIV OF DEFENSE TECH