3065 results about "Target text" patented technology

A target text (TT) is a translated text written in the intended target language, which is the result of a translation from a given source text. According to Jeremy Munday's definition of translation, "the process of translation between two different written languages involves the changing of an original written text (the source text or ST) in the original verbal language (the source language or SL) into a written text (the target text or TT) in a different verbal language (the target language or TL)". The terms 'source text' and 'target text' are preferred over 'original' and 'translation' because they do not carry the same positive vs. negative value judgment.

Confidence-driven rewriting of source texts for improved translation

A method for rewriting source text includes receiving source text including a source text string in a first natural language. The source text string is translated with a machine translation system to generate a first target text string in a second natural language. A translation confidence for the source text string is computed, based on the first target text string. At least one alternative text string is generated, where possible, in the first natural language by automatically rewriting the source string. Each alternative string is translated to generate a second target text string in the second natural language. A translation confidence is computed for the alternative text string based on the second target string. Based on the computed translation confidences, one of the alternative text strings may be selected as a candidate replacement for the source text string and may be proposed to a user on a graphical user interface.
Owner:XEROX CORP
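
The rewrite-and-compare loop in the abstract above can be sketched as follows. This is a minimal illustration, not the patented method: the word-for-word `translate`, the coverage-based `confidence`, and the `paraphrases` table are all hypothetical stand-ins for a real machine translation system, confidence estimator, and rewriting component.

```python
def translate(text, lexicon):
    """Toy word-for-word translation (assumption); unknown words pass through."""
    return " ".join(lexicon.get(word, word) for word in text.split())


def confidence(target_text, lexicon):
    """Toy confidence computed from the target string: the share of
    target tokens that belong to the known target vocabulary."""
    target_vocab = set(lexicon.values())
    tokens = target_text.split()
    if not tokens:
        return 0.0
    return sum(tok in target_vocab for tok in tokens) / len(tokens)


def rewrites(source, paraphrases):
    """Generate alternative source strings by swapping one word at a time."""
    words = source.split()
    alternatives = []
    for i, word in enumerate(words):
        for alt in paraphrases.get(word, []):
            alternatives.append(" ".join(words[:i] + [alt] + words[i + 1:]))
    return alternatives


def propose_replacement(source, lexicon, paraphrases):
    """Return (best alternative or None, its confidence).

    An alternative is proposed only if its translation scores higher
    than the translation of the original source string.
    """
    best = None
    best_conf = confidence(translate(source, lexicon), lexicon)
    for alt in rewrites(source, paraphrases):
        alt_conf = confidence(translate(alt, lexicon), lexicon)
        if alt_conf > best_conf:
            best, best_conf = alt, alt_conf
    return best, best_conf
```

In a real system the selected alternative would be shown to the user as a suggestion rather than applied automatically, as the abstract describes.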

Method and apparatus for aligning texts

A method and apparatus for aligning texts. The method includes acquiring a target text and a reference text and aligning the target text and the reference text at word level based on phoneme similarity. The method can be applied to automatically archiving a multimedia resource and a method of automatically searching a multimedia resource.
Owner:IBM CORP

Hybrid adaptation of named entity recognition

A machine translation method includes receiving a source text string and identifying any named entities. The identified named entities may be processed to exclude common nouns and function words. Features are extracted from the source text string relating to the identified named entities. Based on the extracted features, a protocol is selected for translating the source text string. A first translation protocol includes forming a reduced source string from the source text string in which the named entity is replaced by a placeholder, translating the reduced source string by machine translation to generate a translated reduced target string, while processing the named entity separately to be incorporated into the translated reduced target string. A second translation protocol includes translating the source text string by machine translation, without replacing the named entity with the placeholder. The target text string produced by the selected protocol is output.
Owner:XEROX CORP

Aural similarity measuring system for text

The aural similarity measuring system and method provides a measure of the aural similarity between a target text (10) and one or more reference texts (11). Both the target text (10) and the reference texts (11) are converted into a string of phonemes (15) and then one or other of the phoneme strings are adjusted (16) so that both are equal in length. The phoneme strings are compared (12) and a score generated representative of the degree of similarity of the two phoneme strings. Finally, where there is a plurality of reference texts the similarity scores for each of the reference texts are ranked (13). With this aural similarity measuring system the analysis is automated thereby reducing risks of errors and omissions. Moreover, the system provides an objective measure of aural similarity enabling consistency of comparison in results and reproducibility of results.
Owner:MONGOOSE VENTURES
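
The convert, equalize, compare, and rank steps described above can be sketched roughly as below. The `to_phonemes` function is a crude hypothetical stand-in (a few letter substitutions) for a real grapheme-to-phoneme converter, and the position-by-position score is only one possible similarity measure; the abstract does not specify either.

```python
def to_phonemes(text):
    """Hypothetical phoneme conversion: letters only, with a few
    sound-alike spellings collapsed. Not a real phonetic transcription."""
    s = "".join(ch for ch in text.lower() if ch.isalpha())
    for src, dst in (("ph", "f"), ("ck", "k"), ("c", "k"), ("z", "s")):
        s = s.replace(src, dst)
    return list(s)


def equalize(a, b, pad="_"):
    """Pad the shorter phoneme string so both have equal length."""
    n = max(len(a), len(b))
    return a + [pad] * (n - len(a)), b + [pad] * (n - len(b))


def aural_score(target, reference):
    """Fraction of positions at which the two phoneme strings agree."""
    a, b = equalize(to_phonemes(target), to_phonemes(reference))
    if not a:
        return 0.0
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


def rank_references(target, references):
    """Rank reference texts from most to least similar to the target."""
    scored = [(ref, aural_score(target, ref)) for ref in references]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```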

Lexical and phrasal feature domain adaptation in statistical machine translation

A translation method is adapted to a domain of interest. The method includes receiving a source text string comprising a sequence of source words in a source language and generating a set of candidate translations of the source text string, each candidate translation comprising a sequence of target words in a target language. An optimal translation is identified from the set of candidate translations as a function of at least one domain-adapted feature computed based on bilingual probabilities and monolingual probabilities. Each bilingual probability is for a source text fragment and a target text fragment of the source text string and candidate translation respectively. The bilingual probabilities are estimated on an out-of-domain parallel corpus that includes source and target strings. The monolingual probabilities for text fragments of one of the source text string and candidate translation are estimated on an in-domain monolingual corpus.
Owner:XEROX CORP

Synthesis by Generation and Concatenation of Multi-Form Segments

A speech synthesis system and method is described. A speech segment database references speech segments having various different speech representational structures. A speech segment selector selects from the speech segment database a sequence of speech segment candidates corresponding to a target text. A speech segment sequencer generates from the speech segment candidates sequenced speech segments corresponding to the target text. A speech segment synthesizer combines the selected sequenced speech segments to produce a synthesized speech signal output corresponding to the target text.
Owner:CERENCE OPERATING CO

Translation quality quantifying apparatus and method

A system for automating the quality evaluation of a translation. The system may include a computer having a processor and memory device operably connected to one another. A source text in a first language may be stored within the memory device. A target text comprising a translation of the source text into a second language may also be stored within the memory device. Additionally, a plurality of executables may be stored on the memory device and be configured to, when executed by the processor, independently identify a test sample comprising one or more blocks, each comprising a matched set having a source portion selected from the source text and a corresponding target portion selected from the target text.
Owner:MULTILING CORP

Means and Method for Adapted Language Translation

This invention relates to a means and a method for translating source text into a target text where the context information is taken into consideration. A source text unit is defined around a translation unit which is to be translated. This source text unit is mapped onto a bilingual sublanguage space where the bilingual sublanguage space comprises a source sublanguage space and mappings to the target language. The translation is adapted to the source text unit, thereby considering contextual information.
Owner:NAT RES COUNCIL OF CANADA

Hybrid machine translation

A system and method for hybrid machine translation is based on a statistical transfer approach using statistical and linguistic features. The system and method may be used to translate from one language into another. The system may include at least one database, a rule based translation module, a statistical translation module and a hybrid machine translation engine. The database(s) store source and target text, rule based language models and statistical language models. The rule based translation module translates source text based on the rule based language models. The statistical translation module translates source text based on the statistical language models. A hybrid machine translation engine, having a maximum entropy algorithm, is coupled to the rule based translation module and the statistical translation module and is capable of translating source text into target text based on the rule based and statistical language models.
Owner:EBAY INC

Machine translation using non-contiguous fragments of text

A machine translation method for translating source text from a first language to target text in a second language includes receiving the source text in the first language and accessing a library of bi-fragments, each of the bi-fragments including a text fragment from the first language and a text fragment from the second language, at least some of the bi-fragments comprising non-contiguous bi-fragments in which at least one of the text fragment from the first language and the text fragment from the second language comprises a non-contiguous fragment.
Owner:XEROX CORP

Method and system for compression indexing and efficient proximity search of text data

A system and method of compression indexing and efficient proximity search of text data permits high speed search featuring ranking the relevance of search results according to closeness of desired terms within each portion of text found. The system includes (a) preparing target text, (b) creating a “compression index ebook”, (c) browsing in a compression index ebook, and (d) searching in a compression index ebook. To create the compression index, the method includes the steps of selecting target text, identifying tokens, such as words and punctuation strings, wherein each of the tokens has a frequency. The frequencies of each token are counted. Tokens are ranked from highest frequency to lowest frequency. The frequencies are compressed. The next step is assigning positions to each token frequency and compressing the positions to form a compression index ebook, which is stored in random access memory to eliminate disk seeks during browsing and searching.
Owner:MARPEX
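
The tokenizing, frequency ranking, and position steps above can be illustrated with the short sketch below. The abstract does not say how frequencies or positions are compressed, so the gap (delta) encoding here is an assumption chosen only because it is a common way to shrink sorted position lists; `min_distance` is likewise a hypothetical proximity measure.

```python
import re
from collections import Counter

# Tokens are words or runs of punctuation, as the abstract describes.
TOKEN_RE = re.compile(r"\w+|[^\w\s]+")


def rank_tokens(text):
    """Return the token sequence and (token, frequency) pairs, most frequent first."""
    tokens = TOKEN_RE.findall(text.lower())
    counts = Counter(tokens)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return tokens, ranked


def build_positions(tokens):
    """Map each token to the sorted list of positions where it occurs."""
    positions = {}
    for i, tok in enumerate(tokens):
        positions.setdefault(tok, []).append(i)
    return positions


def delta_encode(sorted_positions):
    """Assumed compression step: store gaps between positions, not absolute values."""
    previous = 0
    gaps = []
    for p in sorted_positions:
        gaps.append(p - previous)
        previous = p
    return gaps


def min_distance(positions, a, b):
    """Smallest token distance between any occurrence of a and of b, or None."""
    pa, pb = positions.get(a, []), positions.get(b, [])
    if not pa or not pb:
        return None
    return min(abs(i - j) for i in pa for j in pb)
```

A smaller `min_distance` between query terms could then be used to rank a passage as more relevant, in the spirit of the proximity ranking described above.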

Chinese entity relation extraction method based on keyword and verb dependency

The invention discloses a Chinese entity relation extraction method based on keyword and verb dependency. Taking large-scale unstructured free text as the target text, the text is first segmented and keywords are extracted to form a text keyword thesaurus. The text is then subjected to sentence segmentation, word segmentation, part-of-speech tagging, named entity recognition, and dependency parsing, and an entity corpus is constructed by combining the named entity thesaurus and the keyword thesaurus. According to the characteristics of Chinese sentence structure, syntactic structure, and the dependency between words, entity-relation syntactic rules are constructed from verbs, and each sentence in the text is then matched against these rules. Finally, relation triples are output and the set of text relation triples is obtained. The invention can make entity relation extraction from large-scale Chinese text more effective and more accurate.
Owner:SHANGHAI DATATOM INFORMATION TECH CO LTD

Computer-assisted natural language translation

A computer implemented method of translating source material in a source natural language into a target natural language includes receiving a first data input which is a first part of a sub-segment of a translation of the source material from the source natural language into the target natural language, identifying a selectable target text sub-segment in the target natural language associated with the received first data input, and outputting the selectable target text sub-segment. The selectable target text sub-segment is extracted from a corpus of previously translated text segment pairs, each text segment pair having a source text segment in the source natural language and a corresponding translated text segment in the target natural language.
Owner:SDL LTD

Semantic logic processing method and system

The invention discloses a semantic logic processing method and system. The method comprises the steps of obtaining information to be subjected to semantic analysis; recognizing the information to be subjected to the semantic analysis, and converting the information to be subjected to the semantic analysis into target text information; preprocessing the target text information, generating entity tags corresponding to entity words in the target text information, and adding the entity tags to the target text information to generate first text information; segmenting the first text information to obtain at least one sentence; processing the sentences obtained after segmentation to obtain the intention type, the intention logic relation and the semantic slot value of each sentence; analyzing the semantics of the information to be subjected to the semantic analysis based on the intention type, the intention logic relation and the semantic slot value. The semantic logic processing method and system can improve the precision of semantic understanding and user requirement understanding.
Owner:BEIJING QIYI CENTURY SCI & TECH CO LTD

User intention recognition method and device

The invention discloses a user intention recognition method and device. The method includes the following steps: while a first user and a second user converse through an instant messaging tool, a first conversation text sent to the second user by the first user is received; the target text content to be analyzed is determined according to the first conversation text; first behavior data of the operation behavior executed by the first user and related to the second user are obtained from a user behavior database associated with the instant messaging tool; semantic analysis is conducted on the target text content in combination with the first behavior data, and the user intention recognition result is determined. By means of the user intention recognition method and device, the real intention of users can be recognized more accurately.
Owner:ZHEJIANG TMALL TECH CO LTD

Means and a method for training a statistical machine translation system

Existing statistical machine translation systems require a given source language text and an equivalent target language text in order to train a translation system. The invention proposes a computer means and method for training a statistical machine translation system using unilingual source language information.
Owner:NAT RES COUNCIL OF CANADA

Multi-triplet extraction method based on entity-relation joint extraction model

The invention discloses a multi-triplet extraction method based on an entity-relation joint extraction model. The method comprises: segmenting the target text into sentences, and tagging each word in a sentence with its position, its type, and whether it is involved in any relation; establishing the entity-relation joint extraction model; training the model; and extracting triplets with the trained model. The three-part tagging scheme designed by the invention can exclude entities that are unrelated to the target relation during joint extraction. The method can be used to extract multiple triplets, and a model based on it has stronger multi-triplet extraction capability than other models.
Owner:NAT UNIV OF DEFENSE TECH

Animation image driving method and device based on artificial intelligence

The embodiment of the invention discloses an animation image driving method based on artificial intelligence. The method comprises the following steps: collecting media data of the facial expression changes of a speaker while speaking, and determining a first expression base of a first animation image corresponding to the speaker, the first expression base reflecting different expressions of the first animation image. After target text information used for driving a second animation image is determined, acoustic features and target expression parameters corresponding to the target text information are determined according to the target text information, the collected media data and the first expression base. The acoustic features and target expression parameters are then used to drive the second animation image, which has a second expression base. According to the method and the device, the second animation image can simulate, through the acoustic features, the sound of the speaker speaking the target text information, and can make facial expressions that match those of the speaker while speaking, giving the user a vivid sense of presence and immersion. The interaction experience between the user and the animation image is improved.
Owner:TENCENT TECH (SHENZHEN) CO LTD

Character recognition system and method based on combination of neural network and attention mechanism

The invention claims a character recognition system and method based on the combination of a neural network and an attention mechanism. The system comprises a convolutional neural network feature extraction module, which extracts spatial features of the character image. The spatial features extracted by the convolutional neural network are input to a bidirectional long short-term memory network module, which extracts the sequence features of the characters. The extracted feature vectors are semantically encoded, and attention weights are then assigned to the feature vectors through the attention mechanism, so that attention is focused on the feature vectors with higher weights. In the decoding part of the model, the features extracted by attention and the prediction information of the previous time step are used as the inputs of a nested long short-term memory network. The long short-term memory network is used to preserve the temporal characteristics of the feature vectors and to let the attention points of the model change over time. The invention can more accurately detect text areas in natural scenes, and has a good detection effect on small target text and on text with a small tilt angle.
Owner:CHONGQING UNIV OF POSTS & TELECOMM

Method of spell-checking search queries

A computer-implemented method for determining whether a target text-string is correctly spelled is provided. The target text-string is compared to a corpus to determine a set of contexts which each include an occurrence of the target text-string. Using heuristics, each context of the set is characterized based on occurrences in the corpus of the target text-string and a reference text-string. Contexts are characterized as including a correct spelling of the target text-string, an incorrect spelling of the reference text-string, or including an indeterminate usage of the target text-string. A likelihood that the target text-string is a misspelling of the reference text-string is computed as a function of the quantity of contexts including a correct spelling of the target text-string and the quantity of contexts including an incorrect spelling of a reference text-string. In one application, the target text-string is received in a search query, the search executed following a spell-check.
Owner:GOOGLE LLC
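
A rough sketch of the context-counting idea above follows. The abstract does not give the heuristics or the likelihood function, so everything specific here is an assumption: a context is taken to be the (previous word, next word) pair, a context is decided by a simple `ratio` threshold, and the likelihood is the share of decided contexts that point to a misspelling.

```python
from collections import Counter


def contexts(corpus, word):
    """Count the (previous word, next word) contexts in which `word` occurs."""
    found = Counter()
    for sentence in corpus:
        toks = ["<s>"] + sentence.lower().split() + ["</s>"]
        for i in range(1, len(toks) - 1):
            if toks[i] == word:
                found[(toks[i - 1], toks[i + 1])] += 1
    return found


def misspelling_likelihood(corpus, target, reference, ratio=2.0):
    """Estimate how likely `target` is a misspelling of `reference`.

    Each context of the target is characterized as:
      - incorrect: the reference is at least `ratio` times more common there,
      - correct:   the target is at least `ratio` times more common there,
      - indeterminate: otherwise (ignored in the final ratio).
    """
    target_ctx = contexts(corpus, target)
    reference_ctx = contexts(corpus, reference)
    correct = incorrect = 0
    for ctx, t_count in target_ctx.items():
        r_count = reference_ctx.get(ctx, 0)
        if r_count >= ratio * t_count:
            incorrect += 1
        elif t_count >= ratio * r_count:
            correct += 1
    decided = correct + incorrect
    return incorrect / decided if decided else 0.0
```

In the search-query application mentioned above, a high likelihood could trigger a "did you mean" suggestion before the search is executed.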

Automatic question-answer processing method and automatic question-answer system

The invention discloses an automatic question-answer processing method and an automatic question-answer system. The method includes: acquiring question text from question-answer data pairs collected in advance, performing word separation on the question text to obtain the corresponding key words of the question text, and building the index relation between the key words and the question text; when any target question text is received, performing word separation on the target question text to acquire the target key words corresponding to the target question text; according to the index relation between the key words and the question text, determining key words matched with the target key words, and acquiring the question text having an index relation with those key words to serve as the candidate question text; calculating the semantic similarity of the candidate question text and the target question text; determining an answer corresponding to the target question text according to the semantic similarity. The advantage of the method is that the semantic similarity between the target question text and each candidate question text is considered when determining the answer to the target question text, which increases the accuracy of automatic question-answer processing.
Owner:TENCENT TECH (SHENZHEN) CO LTD
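
The index-then-rank flow in the abstract above can be sketched as follows. Whitespace splitting with a small stop-word list stands in for real word separation (Chinese text would need a proper segmenter), and Jaccard overlap of key words stands in for the semantic similarity, which the abstract does not define; both are assumptions.

```python
from collections import defaultdict

# Hypothetical stop-word list used to keep only key words.
STOPWORDS = {"how", "do", "i", "the", "a", "my", "is", "what", "to"}


def keywords(text):
    """Toy word separation: lowercase, strip '?', drop stop words."""
    return {w for w in text.lower().replace("?", "").split() if w not in STOPWORDS}


def build_index(qa_pairs):
    """Inverted index from key word to the indices of questions containing it."""
    index = defaultdict(set)
    for i, (question, _answer) in enumerate(qa_pairs):
        for kw in keywords(question):
            index[kw].add(i)
    return index


def jaccard(a, b):
    """Stand-in similarity: overlap of two key word sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def answer(target_question, qa_pairs, index):
    """Return the answer of the most similar candidate question, or None."""
    target_kw = keywords(target_question)
    candidates = set()
    for kw in target_kw:
        candidates |= index.get(kw, set())
    if not candidates:
        return None
    best = max(
        sorted(candidates),
        key=lambda i: jaccard(target_kw, keywords(qa_pairs[i][0])),
    )
    return qa_pairs[best][1]
```

The index keeps the expensive similarity computation limited to questions that share at least one key word with the target question.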

Text subject recommending method and device

The invention discloses a method and a device for recommending a text theme. The method comprises the following steps: word segmentation is carried out on a target text to obtain target words; the weight of each target word is calculated; and the theme keywords of the target text are selected according to the weights of the target words. The theme keywords obtained by the method predict the theme of the target text well, so a user can judge the usefulness of the text content within a very short time from the theme keywords, which greatly saves the user's time.
Owner:ALIBABA GRP HLDG LTD
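
The segment, weight, and select steps above might look like the sketch below. The abstract does not say how weights are computed, so TF-IDF against a small background collection is used here purely as an assumed weighting; `segment` is a naive whitespace splitter standing in for a real word segmenter.

```python
import math
from collections import Counter


def segment(text):
    """Naive word segmentation: split on whitespace and strip punctuation."""
    words = (w.strip(".,;:!?").lower() for w in text.split())
    return [w for w in words if w]


def theme_keywords(target, background, top_k=3):
    """Return the top_k words of `target` by an assumed TF-IDF weight.

    `background` is a list of other texts used to estimate how common
    each word is in general.
    """
    tf = Counter(segment(target))
    doc_sets = [set(segment(doc)) for doc in background]
    n_docs = len(doc_sets)
    weights = {}
    for word, count in tf.items():
        df = sum(1 for words in doc_sets if word in words)
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0  # smoothed IDF
        weights[word] = count * idf
    # Highest weight first; ties broken alphabetically for determinism.
    ranked = sorted(weights, key=lambda w: (-weights[w], w))
    return ranked[:top_k]
```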

Text line extraction method and device

The invention discloses a text line extraction method and device. The method comprises the following steps: obtaining a sample; detecting characters in a document image to form candidate character boxes that each contain a character; aggregating the candidate character boxes into one or more target text areas, each target text area comprising at least one candidate character box whose characters belong to at least one text line of the document image; and finally extracting each text line in the target text area. Because the candidate character boxes of the document image are aggregated into target text areas and each text line is extracted from a target text area, there is no need to set rules based on prior knowledge such as color and size to define which candidate character boxes can be combined into a text line. The method therefore improves both the accuracy of the text line extraction result and the detection efficiency.
Owner:IFLYTEK CO LTD

Network text segmenting method based on genetic algorithm

The invention discloses a network text segmenting method based on the genetic algorithm, used for segmenting short network texts. The method comprises the following steps: estimating a latent Dirichlet allocation (LDA) model corresponding to a corpus by using a Gibbs sampling method, inferring latent topic information using the model, and representing texts by using the latent topic information; then transforming the text segmenting process into a multi-objective optimization process by using a parallel genetic algorithm, and calculating the coherency of segmented units, the divergence among the segmented units and the fitness functions by using deeper semantic information; and carrying out the genetic iteration of the text segmenting process, and determining whether the segmenting process terminates based on the similarity among multi-iteration results or the upper limit of iterations, to obtain the global optimal solution for segmenting the texts. The invention thereby improves the accuracy of segmenting short network texts.
Owner:NANTONG LONGXIANG ELECTRICAL APPLIANCE EQUIP +1

Entity relationship recognition method and apparatus

The present invention relates to an entity relationship recognition method and apparatus. The method comprises obtaining a statement sequence from a target text in a corpus, and performing named entity recognition and dependency grammar marking on the statement sequence to obtain a marked text sentence; matching and retrieving the marked text sentence on the basis of an entity relationship seed to obtain a training example; replacing the entity relationship seed word in the training example with a predetermined identifier, processing the training example after replacement in combination with the named entity recognition and the dependency grammar marking, and generating a candidate rule; fuzzifying the candidate rule to obtain fuzzy rules; determining whether the fuzzy rules comprise a new rule; and retrieving the corpus according to the fuzzy rules to obtain a seed set when the fuzzy rules comprise the new rule, and using the obtained seed set as an entity relationship recognition result. Manual participation can be effectively reduced, dependence on the calibrated corpus is reduced, a new entity relationship can be found in a timely manner, and the entity relationship recognition method and apparatus adapt to entity relationship mining in different fields.
Owner:LETV HLDG BEIJING CO LTD +1

Statistical machine translation adapted to context

This invention relates to a means and a method for translating source text into a target text where the context information is taken into consideration. A source text unit is defined around a translation unit which is to be translated. This source text unit is mapped onto a bilingual sublanguage space where the bilingual sublanguage space comprises a source sublanguage space and mappings to the target language. The translation is adapted to the source text unit, thereby considering contextual information.
Owner:NAT RES COUNCIL OF CANADA

A multi-triple extraction method based on an entity-relationship joint extraction model

The invention discloses a multi-triple extraction method based on an entity-relationship joint extraction model, which is characterized in that the method comprises the following steps: obtaining text, splitting the target text into clauses, and marking each word in a sentence with its position, type and relation; establishing an entity-relationship joint extraction model; training the entity-relationship joint extraction model; and carrying out triple extraction according to the entity-relationship joint extraction model. The three-part marking scheme designed by the invention can exclude entities that are not related to the target relationship in the process of entity-relationship joint extraction. In addition, the multi-triple extraction method based on the entity-relationship joint extraction model can be used for extracting multiple triples, and the model based on the triple extraction method of the invention has stronger multi-triple extraction ability compared with other models.
Owner:NAT UNIV OF DEFENSE TECH

In-context exact (ICE) matching

Methods, systems and program products are disclosed for determining a matching level of a text lookup segment with a plurality of source texts in a translation memory in terms of context. In particular, the invention determines any exact matches for the lookup segment in the plurality of source texts, and determines, in the case that at least one exact match is determined, that a respective exact match is an in-context exact (ICE) match for the lookup segment in the case that a context of the lookup segment matches that of the respective exact match. The degree of context matching required can be predetermined, and results prioritized. The invention also includes methods, systems and program products for storing a translation pair of source text and target text in a translation memory including context, and the translation memory so formed. The invention ensures that content is translated the same as previously translated content and reduces translator intervention.
Owner:SDL INK
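
The difference between an exact match and an in-context exact (ICE) match can be illustrated with the minimal sketch below. It assumes, for illustration only, that the stored context is the source segment immediately preceding each entry; the abstract leaves the kind and degree of context open. `TMEntry` and `lookup` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TMEntry:
    source: str       # source text segment
    target: str       # stored translation (target text)
    prev_source: str  # assumed context: the preceding source segment


def lookup(memory, segment, prev_segment):
    """Return (match level, target) for a lookup segment.

    "ICE"   : same source text and same context,
    "exact" : same source text, different context,
    "none"  : no stored entry with that source text.
    """
    exact = [entry for entry in memory if entry.source == segment]
    for entry in exact:
        if entry.prev_source == prev_segment:
            return "ICE", entry.target
    if exact:
        return "exact", exact[0].target
    return "none", None
```

Prioritizing ICE matches over plain exact matches is what lets previously translated content be reused with less review by the translator.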