1211 results about "Word list" patented technology

System for semantically disambiguating text information

Disclosed is a semantic user interface system that allows text information to be tagged with machine-readable IDs that are associated with concepts for conveying information without any ambiguity or without being hampered by the limitations of human languages. Typically, a plurality of vocabularies are stored across a network, and each vocabulary includes a plurality of machine-readable IDs each corresponding to a concept and at least one keyword corresponding to each machine-readable ID. An input interface accepts text information, selects those machine-readable IDs whose keywords match up with the text information, and returns a list of candidates each corresponding to one of the selected machine-readable IDs and including a corresponding description. The machine-readable IDs can carry information in the form of concepts without any ambiguity as opposed to text information. This system can be applied to web and database searches, publishing messages to selected subscribers, interfacing of applications software, machine translations, etc.
Owner:SARKAR
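
The matching step in the abstract above (text in, candidate machine-readable IDs with descriptions out) can be sketched roughly as follows. This is a minimal illustration, not the patented system: the `Concept` class, the ID format, and the toy vocabulary are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    concept_id: str        # machine-readable ID (hypothetical format)
    description: str       # human-readable description shown to the user
    keywords: frozenset    # keywords that point at this concept


# Toy vocabulary; IDs, descriptions and keywords are invented for illustration.
VOCABULARY = [
    Concept("C001", "Java, the programming language", frozenset({"java"})),
    Concept("C002", "Java, the Indonesian island", frozenset({"java"})),
    Concept("C003", "Python, the programming language", frozenset({"python"})),
]


def candidates(text, vocabulary=VOCABULARY):
    """Return (concept_id, description) for each concept with a keyword in the text."""
    tokens = set(text.lower().split())
    return [(c.concept_id, c.description) for c in vocabulary if c.keywords & tokens]
```

Calling `candidates("I love Java")` returns both `C001` and `C002`, leaving the user to pick the intended concept; once an ID is chosen, the ambiguity of the keyword no longer matters.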

Systems and Methods for an Automated Personalized Dictionary Generator for Portable Devices

Inactive | US20090306969A1 | Improvement in candidate quality | Reduction in required keystroke | Natural language data processing | Special data processing applications | Personalization | Word list
A system and method for automated dictionary population is provided to facilitate the entry of textual material in dictionaries for enhancing word prediction. The automated dictionary population system is useful in association with a mobile device including at least one dictionary which includes entries. The device receives a communication which is parsed and textual data extracted. The text is compared to the entries of the dictionaries to identify new words. Statistical information for the parsed words, including word usage frequency, recency, or likelihood of use, is generated. Profanities may be processed by identifying profanities, modifying the profanities, and asking the user to provide feedback. Phrases are identified by phrase markers. Lastly, the new words are stored in a supplementary word list as single words or by linking the words of the identified phrases to preserve any phrase relationships. Likewise, the statistical information may be stored.
Owner:ZI CORPORATION OF CANADA INC
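
A rough sketch of the core loop described in this abstract: parse an incoming message, compare its words against the existing dictionary, and record new words with a usage count in a supplementary word list. The function names and the simple regex tokenizer are assumptions for illustration; profanity handling, recency statistics, and phrase linking are left out.

```python
import re
from collections import Counter


def extract_new_words(message, dictionary):
    """Count the words in `message` that are absent from `dictionary`."""
    words = re.findall(r"[a-z']+", message.lower())
    return Counter(w for w in words if w not in dictionary)


def update_supplementary(supplementary, message, dictionary):
    """Merge newly seen words and their usage counts into the supplementary list."""
    supplementary.update(extract_new_words(message, dictionary))
    return supplementary
```

Using a `Counter` as the supplementary list keeps the usage frequency next to each word, so a word predictor could rank frequent new words first.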

Personal message service with enhanced text to speech synthesis

A server in a network gathers textual information, such as news items, E-mail and the like. From that information, the server develops or identifies messages for use by individual subscribers. The same server that accumulates the text messages or another server in the network converts the textual information in each message to a sequence of speech synthesizer instructions. The converted messages, containing the sequences of speech synthesizer instructions, are transmitted to each identified subscriber's terminal device. A synthesizer in the terminal generates an audio waveform signal, representing the speech information, in response to the instructions. In the preferred embodiment, the terminals utilize concatenative type speech synthesizers, each of which has an associated vocabulary of stored fundamental sound samples. The instructions identify the sound samples, in order. The instructions also provide parameters for controlling characteristics of the signal generated during waveform synthesis for each sound sample in each sequence. For example, the instructions may specify the pitch, duration, amplitude, attack envelope and decay envelope for each sample. The division of the text to speech synthesis processing between the server and the terminals places the cost of the front end processing in the server, which is a shared resource. As a result, the hardware and software of the terminal may be relatively simple and inexpensive. Also, it is possible to upgrade the quality of the synthesis by upgrading the server software, without modifying the terminals.
Owner:GOOGLE LLC

Educational spell checker

Method and apparatus for improving a user's ability to spell words correctly are provided. The method comprises: displaying a word list for user selection of a correctly spelled word; and displaying assistance information associated with the correctly spelled word. In one embodiment, the assistance information is selected from: one or more root words, one or more related words, and one or more memorization clues. The apparatus may comprise a signal bearing medium containing instructions of a computer program which, when executed by one or more processors, performs the method of the invention. Another embodiment of the apparatus comprises a computer system comprising one or more processors and memory configured to execute a computer program which, when executed, performs the method of the invention.
Owner:IBM CORP

Contextual prediction of user words and user actions

The invention concerns user entry of information into a system with an input device. A scheme is provided in which an entire word that a user wants to enter is predicted after the user enters a specific symbol, such as a space character. If the user presses an ambiguous key thereafter, rather than accept the prediction, the selection list is reordered. For example, a user enters the phrase “Lets run to school. Better yet, lets drive to ”. After the user presses the space following the second occurrence of the word “to,” the system predicts that the user is going to enter the word “school” based on the context in which the user has entered that word in the past. Should the user enter an ambiguous key after the space, then a word list which contains the word “school” is reordered and other options are made available to the user. The invention can also make predictions on context, such as the person to whom the message is sent, the person writing the message, the day of the week, the time of the week, etc. Other embodiments of the invention contemplate anticipation of user actions, as well as words, such as a user action in connection with menu items, or a user action in connection with form filling.
Owner:TEGIC COMM +1
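
The prediction step in the example above (suggesting “school” after “to”) can be approximated with a table of previously seen word pairs. This is only a sketch under that assumption: the `ContextPredictor` class is hypothetical, and the reordering of the selection list after an ambiguous key press, as well as non-textual context such as recipient or day of week, is not modelled.

```python
import re
from collections import Counter, defaultdict


class ContextPredictor:
    """Predict the next word from the word the user typed just before it."""

    def __init__(self):
        # Maps a previous word to a Counter of the words that followed it.
        self._next = defaultdict(Counter)

    def learn(self, text):
        """Record every adjacent word pair found in `text`."""
        words = re.findall(r"[a-z']+", text.lower())
        for prev, nxt in zip(words, words[1:]):
            self._next[prev][nxt] += 1

    def predict(self, prev_word):
        """Return the word seen most often after `prev_word`, or None."""
        counts = self._next.get(prev_word.lower())
        if not counts:
            return None
        return counts.most_common(1)[0][0]
```

After `learn("Lets run to school.")`, a call to `predict("to")` returns `"school"`, which mirrors the scenario in the abstract.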

Method and system for learning linguistically valid word pronunciations from acoustic data

A computerized pronunciation system is provided for generating pronunciations for words and storing the pronunciations in a pronunciation dictionary. The system includes a word list including at least one word; transcribed acoustic data including at least one waveform for the word and transcribed text associated with the waveform; a pronunciation-learning module configured to accept as input the word list and the transcribed acoustic data, the pronunciation-learning module including: sets of initial pronunciations of the word, a scoring module configured to score pronunciations and to generate phone probabilities, and a set of alternate pronunciations of the word, wherein the set of alternate pronunciations includes a highest-scoring set of initial pronunciations with a highest-scoring substitute phone substituted for a lowest-probability phone; and a pronunciation dictionary configured to receive the highest-scoring set of initial pronunciations and the set of alternate pronunciations.
Owner:NUANCE COMM INC

Acronym extraction system and method of identifying acronyms and extracting corresponding expansions from text

An acronym expansion system of the present invention receives electronic documents and extracts acronyms and their corresponding expansions. A part-of-speech tagger decomposes text into string tokens or words and tags them with their part-of-speech, while an acronym identifier determines whether a word is a potential acronym based on various conditions. An expansion identifier retrieves lists of words preceding and following a potential acronym to search for the expansion. The resulting word lists are examined sequentially to identify and retrieve an expansion for the potential acronym. An expansion extractor receives the potential acronym and a processed word list to retrieve the expansion of the potential acronym from that list. The extractor may utilize information from prior search iterations, and verifies an extracted expansion against a set of rules to remove spurious expansions.
Owner:PERATON INC
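
The idea of scanning the words that precede a potential acronym for its expansion can be illustrated with a deliberately naive heuristic. The sketch below assumes the acronym appears in parentheses right after its expansion and that each letter matches the initial of one preceding word; the patented system additionally uses part-of-speech tagging, following-word lists, and validation rules, none of which are reproduced here. `find_expansion` is a hypothetical name.

```python
import re


def find_expansion(text, acronym):
    """Return the words before "(ACRONYM)" whose initials spell it, else None."""
    match = re.search(r"\(" + re.escape(acronym) + r"\)", text)
    if not match:
        return None
    # Word list preceding the potential acronym.
    preceding = re.findall(r"[A-Za-z]+", text[:match.start()])
    # Candidate window: one preceding word per acronym letter.
    window = preceding[-len(acronym):]
    initials = "".join(word[0].upper() for word in window)
    return " ".join(window) if initials == acronym.upper() else None
```

For the sentence `"We use natural language processing (NLP) daily."`, `find_expansion(..., "NLP")` yields `"natural language processing"`; a window whose initials do not match is rejected, which is a crude stand-in for the rule-based removal of spurious expansions.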

Method and apparatus utilizing voice input to resolve ambiguous manually entered text input

From a text entry tool, a digital data processing device receives inherently ambiguous user input. Independent of any other user input, the device interprets the received user input against a vocabulary to yield candidates such as words (of which the user input forms the entire word or part such as a root, stem, syllable, affix), or phrases having the user input as one word. The device displays the candidates and applies speech recognition to spoken user input. If the recognized speech comprises one of the candidates, that candidate is selected. If the recognized speech forms an extension of a candidate, the extended candidate is selected. If the recognized speech comprises other input, various other actions are taken.
Owner:TEGIC COMM

Dynamic information object cache approach useful in a vocabulary retrieval system

A concept cache useful in a vocabulary management system stores references to individual information objects that can be retrieved and dynamically assembled into electronic documents. Information objects are organized in one or more hierarchical trees, and references to nodes in the trees are cached. A query processor receives a cache query from a delivery engine that is attempting to dynamically construct an electronic document with content that matches the query. For example, a common Web site query contains a concept and an information type. The cache is searched to identify one or more rows that match the query concept and the query information type. An intersection of the rows is determined, yielding a result set of rows. Index pointers in the rows of the result set lead to stored information objects, which are passed to the delivery engine. The delivery engine assembles the electronic document using the information objects. Unlike past approaches that cache static pages, rapid delivery of dynamic pages is facilitated. Vocabularies and relationships are cached with their references to other objects, as needed, facilitating speed of execution of both the logic of constructing a document and in finding the appropriate cached version of an information object.
Owner:CISCO TECH INC

Method and system for translating user keywords into semantic queries based on a domain vocabulary

The embodiments of the present invention provide a computer-implemented method and system for translating user keywords into semantic queries based on a domain vocabulary. The system receives the user keywords and searches for the corresponding concepts. The concepts are transformed into a connected graph. The user keywords are translated into precise access paths based on the information relationships described in conceptual entity relationship models, and these paths are then converted into logic-based queries. The system bridges the semantic gap between user keywords and logic-based structured queries. It enables users to interact with the semantic system by articulating the information in a structured query language. It improves the relevance of search results by incorporating semantic technology to drive the mechanics of the search solution.
Owner:INFOSYS LTD

System and method for translating from a source language to at least one target language utilizing a community of contributors

A method and system are provided for translating terms from a source language to a target language utilizing a community of contributors. The source language terms are stored in an active glossary (400), the translation of which may be governed by an administrator. A community of contributors suggests translations for terms in the active glossary (400). A moderator selected by the administrator moderates translation of the terms in the active glossary (400) into the target language. Accordingly, the moderator may, in the exercise of his or her judgment, lock a particular suggested translation, making it the final translation for a term in the source language. Upon satisfaction of some predetermined exit criteria, e.g., a time deadline or completion threshold, the active glossary (400) is locked and all of the final translations for terms in the source glossary selected by the moderator are then stored in a localized glossary (500).
Owner:MICROSOFT TECH LICENSING LLC

Smart training and smart scoring in SD speech recognition system with user defined vocabulary

In a speech training and recognition system, the current invention detects similar-sounding entries to the vocabulary and warns the user about them, while still permitting entry of such confusingly similar terms, which are marked along with the stored similar terms to identify the similar words. In addition, the states in similar words are weighted to apply more emphasis to the differences between similar words than to their similarities. Another aspect of the current invention is to use a modified scoring algorithm to improve the recognition performance in the case where confusing entries were made to the vocabulary despite the warning. Yet another aspect of the current invention is to detect and warn the user about potential problems with new entries, such as short words and entries of two or more words with long silence periods between the words. Finally, the current invention also includes alerting the user about the dissimilarity of the multiple tokens of the same vocabulary item in the case of multiple-token training.
Owner:WIAV SOLUTIONS LLC

System of interactive dictionary

System for building one's own interactive dictionary and/or thesaurus of words, terms, phrases, etc. in one or more languages. The system recognizes that a user may want to build such data on a computer, organized by well-defined classifications, according to personal vocabulary interests in one or more languages.
Owner:GORADIA GAUTAM DHARAMDAS

System and method of providing autocomplete recommended word which interoperate with plurality of languages

A system and method of providing an autocomplete recommended word are disclosed. They classify a recommended word list according to indexes of various languages, store the recommended word list for each index, extract a corresponding autocomplete recommended word according to a user query and a setting mode received from the user's web browser, and provide the user with that autocomplete recommended word, thereby proposing a suitable recommended word for the user query.
Owner:NHN CORP
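
One way to picture the per-language index described above is a dictionary of sorted word lists keyed by language mode, searched by prefix. The sketch below assumes that layout; the `INDEX` contents, the mode codes, and the `autocomplete` function are illustrative only and are not taken from the patent.

```python
from bisect import bisect_left

# Recommended word lists stored per language index (toy data, kept sorted).
INDEX = {
    "en": sorted(["word", "world", "worm"]),
    "ko": sorted(["단어", "단위"]),
}


def autocomplete(query, mode, index=INDEX, limit=5):
    """Return up to `limit` recommended words in `mode` that start with `query`."""
    words = index.get(mode, [])
    start = bisect_left(words, query)   # first position where the prefix could occur
    matches = []
    for word in words[start:]:
        if not word.startswith(query):
            break                       # sorted list: no later word can match
        matches.append(word)
        if len(matches) == limit:
            break
    return matches
```

Keeping each list sorted lets the lookup jump straight to the first possible match with a binary search instead of scanning every recommended word.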

Method, system, and computer program product for storing, managing and using knowledge expressible as, and organized in accordance with, a natural language

A method, system, and computer program product for storing and managing a knowledge profile are described. The knowledge is stored in knowledge units representative of unconstrained natural language (NL). Any given knowledge unit is associatable with at least one other knowledge unit with the given knowledge unit being a context knowledge unit, and the at least one other knowledge unit being a detail knowledge unit of the associated context knowledge unit, and such that every given context knowledge unit that has at least one associated detail knowledge unit satisfies a NL relationship there-between that corresponds to one of the NL-expressible forms of the NL word "have". The profile includes a core set of knowledge units for a core vocabulary of words, at least some of which are associated with knowledge units to provide a basic meaning of the associated words. The profile further includes a core set of knowledge units for core processing and core parsing NL-expressible knowledge. The knowledge units are arranged in accordance with a predefined structure that reflects context-detail relationships and that is dynamically extensible to include other knowledge units during run-time; and the placement and relationships of knowledge units within the predefined structure further reflect semantic interpretations of the knowledge units and support algorithmic reasoning about the knowledge in the profile. In certain embodiments, the profile includes NL class structures to form knowledge units to represent NL words and phrases, and the profile includes NL word class structures to form knowledge units to represent NL words.
Owner:GENSYM CORP

Word extraction method and system for use in word-breaking

A method, computer readable medium and system are provided which collect new words for addition to a lexicon for an agglutinative language. Sentences in the agglutinative language are retrieved from documents, for example from web pages. New word candidate character strings are identified in the retrieved sentences. The identified new word candidate character strings are filtered using a combination of a plurality of statistical criteria to generate a new words list. Words from the new words list are added to the lexicon.
Owner:MICROSOFT TECH LICENSING LLC

Process for enhancing queries for information retrieval

A process for enhancing queries for information retrieval automatically finds the preferred, first-ranked matching term usage subject area (“TUSA”) from a prior query. The process automatically finds alternative TUSAs for the prior query, ranked by degree of match or preference, and provides an option to switch among the alternative TUSAs. A TUSA for the query must be passively accepted or actively selected from a presented list based on the prior query. Using means prepared in advance from data sets of messages collected for each TUSA and from general vocabulary, the process also ranks and presents to the user alternative and additional query terms and phrases reflecting specificity and relevance to the query and the TUSA. Significantly relevant terms and phrases are presented for query refinement and ranked by relevance, permitting the user to select and deselect query terms and effect a new search based on the enhanced query.
Owner:JOLLY SEVEN SERIES 70 OF ALLIED SECURITY TRUST I

Method and apparatus for digital media management, retrieval, and collaboration

The software incorporates a glossary management tool that makes it easy for each client to customize terminology to the needs of a particular business. With this tool, termed a glossary manager, a company can customize a number of feature names in the system to provide a more familiar context for their users. A system administrator can also customize the manner in which “thumbnail” or “preview” images are presented. The system performs clustering on search queries, and searches media records multi-modally, using two or more approaches such as image searching and text searching. An administrator can tune search parameters. Two or more streams of metadata may be aligned and correlated with a media file, facilitating later searching. The system evaluates itself. It folds popularity information into rankings of search results.
Owner:BEN GRP INC

Data search method and data search system

The invention discloses a data search method and a data search system. The data search method comprises the following steps: pre-extracting a key word list, obtaining query results corresponding to key words in the key word list through visiting an external data source server, and associatively storing the key words and the query results corresponding to the key words in a cache database; obtaining a search request with a search word sent by a client, distributing the search request to the cache database, searching the key word matched with the search word and the query result corresponding to the key word according to a preset matching rule in the cache database; and sending the query result corresponding to the key word to the client. According to the data search method and the data search system, the problem of excessively long waiting time of a user caused by application of a complex algorithm to completion of a data matching process when an information database and an index database are configured at the same time in the prior art can be solved, and the benefit of quickly searching matched data according to the preset cache database and the matching rule can be obtained.
Owner:BEIJING QIHOO TECH CO LTD +1
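
The two phases in this abstract, pre-populating a cache from a keyword list and then answering search requests from that cache, can be sketched as below. The `fetch` callback stands in for the external data source server, and the matching rule (exact match first, then keyword contained in the search word) is an assumption, since the abstract only speaks of a "preset matching rule".

```python
def build_cache(keywords, fetch):
    """Query the external source once per keyword and store the results."""
    return {kw.lower(): fetch(kw) for kw in keywords}


def search(cache, search_word):
    """Return the cached result for the first keyword matching `search_word`.

    Matching rule (assumed): exact match, else a keyword contained in the search word.
    """
    word = search_word.lower()
    if word in cache:
        return cache[word]
    for keyword, result in cache.items():
        if keyword in word:
            return result
    return None
```

Because every lookup is served from the pre-built dictionary, no request to the external source happens at search time, which is the latency benefit the abstract claims.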

Expression input method and apparatus

The invention discloses an emoticon input method and device. The method includes the steps of: determining a word input by a user, and listing preset emoticons corresponding to the word; and receiving an emoticon selection instruction, determining the emoticon selected by the user, and outputting it. With the emoticon input method and device of the present invention, emoticons can be input conveniently and quickly. The emoticons corresponding to a word can be determined according to word-to-emoticon correspondence settings entered by the user, or according to the user's input records, so that an emoticon that meets the user's needs or habits can be input directly.
Owner:TENCENT TECH (SHENZHEN) CO LTD

Implementation method for fusing network question and answer system based on multi-attention mechanism

The invention discloses an implementation method of a fusion network question and answer system based on a multi-attention mechanism, which comprises the following steps: constructing a question and answer system network model; preprocessing an original data set to obtain a standby data set, and performing text length distribution analysis; representing the text in the standby data set as one-hot vectors, training the one-hot word vectors with a CBOW model, and forming a word2vec word list; adjusting the sequence length of each sentence in the text, and adding a sentence end mark; training the word2vec vectors with an ELMO language model to obtain ELMO word vectors; encoding the ELMO vectors to obtain sentence vectors; performing coarse-grained and fine-grained attention on the sentence vectors respectively to obtain memory vectors and attention vectors based on each word; splicing the vectors to obtain representation vectors based on words and sentences; and decoding the representation vectors to generate an answer to the question sentence. According to the method, the representation ability of sentences is improved through an ELMO language model; and various attention mechanisms are fused, so that the decision making accuracy of the system is improved, and the interpretability of the system is enhanced.
Owner:GUANGDONG UNIV OF TECH

Method and apparatus for dynamic adaptation of a large vocabulary speech recognition system and for use of constraints from a database in a large vocabulary speech recognition system

The lexical network of a large-vocabulary speech recognition system is structured to effectuate the rapid and efficient addition of words to the system's active vocabulary. The lexical network is structured to include Phonetic Constraint Nodes, which organize the inter-word phonetic information in the network, and Word Class Nodes which organize the syntactic semantic information in the network. Network fragments, corresponding to phonetic pronunciations and labeled to specify permitted interconnections to each other and to phonetic constraint nodes, are precompiled to facilitate the rapid generation of pronunciations for new words and thereby enhance the rapid addition of words to the vocabulary even during speech recognition. Functions defined in accordance with linguistic constraints may be utilized during recognition. Different language models and different vocabularies for different portions of a discourse may also be invoked depending, in part, on the discourse history.
Owner:SPEECHWORKS INT

Multi-language-pair neural network machine translation method and system

The invention belongs to the technical field of computer software and discloses a multi-language-pair neural network machine translation method and system. A plurality of bilingual parallel corpora of the same language system are mapped to the same high-dimensional vector space after byte pair encoding, so that multiple languages share the same semantic space, the size of the word list is reduced, model parameters are reduced, and convergence of the model is accelerated. Words of the same language family lie in the same vector space, so more information can be learned mutually, information that cannot be learned from any single bilingual parallel corpus can be learned, and the quality of the word vectors is improved. The machine translation system can be used for translation in language directions that have no direct bilingual parallel corpora, and the translation quality in directions with scarce parallel corpora is greatly improved through mutual information learning. Meanwhile, because the same model also serves translation directions with low utilization, server occupation is reduced and server utilization is increased.
Owner:GLOBAL TONE COMM TECH

Voice wakeup method and voice wakeup device based on artificial intelligence

The invention proposes a voice wakeup method and a voice wakeup device based on artificial intelligence. The voice wakeup method comprises the following steps: acquiring pronunciation information corresponding to a custom wakeup word; acquiring approximate pronunciation information corresponding to the pronunciation information; and building a wakeup word recognition network according to a preset spam word list, the pronunciation information and the approximate pronunciation information, so as to recognize voice input by a user according to the wakeup word recognition network and determine from the recognition result whether a wakeup operation needs to be performed. According to the embodiments of the invention, a wakeup word recognition network can be built dynamically for different custom wakeup words, so that the accuracy of wakeup is enhanced effectively, the rate of false alarm is reduced, and the wakeup efficiency is improved. The device has a smaller memory footprint and low power consumption.
Owner:BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD