
1436 results about "Part of speech" patented technology

In traditional grammar, a part of speech (abbreviated form: PoS or POS) is a category of words (or, more generally, of lexical items) that have similar grammatical properties. Words that are assigned to the same part of speech generally display similar syntactic behavior—they play similar roles within the grammatical structure of sentences—and sometimes similar morphology in that they undergo inflection for similar properties.

Ontology-based parser for natural language processing

An ontology-based parser incorporates both a system and method for converting natural-language text into predicate-argument format that can be easily used by a variety of applications, including search engines, summarization applications, categorization applications, and word processors. The ontology-based parser contains functional components for receiving documents in a plurality of formats, tokenizing them into instances of concepts from an ontology, and assembling the resulting concepts into predicates. The ontological parser has two major functional elements, a sentence lexer and a parser. The sentence lexer takes a sentence and converts it into a sequence of ontological entities that are tagged with part-of-speech information. The parser converts the sequence of ontological entities into predicate structures using a two-stage process that analyzes the grammatical structure of the sentence, and then applies rules to it that bind arguments into predicates.
Owner:LEIDOS

Method and system for generating a document representation

A method, system and computer program product for generating a document representation are disclosed. The system includes a server and a client computer, and the method involves: receiving into memory a resource containing at least one sentence of text; producing a tree comprising tree elements indicating parts-of-speech and grammatical relations between the tree elements; producing semantic structures each having three tree elements to represent a simple clause (subject-predicate-object); and storing a semantic network of semantic structures and connections therebetween. The semantic network may be created from a user-provided root concept. Output representations include concept maps, facts listings, text summaries, tag clouds, indices, and an annotated text. The system interactively modifies semantic networks in response to user feedback, and produces personal semantic networks and document use histories.
Owner:IFWE

Apparatus for providing voice dialogue service and method of operating the same

A speech dialogue service apparatus including: a language analysis module tagging a part of speech (POS) of each respective word included in a sentence recorded in a predetermined text, syntactically analyzing the sentence by classifying a meaning of each respective word, and generating at least one semantic frame corresponding to the sentence according to a result of the syntactical analysis; and a dialogue management module analyzing an intention of the sentence corresponding to the at least one respective semantic frame, and generating a system response corresponding to the sentence intention by selecting a predetermined sentence intention according to whether an action corresponding to the intention of the respective sentence can be performed.
Owner:SAMSUNG ELECTRONICS CO LTD

Handwriting and voice input with automatic correction

A hybrid approach to improve handwriting recognition and voice recognition in data processing systems is disclosed. In one embodiment, a front end is used to recognize strokes, characters and/or phonemes. The front end returns candidates with relative or absolute probabilities of matching to the input. Based on linguistic characteristics of the language (e.g. alphabetical or ideographic language for the words being entered, frequency of words and phrases being used, likely part of speech of the word entered, the morphology of the language, or the context in which the word is entered), a back end combines the candidates determined by the front end from inputs for words to match with known words and the probabilities of the use of such words in the current context.
Owner:TEGIC COMM

System, method, program product, and networking use for recognizing words and their parts of speech in one or more natural languages

A system, method, and computer program are disclosed for recognizing one or more words not listed in a dictionary database. One or more sequences of characters in the word are checked to determine a probability that the word is valid. A prefix removal process removes any prefixes from a word, and obtains information about the removed prefix. A suffix removal process removes any suffixes from the word, and obtains information about the removed suffix. A root process obtains information about a root word from the dictionary database. A combination process then determines if the prefix, the root, and the suffix can be combined into a valid word as defined by one or more combination rules, obtains one or more of the possible parts of speech of the valid word, and stores the parts of speech with the valid word in the dictionary database.
Owner:IBM CORP
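
The prefix removal, suffix removal, root lookup and combination steps in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the patented method: the affix tables, the part-of-speech labels and the `analyze` function are all hypothetical, and the character-sequence probability check mentioned in the abstract is left out.

```python
# Hypothetical affix tables; the entries and POS labels are illustrative only.
PREFIXES = {"un": {"attaches_to": {"ADJ"}}, "re": {"attaches_to": {"VERB"}}}
SUFFIXES = {
    "ness": {"attaches_to": {"ADJ"}, "yields": "NOUN"},
    "er": {"attaches_to": {"VERB"}, "yields": "NOUN"},
}


def analyze(word, dictionary):
    """Return the POS tags of an unlisted word, or an empty set if no analysis is valid.

    `dictionary` maps known words to sets of POS tags. A valid analysis is
    stored back into it, as the abstract describes.
    """
    results = set()
    # Try the word with and without each matching prefix and suffix.
    for prefix in [""] + [p for p in PREFIXES if word.startswith(p)]:
        rest = word[len(prefix):]
        for suffix in [""] + [s for s in SUFFIXES if rest.endswith(s)]:
            if not (prefix or suffix):
                continue  # nothing was stripped, so there is nothing to combine
            root = rest[: len(rest) - len(suffix)] if suffix else rest
            for root_pos in dictionary.get(root, ()):
                # Combination rules: each affix must accept the root's POS.
                if prefix and root_pos not in PREFIXES[prefix]["attaches_to"]:
                    continue
                if suffix:
                    if root_pos not in SUFFIXES[suffix]["attaches_to"]:
                        continue
                    results.add(SUFFIXES[suffix]["yields"])
                else:
                    results.add(root_pos)
    if results:
        dictionary[word] = results
    return results
```

With a dictionary such as `{"kind": {"ADJ"}, "teach": {"VERB"}}`, calling `analyze("unkindness", dictionary)` strips `un` and `ness`, finds the root `kind`, and records the new word as a noun. A form like `unteach` is rejected because the toy rule for `un` only accepts adjectives.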

Method and system for the automatic recognition of deceptive language

A system for identifying deception within a text includes a processor for receiving and processing a text file. The processor includes a deception indicator tag analyzer for inserting into the text file at least one deception indicator tag that identifies a potentially deceptive word or phrase within the text file, and an interpreter for interpreting the at least one deception indicator tag to determine a distribution of potentially deceptive words or phrases within the text file and generating deception likelihood data based upon the density or distribution of potentially deceptive words or phrases within the text file. A method for identifying deception within a text includes the steps of receiving a first text to be analyzed, normalizing the first text to produce a normalized text, inserting into the normalized text at least one part-of-speech tag that identifies a part of speech of a word associated with the part-of-speech tag, inserting into the normalized text at least one syntactic label that identifies a linguistic construction of one or more words associated with the syntactic label, inserting into the normalized text at least one deception indicator tag that identifies a potentially deceptive word or phrase within the normalized text, interpreting the at least one deception indicator tag to determine a distribution of potentially deceptive words or phrases within the normalized text, and generating deception likelihood data based upon the density or frequency of distribution of potentially deceptive words or phrases within the normalized text.
Owner:DECEPTION DISCOVERY TECH

Methods and systems for e-mail topic classification

A method for processing e-mails includes receiving a plurality of e-mails. For each e-mail in the plurality of e-mails, a feature representation is generated for an e-mail based on a set of noun phrases associated with the e-mail. A set of topics associated with the plurality of e-mails is generated based on the feature representation for each e-mail. Sentence structure associated with the e-mail and parts of speech associated with the e-mail may be determined. The parts of speech, including a set of noun phrases associated with the e-mail, may be used to generate the feature representation for the e-mail.
Owner:VERITAS TECH

Part-of-speech tagging using latent analogy

Methods and apparatuses to assign part-of-speech tags to words are described. An input sequence of words is received. A global fabric of a corpus having training sequences of words may be analyzed in a vector space. A global semantic information associated with the input sequence of words may be extracted based on the analyzing. A part-of-speech tag may be assigned to a word of the input sequence based on POS tags from pertinent words in relevant training sequences identified using the global semantic information. The input sequence may be mapped into a vector space. A neighborhood associated with the input sequence may be formed in the vector space wherein the neighborhood represents one or more training sequences that are globally relevant to the input sequence.
Owner:APPLE INC

Linguistically-adapted structural query annotation

A system and method for natural language processing of queries are provided. A lexicon includes text elements that are recognized as being a proper noun when capitalized. A natural language query includes a sequence of text elements including words. The query is processed. The processing includes a preprocessing step, in which part of speech features are assigned to the text elements in the query. This includes identifying, from a lexicon, a text element in the query which starts with a lowercase letter and assigning recapitalization information to the text element in the query, based on the lexicon. This information includes a part of speech feature of the capitalized form of the text element. Then parts of speech for the text elements in the query are disambiguated, which includes applying rules for recapitalizing text elements based on the recapitalization information.
Owner:XEROX CORP
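
The preprocessing step described above, in which a lowercase query token is looked up in a lexicon and annotated with the part of speech of its capitalized form, might look like the sketch below. The lexicon contents, the tag names and the `annotate_recapitalization` function are assumptions made for illustration. The later rule-based disambiguation step is not shown.

```python
# Hypothetical lexicon: lowercase form -> (capitalized form, POS of the capitalized form).
PROPER_NOUN_LEXICON = {
    "paris": ("Paris", "PROPN"),
    "apple": ("Apple", "PROPN"),
}


def annotate_recapitalization(query):
    """Attach recapitalization info to lowercase tokens found in the lexicon.

    Returns a list of (token, info) pairs, where info is None for tokens
    that are not known proper nouns when capitalized.
    """
    annotated = []
    for token in query.split():
        info = None
        if token[:1].islower() and token in PROPER_NOUN_LEXICON:
            capitalized, pos = PROPER_NOUN_LEXICON[token]
            info = {"recapitalized": capitalized, "pos": pos}
        annotated.append((token, info))
    return annotated
```

For the query `hotels in paris`, only the last token receives recapitalization information. A downstream disambiguation rule could then decide whether to treat it as the proper noun `Paris`.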

Grammer checker

A method for parsing a computerized text, the method including preparing a set of logical rules, using logical grammatical links, for parsing a text, using the logical rules to identify a part of speech of each word of text and all links between the words in the text, and labeling the links as grammatically correct links or grammatically incorrect links for correction, so as to parse substantially every word in the text.
Owner:GADOR DEBORAH ADV +1

Acronym extraction system and method of identifying acronyms and extracting corresponding expansions from text

An acronym expansion system of the present invention receives electronic documents and extracts acronyms and their corresponding expansions. A part-of-speech tagger decomposes text into string tokens or words and tags them with their part-of-speech, while an acronym identifier determines whether a word is a potential acronym based on various conditions. An expansion identifier retrieves lists of words preceding and following a potential acronym to search for the expansion. The resulting word lists are examined sequentially to identify and retrieve an expansion for the potential acronym. An expansion extractor receives the potential acronym and a processed word list to retrieve the expansion of the potential acronym from that list. The extractor may utilize information from prior search iterations, and verifies an extracted expansion against a set of rules to remove spurious expansions.
Owner:PERATON INC
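
As a rough illustration of the acronym identifier and expansion identifier described above, the sketch below treats a short all-uppercase token as a potential acronym and checks whether the initials of the words just before it spell it out. The function names and the identification conditions are hypothetical simplifications. The abstract also describes searching the words that follow the acronym, part-of-speech tagging, and rule-based verification, none of which are modelled here.

```python
import re


def find_expansion(acronym, preceding_words):
    """Match the acronym's letters against the initials of the words just before it."""
    n = len(acronym)
    if len(preceding_words) < n:
        return None
    window = preceding_words[-n:]
    if all(word[:1].upper() == letter for word, letter in zip(window, acronym)):
        return " ".join(window)
    return None


def extract_acronyms(text):
    """Return a dict mapping each potential acronym to the expansion found before it."""
    tokens = re.findall(r"[A-Za-z]+", text)
    found = {}
    for i, token in enumerate(tokens):
        # Simplified identification condition: 2 to 6 letters, all uppercase.
        if token.isupper() and 2 <= len(token) <= 6:
            expansion = find_expansion(token, tokens[:i])
            if expansion:
                found[token] = expansion
    return found
```

Calling `extract_acronyms("We use a part of speech (POS) tagger.")` pairs `POS` with `part of speech`. Expansions that skip function words or use internal letters would need the richer matching the abstract alludes to.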

Method and computer system for part-of-speech tagging of incomplete sentences

The invention relates to a method and a computer system for enhanced part-of-speech (POS-) tagging as well as grammatically disambiguating a phrase. A phrase is usually a short multiword expression that may be ambiguous. By introducing grammatical constraints the invention supports POS-tagging as well as grammatically disambiguating the phrase. According to an identifier for the phrase, the phrase is supplemented with artificial context information. The supplemented phrase is then POS-tagged or grammatically disambiguated. Important applications are POS-tagging, Automatic Term Encoding, Headword Detection and Information Retrieval.
Owner:XEROX CORP

Sentiment analysis method oriented to micro-blog short text

The invention discloses a sentiment analysis method oriented to micro-blog short text. The method comprises the following steps: step 1, collecting micro-blog data containing keywords and storing it in a database; step 2, pre-processing the micro-blog data; step 3, loading the associated dictionaries; step 4, splitting the text into sentences and filtering out sentences that do not contain user-configured keywords; step 5, segmenting the sentences containing the keywords into words and labeling their parts of speech; step 6, performing dependency parsing on the sentences that contain subject words, using a sentence structure analysis tool; step 7, judging the polarity of each sentence containing subject words; and step 8, judging the polarity of the whole micro-blog after judging the polarities of all sentences containing the subject words. With this method, sentiment analysis is more specific, so that users can learn the sentiment expressed in micro-blogs toward the aspects they care about.
Owner:INST OF AUTOMATION CHINESE ACAD OF SCI

Method and apparatus for dynamic adaptation of a large vocabulary speech recognition system and for use of constraints from a database in a large vocabulary speech recognition system

The lexical network of a large-vocabulary speech recognition system is structured to effectuate the rapid and efficient addition of words to the system's active vocabulary. The lexical network is structured to include Phonetic Constraint Nodes, which organize the inter-word phonetic information in the network, and Word Class Nodes which organize the syntactic semantic information in the network. Network fragments, corresponding to phoneme pronunciations and labeled to specify permitted interconnections to each other and to phonetic constraint nodes, are precompiled to facilitate the rapid generation of pronunciations for new words and thereby enhance the rapid addition of words to the vocabulary even during speech recognition. Functions defined in accordance with linguistic constraints may be utilized during recognition. Different language models and different vocabularies for different portions of a discourse may also be invoked depending, in part, on the discourse history.
Owner:SPEECHWORKS INT

Natural language processor

Disclosed is a method for converting a plurality of words or sign language gestures into one or more sentences. The method involves the steps of: obtaining a plurality of words; assigning a part of speech tag to each of said words; assigning a sentence structure tag to said plurality of words; and parsing said words into one or more sentences based on a predefined sentence structure. The method can be implemented by a computer to provide a translator that more accurately reflects the natural language of the original text.
Owner:SOSCHEN ALONA

Labeling of work of art titles in text for natural language processing

A parser for parsing text includes a tokenizing module which divides the text into an ordered sequence of linguistic tokens. A morphological module associates parts of speech with the linguistic tokens. A detection module identifies candidate titles of creative works, such as works of art. A filtering module filters the candidate titles of works to exclude citations of direct speech from the candidate titles of works. A comparison module compares any remaining candidate titles of works with titles of works in an associated knowledge base. The comparison module annotates the text when a match is found.
Owner:XEROX CORP

Dictionary learning method and device using the same, input method and user terminal device using the same

This invention provides a dictionary learning method comprising the steps of: learning a lexicon and a Statistical Language Model from an untagged corpus; and integrating the lexicon, the Statistical Language Model and subsidiary word encoding information into a small-size dictionary. The invention also provides an input method on a user terminal device using the dictionary with Part-of-Speech information and a Part-of-Speech Bi-gram Model added, and a user terminal device using the same. The user terminal device can therefore give sentence-level and word-level predictions, and input is sped up by searching the dictionary through a Patricia Tree index.
Owner:NEC (CHINA) CO LTD

Deep Neural Network Model for Processing Data Through Multiple Linguistic Task Hierarchies

Active | US20180121788A1 | Semantic analysis | Speech recognition | High level model | Part of speech
The technology disclosed provides a so-called “joint many-task neural network model” to solve a variety of increasingly complex natural language processing (NLP) tasks using growing depth of layers in a single end-to-end model. The model is successively trained by considering linguistic hierarchies, directly connecting word representations to all model layers, explicitly using predictions in lower tasks, and applying a so-called “successive regularization” technique to prevent catastrophic forgetting. Three examples of lower level model layers are part-of-speech (POS) tagging layer, chunking layer, and dependency parsing layer. Two examples of higher level model layers are semantic relatedness layer and textual entailment layer. The model achieves the state-of-the-art results on chunking, dependency parsing, semantic relatedness and textual entailment.
Owner:SALESFORCE COM INC

Automated creation and delivery of database content

A method and apparatus automatically build a database by automatically assigning links to an expert, pushing content to an expert, providing expert annotation, and linking the content to an annotation database. A term is selected by applying rules, such as the term not previously existing in the database, the term occurring with unusually high frequency, the term being an article, or the term being an unusual part of speech. An advertiser can sponsor the term, for example, by having a banner ad automatically pop up on a keyword search. Content windows can be attached to the term, the content window containing information such as definitions, related products or services, sponsorship information, information from content syndicators, translations and reference works.
Owner:SENTIUS INT

System and method for creating custom specific text and emotive content message response templates for textual communications

Aspects of the invention present a system and methods to analyze textual communications, which include cognitive as well as emotive content. Generally accepted communication models and principles are used to create communication unique response templates which responders can use to complete a comprehensive, thoughtful, emotively validating response message. An embodiment is comprised of parsing and tokenizing the textual communication into the parts of speech, selecting subject matter from the tokenized parts of speech, fetching starter sentence string fragments from pre-stored data structures, concatenating the fragment strings with selected subject matter or emotive content into grammatical response sentence fragment strings, opening a response file or output device into which synthesized starter response sentence string fragments are written, such that the application systematically processes the complete textual communication and synthesizes communication unique response templates populated with response sentence string fragments in accordance with effective communication principles.
Owner:XENOGENIC DEV LLC

System and method of textual information analytics

This invention provides a method and system for analyzing and deriving analytical insights from textual information. The information structuration process determines the structure in which the text information is rendered. A cyclical extraction process using the parameters of co-frequency, co-dependency, and co-relatedness and parts of speech, determines various textual aspects and their subcomponents such as themes and dimensions. Using the subcomponents of the textual aspects, relationship maps are built, disambiguated and ranked. A text analytics and decision support matrix is created using the ranked relations, thereby providing a highly relevant result set to the user's information need. A multidimensional navigation matrix is created that helps a user navigate across dimensions.
Owner:TEXTUAL ANLYTICS SOLUTIONS PVT

Representing a document using a semantic structure

A method, system and computer program product for generating a document representation are disclosed. The system includes a server and a client computer, and the method involves: receiving into memory a resource containing at least one sentence of text; producing a tree comprising tree elements indicating parts-of-speech and grammatical relations between the tree elements; producing semantic structures each having three tree elements to represent a simple clause (subject-predicate-object); and storing a semantic network of semantic structures and connections therebetween. The semantic network may be created from a user-provided root concept. Output representations include concept maps, facts listings, text summaries, tag clouds, indices, and an annotated text. The system interactively modifies semantic networks in response to user feedback, and produces personal semantic networks and document use histories.
Owner:IFWE

Automated topic discovery in documents and content categorization

Active | US9047283B1 | Easy to find | Efficient and accurate and scalable | Web data indexing | Semantic analysis | Semantic property | Part of speech
A computer-assisted method for discovering topics and categorizing contents in a document includes the steps of calculating an importance score for a term based on grammatical roles, parts of speech, and semantic attributes, selecting terms based on the importance score values of the respective terms, and outputting terms comprising the selected term to represent topics in the document, and building a category structure based on the selected terms.
Owner:LINFO IP LLC
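
One way to picture the importance score described above is as a weighted sum over a term's occurrences, with weights depending on grammatical role and part of speech. The weight tables and function names below are invented for illustration and are not taken from the patent. Semantic attributes, which the abstract also lists as an input, are omitted to keep the sketch short.

```python
from collections import defaultdict

# Hypothetical weights; a real system would tune or learn these.
ROLE_WEIGHT = {"subject": 3.0, "object": 2.0, "modifier": 1.0}
POS_WEIGHT = {"NOUN": 2.0, "VERB": 1.0, "ADJ": 0.5}


def score_terms(tagged_terms):
    """Sum role weight times POS weight over every occurrence of each term.

    `tagged_terms` is an iterable of (term, pos, grammatical_role) triples.
    """
    scores = defaultdict(float)
    for term, pos, role in tagged_terms:
        scores[term] += ROLE_WEIGHT.get(role, 0.0) * POS_WEIGHT.get(pos, 0.0)
    return dict(scores)


def select_topics(tagged_terms, top_n=2):
    """Return the top_n terms by importance score, ties broken alphabetically."""
    scores = score_terms(tagged_terms)
    return sorted(scores, key=lambda term: (-scores[term], term))[:top_n]
```

Under these toy weights, a noun that appears once as a subject and once as an object outranks a noun that appears only as an object, which matches the intuition that topic terms tend to fill core grammatical roles.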

Semantic document profiling

A method of semantic profiling of documents comprises receiving a document to be profiled, the document comprising a plurality of terms, for each of at least a portion of the plurality of terms in the document determining a part of speech and a grammatical function of the term, obtaining senses of the term, selecting a sense as a most likely meaning of the term, and calculating an information value of the term, and generating a semantic profile of the document comprising at least some of the calculated information values.
Owner:GRAMMARLY

News keyword abstraction method based on word frequency and multi-component grammar

A method for extracting news keywords based on word frequency and multi-component (n-gram) grammar is provided, belonging to the technical field of natural language processing. Candidate part-of-speech patterns for multi-word keywords are obtained by studying the part-of-speech characteristics of keywords with computer-assisted mining, and these patterns serve as the basis of the keyword extraction algorithm. When extracting news keywords, multi-word phrases in the text are first mined according to the part-of-speech patterns to form a candidate keyword set; unregistered potential keywords are then mined from titles and added to the candidate set. The application proposes an improved single-text term frequency/inverse document frequency (tf/idf) formula, introduces target-oriented features, scores the candidate keywords, ranks them, and outputs the keywords of the news document after optimizing the results. Compared with the traditional keyword extraction method based on tf/idf, the method achieves a higher recall rate at the same precision.
Owner:TSINGHUA UNIV
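
The two-stage idea in the abstract above, mining candidate phrases whose part-of-speech sequence matches a known pattern and then scoring them with a frequency-based weight, can be sketched as follows. The pattern list and function names are hypothetical, and the scoring is plain textbook tf-idf with add-one smoothing in the denominator, not the improved formula the abstract refers to. Title mining and target-oriented features are not modelled.

```python
import math

# Hypothetical POS patterns for keyword candidates.
PATTERNS = [("NOUN",), ("ADJ", "NOUN"), ("NOUN", "NOUN")]


def candidates(tagged):
    """Collect phrases whose POS sequence matches one of PATTERNS.

    `tagged` is a list of (word, pos) pairs for one document.
    """
    found = []
    for pattern in PATTERNS:
        k = len(pattern)
        for i in range(len(tagged) - k + 1):
            window = tagged[i:i + k]
            if tuple(pos for _, pos in window) == pattern:
                found.append(" ".join(word for word, _ in window))
    return found


def tf_idf(phrase, doc_candidates, doc_freq, n_docs):
    """Plain tf-idf: relative frequency in the document times log inverse document frequency."""
    tf = doc_candidates.count(phrase) / len(doc_candidates)
    idf = math.log(n_docs / (1 + doc_freq.get(phrase, 0)))
    return tf * idf


def rank_keywords(tagged, doc_freq, n_docs):
    """Rank the distinct candidates of one document by descending tf-idf."""
    cands = candidates(tagged)
    return sorted(set(cands), key=lambda p: (-tf_idf(p, cands, doc_freq, n_docs), p))
```

Given a tagged headline such as `[("stock", "NOUN"), ("market", "NOUN"), ("falls", "VERB")]` and corpus counts in which the phrase `stock market` is much rarer than either word alone, the two-word phrase ranks first. This is the kind of multi-word keyword that pattern-based candidate mining is meant to surface.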

Handwriting and voice input with automatic correction

Inactive | US20050192802A1 | Improve handwriting recognition | Facilitates user | Speech recognition | Character recognition | Handwriting | Process systems
A hybrid approach to improve handwriting recognition and voice recognition in data processing systems is disclosed. In one embodiment, a front end is used to recognize strokes, characters and/or phonemes. The front end returns candidates with relative or absolute probabilities of matching to the input. Based on linguistic characteristics of the language (e.g. alphabetical or ideographic language for the words being entered, frequency of words and phrases being used, likely part of speech of the word entered, the morphology of the language, or the context in which the word is entered), a back end combines the candidates determined by the front end from inputs for words to match with known words and the probabilities of the use of such words in the current context.
Owner:TEGIC COMM

Domain specific natural language understanding of customer intent in self-help

Method and apparatus for providing a personalized self-support service to a user of an online application coupled with an online community forum. Embodiments include obtaining a plurality of questions from the online community forum and obtaining historical user data. Embodiments further include identifying one or more part-of-speech words in the plurality of questions and generating a high-dimensional vector for each question of the plurality of questions based on a frequency of the one or more part-of-speech words. Embodiments further include identifying one or more user features of the plurality of users based on the historical user data and establishing, based on the historical user data, one or more statistical correlations between user features and part-of-speech words. Embodiments further include training a predictive model based on the one or more statistical correlations. Embodiments further include using the predictive model to provide one or more relevant questions to the user.
Owner:INTUIT INC
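
The step of turning a question into a vector based on the frequency of part-of-speech words could be sketched as below. The fixed vocabulary, the `POS:word` key format and the `question_vector` function are assumptions for illustration only. The abstract does not say how the vector is built, and a realistic vocabulary would be far larger than four entries.

```python
from collections import Counter

# Hypothetical vocabulary of part-of-speech words; each entry is one vector dimension.
VOCAB = ["NOUN:invoice", "VERB:import", "NOUN:tax", "VERB:file"]


def question_vector(tagged_question):
    """Count how often each vocabulary entry occurs in one POS-tagged question.

    `tagged_question` is a list of (word, pos) pairs.
    """
    counts = Counter(f"{pos}:{word.lower()}" for word, pos in tagged_question)
    return [counts[key] for key in VOCAB]
```

Keying on both the tag and the word keeps `file` as a verb ("file a return") separate from `file` as a noun ("upload a file"), a distinction that a plain bag-of-words vector would lose.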