760 results about "Unstructured textual data retrieval" patented technology

Implicit rating of retrieved information in an information search system

An information retrieval system allows a user to search a database of informational items for a desired informational item, and presents the search result in the form of matching index entries in the order of relevance. The information retrieval system in accordance with the principles of the present invention assigns a relevance rating to each of the index entries without requiring an explicit input from the user with respect to the usefulness or the relevance of the retrieved information corresponding to the respective index entries. When the user selects and retrieves an informational item through a list of index entries presented by the retrieval system, as a result of a search, the relevance rating of the selected informational item is increased by a predetermined amount. The relevance rating of the selected informational item is further adjusted based on any actions the user takes subsequent to the initial selection of the informational item if the subsequent act indicates that the relevance of the selected informational item may be less than what is reflected by the rating increase by the predetermined amount. Ratings of the informational items in the database are determined from implicit suggestions from the usage of the retrieval system and the database by the user rather than from an explicit user input. In another aspect of the present invention, the ratings are allowed to decay over time to minimize the tendency toward ratings biased by historical usage, and to provide more temporally accurate ratings. The most recently accessed time of each of the informational items in the database is compared to a predetermined stale access time threshold, and if the most recently accessed time is older than the threshold, then the rating of the corresponding informational item is decreased to reflect the dated nature of the information contained within the item.
Owner:ORACLE OTC SUBSIDIARY
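
The implicit-rating scheme described in the abstract above can be illustrated with a minimal sketch. It is not the patented implementation: the constants (`SELECT_BONUS`, `QUICK_EXIT_PENALTY`, `STALE_AFTER`, `DECAY_FACTOR`), the `Item` class, and the choice of multiplicative decay are all assumptions made for illustration.

```python
from dataclasses import dataclass

SELECT_BONUS = 1.0            # assumed "predetermined amount" added on selection
QUICK_EXIT_PENALTY = 0.5      # assumed correction when a later action hints at low relevance
STALE_AFTER = 30 * 24 * 3600  # assumed stale-access threshold, in seconds
DECAY_FACTOR = 0.9            # assumed multiplicative decay applied to stale items


@dataclass
class Item:
    title: str
    rating: float = 0.0
    last_access: float = 0.0


def record_selection(item: Item, now: float) -> None:
    """Raise the rating when the user opens an item from the result list."""
    item.rating += SELECT_BONUS
    item.last_access = now


def record_quick_exit(item: Item) -> None:
    """Take back part of the bonus if a follow-up action suggests the item was not useful."""
    item.rating -= QUICK_EXIT_PENALTY


def apply_decay(items: list[Item], now: float) -> None:
    """Decrease ratings of items whose last access is older than the threshold."""
    for item in items:
        if now - item.last_access > STALE_AFTER:
            item.rating *= DECAY_FACTOR


def ranked(items: list[Item]) -> list[Item]:
    """Return items ordered by descending rating."""
    return sorted(items, key=lambda i: i.rating, reverse=True)
```

No explicit feedback is collected anywhere in this sketch: ratings change only as a side effect of selections, follow-up actions, and the passage of time.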

Process and system for retrieval of documents using context-relevant semantic profiles

A process and system for database storage and retrieval are described along with methods for obtaining semantic profiles from a training text corpus, i.e., text of known relevance, a method for using the training to guide context-relevant document retrieval, and a method for limiting the range of documents that need to be searched after a query. A neural network is used to extract semantic profiles from the text corpus. A new set of documents, such as world wide web pages obtained from the Internet, is then submitted for processing to the same neural network, which computes a semantic profile representation for these pages using the semantic relations learned from profiling the training documents. These semantic profiles are then organized into clusters in order to minimize the time required to answer a query. When a user queries the database, i.e., the set of documents, his or her query is similarly transformed into a semantic profile and compared with the semantic profiles of each cluster of documents. The query profile is then compared with each of the documents in the closest-matching cluster. Documents with the closest weighted match to the query are returned as search results.
Owner:DTI OF WASHINGTON
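
The two-stage lookup in the abstract above (match the query against clusters first, then against documents inside one cluster) can be sketched as follows. The sketch assumes the semantic profiles already exist as plain numeric vectors; the neural network that produces them is out of scope. The use of cosine similarity, a centroid as the cluster profile, and the names `cosine`, `centroid`, and `search` are illustrative assumptions, not details from the patent.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def search(query, clusters, top_k=2):
    """clusters: dict of cluster name -> dict of doc id -> profile vector."""
    # Stage 1: compare the query profile with each cluster profile (assumed: centroid).
    best = max(
        clusters,
        key=lambda c: cosine(query, centroid(list(clusters[c].values()))),
    )
    # Stage 2: compare the query only with documents inside the chosen cluster.
    docs = clusters[best]
    hits = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)
    return best, hits[:top_k]
```

Restricting stage 2 to one cluster is what limits the range of documents searched per query, at the cost of missing relevant documents that were assigned to another cluster.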

Data aggregation server for managing a multi-dimensional database and database management system having data aggregation server integrated therein

Improved method of and apparatus for aggregating data elements in multidimensional databases (MDDB). In one aspect of the present invention, the apparatus is realized in the form of a high-performance stand-alone (i.e. external) aggregation server which can be plugged into conventional OLAP systems to achieve significant improvements in system performance. In accordance with the principles of the present invention, the stand-alone aggregation server contains a scalable MDDB and a high-performance aggregation engine that are integrated into the modular architecture of the aggregation server. The stand-alone aggregation server of the present invention can uniformly distribute data elements among a plurality of processors, for balanced loading and processing, and therefore is highly scalable. The stand-alone aggregation server of the present invention can be used to realize (i) an improved MDDB for supporting on-line analytical processing (OLAP) operations, (ii) an improved Internet URL Directory for supporting on-line information searching operations by Web-enabled client machines, as well as (iii) diverse types of MDDB-based systems for supporting real-time control of processes in response to complex states of information reflected in the MDDB. In another aspect of the present invention, the apparatus is integrated within a database management system (DBMS). The improved DBMS can be used to achieve a significant increase in system performance (e.g. decreased access / search time), user flexibility and ease of use. The improved DBMS system of the present invention can be used to realize an improved Data Warehouse for supporting on-line analytical processing (OLAP) operations or to realize an improved informational database system, operational database system, or the like.
Owner:YANICKLO TECH LIABILITY +1

Method for searching media

The present invention is directed to a computer-implemented method and apparatus for searching in response to Internet-based search queries using a search engine and an electronic database. According to one example embodiment of the present invention, data sets representing published items are input, for example, scanned-in or sent electronically, and stored in a searchable database. Each data set includes text from at least one published item. Responsive to the search query, a search engine searches for and identifies relevant web pages and data sets representing published items and, in a more specific embodiment, ranked characterizations are returned for the relevant web pages and published items. An electronic path can be provided with the published item for accessing further information about the published item. In one embodiment, the electronic path is a hyperlink from a characterization of a relevant published item to a more complete electronic representation of the relevant published item. Publishers provide authorization to display copyrighted materials through a permission protocol.
Owner:GOOGLE LLC

Probabilistic information retrieval based on differential latent semantic space

A computer-based information search and retrieval system and method for retrieving textual digital objects that makes full use of the projections of the documents onto both the reduced document space characterized by the singular value decomposition-based latent semantic structure and its orthogonal space. The resulting system and method has increased robustness, reducing the instability of the traditional keyword search engine caused by synonymy and / or polysemy of a natural language, and therefore is particularly suitable for web document searching over a distributed computer network such as the Internet.
Owner:SUNFLARE CO LTD

Method for standardizing phrasing in a document

A method for standardizing phrases in a document includes the steps of identifying phrases of a document to create a preliminary list of standard phrases; filtering the preliminary list of standard phrases to create a final list of standard phrases; identifying candidate phrases of the document which are similar to the standard phrases; confirming whether a candidate phrase of the document is sufficiently proximate to the standard phrase to constitute an approximate phrase; and computing a phrase substitution to determine the appropriate conformation of standard phrase to the approximate phrase or the approximate phrase to the standard. Further this invention relates to a computer system for standardizing a document.
Owner:THOMSON REUTERS ENTERPRISE CENT GMBH

Content-based user interface for document management

A content-based method of managing a collection of documents is disclosed. A user interface is provided for managing the collection of documents. For each document, at least one information object representative of conceptual content of a portion of the document is identified. The information objects are combined with additional conceptual information inferred from the user interface to determine a network of conceptual relationships associated with the collection of documents. The user interface provides user access to the network of conceptual relationships to manage the collection of documents.
Owner:FELDMAN DAVID

Ontology for database design and application development

A system and method lets a user create or import ontologies and create databases and related application software. These databases can be specially tuned to suit a particular need, and each comes with the same error-detection rules to keep the data clean. Such databases may be searched based on meaning, rather than on words-that-begin-with-something. And multiple databases, if generated from the same basic ontology can communicate with each other without any additional effort. Ontology management and generation tools enable enterprises to create databases that use ontologies to improve data integration, maintainability, quality, and flexibility. Only the relevant aspects of the ontology are targeted, extracting out a sub-model that has the power of the full ontology restricted to objects of interest for the application domain. To increase performance and add desired database characteristics, this sub-model is translated into a database system. Java-based object-oriented and relational application program interfaces (APIs) are then generated from this translation, providing application developers with an API that exactly reflects the entity types and relations (classes and methods) that are represented by the database. This generation approach essentially turns the ontology into a set of integrated and efficient databases.
Owner:KYNDI

Method and apparatus for indexing and searching content in hardcopy documents

A method and apparatus for indexing and searching content in a hardcopy document utilizes a searching assistant computing device (402) with an index table (420) stored in memory (412). The index table (420) is created in memory by scanning a 2-D barcode from a hardcopy document or alternatively by downloading indexing information from a web page via the Internet (430). A search engine (410) in the searching assistant (402) searches the index table (420) to locate a data element found in the content of the hardcopy document. The indexing information corresponding to the data element is displayed to a user as part of the search results to indicate the location of the data element in the hardcopy document.
Owner:IBM CORP

Method for organizing directories

A database file management system accesses data records that correspond to items in a directory. The directory items are linked to a trie index that is arranged in blocks and stored in a storage medium. The trie index enables accessing or updating the directory item data records by key or keys, but it is susceptible to an unbalanced structure of blocks. There is provided a method for constructing a layered index arranged in blocks, which includes the steps of providing the trie index and constructing a representative index over the representative keys of the trie index. The layered index enables accessing or updating the directory items by key or keys and constitutes a balanced structure of blocks.
Owner:DB SOFTWARE INC

Obtaining data from unstructured data for a structured data collection

Techniques for obtaining data from unstructured data for a structured data collection include receiving unstructured data that includes text; identifying an attribute associated with a structured data collection; obtaining at least one of historical data associated with the attribute or additional data associated with a user of the computing system; identifying one or more terms from the unstructured data as being associated with the attribute based on at least one of the historical data or the additional data; and storing the identified one or more terms in a data record of the structured data collection.
Owner:BUSINESS OBJECTS SOFTWARE

Method and system for selectively presenting database results in an information retrieval system

An information retrieval system is described that dynamically prioritizes search request results prior to output to a user. When a database search yields multiple hits, the results are first categorized into a series of groups. Categories are determined from any number of different factors, such as geographical locations of the search results, amenities, hours of operation, etc. For each category, the search results can be parsed into groups within the category. The results are first reported to the user in general terms, as a number of search results in each of the groups. The user is prompted to select the group that is of most interest, and the portion of the individual search results that are within the selected group are reported.
Owner:BELLSOUTH INTPROP COR
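
The "report counts per group first, details on request" flow in the abstract above can be sketched in a few lines. The record layout (dicts with a `city` field) and the function names `group_by` and `summarize` are hypothetical; the abstract does not specify a data model.

```python
from collections import defaultdict


def group_by(results, category):
    """Partition result records into groups keyed by one category field."""
    groups = defaultdict(list)
    for record in results:
        groups[record[category]].append(record)
    return dict(groups)


def summarize(groups):
    """Return the general-terms report: number of results in each group."""
    return {name: len(members) for name, members in groups.items()}
```

A caller would first show `summarize(groups)` to the user, then return only `groups[selected]` once the user picks the group of most interest.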

Methods and apparatus for using a modified index to provide search results in response to an ambiguous search query

A system allows a user to submit an ambiguous search query and to receive potentially disambiguated search results. In one implementation, a search engine's conventional alphanumeric index is translated into a second index that is ambiguated in the same manner as which the user's input is ambiguated. The user's ambiguous search query is compared to this ambiguated index, and the corresponding documents are provided to the user as search results.
Owner:GOOGLE LLC
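
One way to picture the "ambiguated index" in the abstract above is a phone keypad, where each digit stands for several letters. That input method is an assumption chosen for illustration; the abstract does not name one. The sketch translates a conventional term index into a digit-sequence index so that an ambiguous key sequence can be looked up directly. `KEYPAD`, `ambiguate`, `build_ambiguated_index`, and `lookup` are hypothetical names.

```python
# Standard telephone keypad letter groups (assumed input method).
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_DIGIT = {ch: digit for digit, letters in KEYPAD.items() for ch in letters}


def ambiguate(term):
    """Map a term to its key sequence; characters that are not letters pass through."""
    return "".join(LETTER_TO_DIGIT.get(ch, ch) for ch in term.lower())


def build_ambiguated_index(index):
    """Translate a term -> doc ids index into a key-sequence -> doc ids index."""
    out = {}
    for term, doc_ids in index.items():
        out.setdefault(ambiguate(term), set()).update(doc_ids)
    return out


def lookup(ambiguated_index, keys):
    """Return the sorted doc ids for every term that shares this key sequence."""
    return sorted(ambiguated_index.get(keys, set()))
```

Because "good" and "home" both map to `4663`, a query of `4663` returns documents for both terms, which is the sense in which the results are only potentially disambiguated.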

Electronic content search and delivery based on cursor location

Inactive; US7100123B1; Faster and reliable word recognition; Faster and reliable recognition; Unstructured textual data retrieval; Comparison of digital values; Operational system; Data storing
An electronic search is automatically initiated when a cursor hovers in one location for a predetermined time. A target process associated with a target window is forced to re-render data to the target window in an update region that includes the detected cursor location. From the re-rendered data, a primary word and context words near the cursor location are determined. One or more local or remote electronic data stores are searched for substantive content related to the words. The content is prioritized according to user preference and displayed in a semitransparent window that is persistently visible to a user, yet does not obscure other content in an underlying window and does not shift the focus from an active window. Re-rendering is accomplished by invalidating an update region of the target window, and forcing the operating system to issue a paint message, causing the target process to redraw the update region.
Owner:MICROSOFT TECH LICENSING LLC

Heterogeneous multi-level extendable indexing for general purpose annotation systems

Methods, systems, and articles of manufacture for indexing annotations made for a variety of different type (i.e., heterogeneous) data objects are provided. A set of parameters uniquely identifying an annotated data object may be converted to an index comprising a set of index values, each corresponding to a column in a homogeneous index table. In order to accommodate the indexing of heterogeneous data objects, a mapping may be provided for each different type (or classification) of data object that may be annotated, that defines how the identifying parameters of that type will be mapped to the columns of the homogeneous index table.
Owner:IBM CORP
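
The per-type mapping described in the abstract above can be sketched as a lookup table that says which identifying parameter of each object type goes into which column of a shared index table. The column names, the two object types, and their parameter names are all hypothetical; they only illustrate the idea of mapping heterogeneous identifiers onto homogeneous columns.

```python
# Assumed columns of the homogeneous index table.
INDEX_COLUMNS = ("key1", "key2", "key3")

# Assumed mappings: for each annotatable object type, the identifying
# parameters in the order they fill the index columns.
TYPE_MAPPINGS = {
    "table_cell": ("table", "row_id", "column"),
    "text_span": ("document", "page", "offset"),
}


def to_index_row(object_type, params):
    """Convert an object's identifying parameters into one homogeneous index row."""
    fields = TYPE_MAPPINGS[object_type]
    row = {"object_type": object_type}
    for column, field in zip(INDEX_COLUMNS, fields):
        row[column] = str(params[field])  # store every value as text for uniformity
    return row
```

Supporting a new kind of annotated object then means adding one entry to `TYPE_MAPPINGS` rather than changing the index table.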

Method of constructing and displaying an entity profile constructed utilizing input from entities other than the owner

A method of constructing a profile comprising terms indicative of a characteristic of an entity commences when a first electronic mail address, associated with a first entity, is created within a knowledge management system. The electronic mail address may be created automatically upon submission of an electronic mail document, or may be created manually by a systems administrator. A first electronic document is received via an electronic communications network at the first electronic mail address from a second entity, typically a user of the knowledge management system who is a registered and interactive user. The first electronic document is then parsed to identify profile terms therein. These profile terms are included within a first profile for the first entity. In this way, users of a knowledge management system may construct a profile of an entity (e.g., a customer) that is not a user of or participant within the knowledge management system.
Owner:ORACLE INT CORP

Information management and retrieval

A method and apparatus are provided for extracting key terms from a data set. The method includes identifying a first set of one or more word groups of one or more words that occur more than once in the data set, and removing from this first set a second set of word groups that are sub-strings of longer word groups in the first set. The remaining word groups are key terms. Each word group is weighted according to its frequency of occurrence within the data set. The weighting of any word group may be increased by the frequency of any sub-string of words occurring in the second set and then dividing each weighting by the number of words in the word group. This weighting process operates to determine the order of occurrence of the word groups. Prefixes and suffixes are also removed from each word in the data set. This produces a neutral form of each word so that the weighting values are prefix and suffix independent.
Owner:BRITISH TELECOMM PLC
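
The core of the key-term extraction above (find repeated word groups, then drop those contained in longer repeated groups) can be sketched as follows. This simplified version weights each surviving group by raw frequency only; it omits the prefix and suffix stripping and the sub-string weight adjustment that the abstract also mentions. The `max_len` limit and all function names are assumptions.

```python
from collections import Counter


def ngrams(words, n):
    """All contiguous word groups of length n, as tuples."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def is_sub(short, long):
    """True if `short` occurs as a contiguous run inside `long`."""
    k = len(short)
    return any(long[i:i + k] == short for i in range(len(long) - k + 1))


def key_terms(text, max_len=3):
    """Return repeated word groups that are not part of a longer repeated group."""
    words = text.lower().split()
    counts = Counter(g for n in range(1, max_len + 1) for g in ngrams(words, n))
    # First set: word groups that occur more than once.
    repeated = {g: c for g, c in counts.items() if c > 1}
    # Remove the second set: groups contained in a longer group of the first set.
    kept = {
        g: c
        for g, c in repeated.items()
        if not any(len(h) > len(g) and is_sub(g, h) for h in repeated)
    }
    return {" ".join(g): c for g, c in kept.items()}
```

For example, in a text where "neural network" appears twice, the single words "neural" and "network" are also repeated but are dropped because they are contained in the longer repeated group.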

Method and system for searching recorded speech and retrieving relevant segments

A system and method for searching recorded speech is disclosed. The system and method comprises converting the recorded speech into text using a voice recognition system. As the speech is being converted, naturally occurring breaks in the language are used to take time indexes from the recording. The system and method includes creating a full text index of the recorded speech utilizing an information extender. The full text index contains a plurality of time stamps that point to the occurrence of words in the recorded speech. The text is then searched by a full text search server that has linguistic search capabilities using the full text index. Finally, the searched text, the text index and the recorded speech are stored in the database. The recorded speech is searched by locating relevant phrases or words, and then mapping the time stamps associated with the relevant phrases or words back to the recorded speech in the database.
Owner:UNILOC 2017 LLC
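
The time-stamped full text index described above can be sketched as a mapping from each recognized word to the times at which it was spoken. The transcript format (a list of `(start_seconds, word)` pairs with unique start times) is an assumption about what a speech recognizer might emit, and `build_time_index` and `find_phrase` are hypothetical names; no linguistic search features are modeled.

```python
from collections import defaultdict


def build_time_index(transcript):
    """Map each lower-cased word to the list of start times where it occurs."""
    index = defaultdict(list)
    for start, word in transcript:
        index[word.lower()].append(start)
    return dict(index)


def find_phrase(transcript, index, phrase):
    """Return the start times at which the phrase's words occur consecutively."""
    words = phrase.lower().split()
    if not words:
        return []
    spoken = [w.lower() for _, w in transcript]
    # Assumes start times are unique, so each one identifies a word position.
    position = {start: i for i, (start, _) in enumerate(transcript)}
    hits = []
    for start in index.get(words[0], []):
        i = position[start]
        if spoken[i:i + len(words)] == words:
            hits.append(start)
    return hits
```

Each returned time stamp can be used to seek directly into the stored recording, which is the mapping from text hits back to audio that the abstract describes.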

Method and apparatus for automatic file clustering into a data-driven, user-specific taxonomy

An automatic file clustering algorithm enables documents within a file system to be displayed in a semantic view. The file clustering algorithm maps all words and documents into an appropriate semantic vector space, clusters the documents at a predetermined level of granularity, and assigns a meaningful descriptor to each resulting cluster. The documents are displayed to the user in a hierarchy in accordance with the resulting clusters. This results in a virtual file system with a semantic organization, that allows the user to navigate by content.
Owner:APPLE INC

Method and apparatus for providing media-independent self-help modules within a multimedia communication-center customer interface

In a multimedia call center (MMCC) operating through an operating system, a client-specific self-help wizard is provided for active clients and updated periodically with information related to client transaction history with the MMCC. A connected client is presented by the wizard with a selective media function through which the client may select a media type for interaction and help, and the MMCC will then re-contact the client through the selected media. The client, for example, may select IP or COST telephony, and the MMCC will place a call to the client to a number or IP address listed for the client, and interactivity will then be through an interactive voice response unit. Help information specific to a client is updated in the client's wizard periodically according to ongoing transaction history with the MMCC. The wizard may also monitor client activity with the wizard and make reports available to various persons.
Owner:GENESYS TELECOMMUNICATIONS LABORATORIES INC +1

Extended functionality for an inverse inference engine based web search

An extension of an inverse inference search engine is disclosed which provides cross language document retrieval, in which the information matrix used as input to the inverse inference engine is organized into rows of blocks corresponding to languages within a predetermined set of natural languages. The information matrix is further organized into two column-wise partitions. The first partition consists of blocks of entries representing fully translated documents, while the second partition is a matrix of blocks of entries representing documents for which translations are not available in all of the predetermined languages. Further in the second partition, entries in blocks outside the main diagonal of blocks are zero. Another disclosed extension to the inverse inference document retrieval system supports automatic, knowledge based training. This approach applies the idea of using a training set to the problem of searching databases where information is diluted or not reliable enough to allow the creation of robust semantic links. To address this situation, the disclosed system loads the left-hand partition of the input matrix for the inverse inference engine with information from reliable sources.
Owner:FIVER LLC

Relational text index creation and searching

In an environment where it is desired to perform information extraction over a large quantity of textual data, methods, tools and structures are provided for building a relational text index from the textual data and performing searches using the relational text index.
Owner:ATTENSITY CORP

Extrinsically influenced near-optimal path apparatus and method

Inactive; US6067572A; Rapidly and automatically determine; Rapidly and automatically designate; Error prevention; Frequency-division multiplex details; Wavefront; Operational system
A method and apparatus for dynamically providing a path through a network of nodes or granules may use a limited, advanced look at potential steps along a plurality of available paths. Given an initial position, at an initial node or granule within a network, and some destination node or granule in the network, all nodes or granules may be represented in a connected graph. An apparatus and method may evaluate current potential paths, or edges between nodes still considered to lie in potential paths, according to some cost or distance function associated therewith. In evaluating potential paths or edges, the apparatus and method may consider extrinsic data which influences the cost or distance function for a path or edge. Each next edge may lie ahead across the advancing "partial" wavefront, toward a new candidate node being considered for the path. With each advancement of the wavefront, one or more potential paths, previously considered, may be dropped from consideration. Thus, a "partial" wavefront, limited in size (number of nodes and connecting edges) continues to evaluate some number of the best paths "so far." The method deletes worst paths, backs out of cul-de-sacs, and penalizes turning around. The method and apparatus may be implemented to manage a computer network, a computer internetwork, parallel processors, parallel processes in a multi-processing operating system, a smart scissor for a drawing application, and other systems of nodes.
Owner:ORACLE INT CORP

Automated creation and delivery of database content

A method and apparatus automatically build a database by automatically assigning links to an expert, pushing content to an expert, providing expert annotation, and linking the content to an annotation database. A term is selected by applying rules, such as, the term not previously existing in the database, an unusually high frequency of the term, the term is an article or the term is an unusual part of speech. An advertiser can sponsor the term, for example, by having a banner ad automatically pop-up on a keyword search. Content windows can be attached to the term, the content window containing information such as definitions, related products or services, sponsorship information, information from content syndicators, translations and reference works.
Owner:SENTIUS INT

Managing objects and sharing information among communities

A method for managing objects for users including providing a set of attributes and a set of containers each having attributes from the set. The method further provides a user interface for dynamically assigning attributes to the objects. The method further provides for selectively displaying, through a user interface, containers and objects in the containers. An object is displayed in a container if a condition is met. The condition is applied to the attributes of the container and the attributes of the object.
Owner:QUADRANT EPP +1