409 results about "Document retrieval" patented technology

Document retrieval is defined as the matching of some stated user query against a set of free-text records. These records could be any type of mainly unstructured text, such as newspaper articles, real estate records or paragraphs in a manual. User queries can range from multi-sentence full descriptions of an information need to a few words.

Process and system for retrieval of documents using context-relevant semantic profiles

A process and system for database storage and retrieval are described, along with methods for obtaining semantic profiles from a training text corpus, i.e., text of known relevance, a method for using the training to guide context-relevant document retrieval, and a method for limiting the range of documents that need to be searched after a query. A neural network is used to extract semantic profiles from the text corpus. A new set of documents, such as World Wide Web pages obtained from the Internet, is then submitted for processing to the same neural network, which computes a semantic profile representation for these pages using the semantic relations learned from profiling the training documents. These semantic profiles are then organized into clusters in order to minimize the time required to answer a query. When a user queries the database, i.e., the set of documents, his or her query is similarly transformed into a semantic profile and compared with the semantic profiles of each cluster of documents. The query profile is then compared with each of the documents in the best-matching cluster. Documents with the closest weighted match to the query are returned as search results.
Owner:DTI OF WASHINGTON
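
The two-stage comparison in the abstract above (query against clusters, then query against the documents of one cluster) can be sketched as follows. This is a minimal illustration, not the patented method: the profiles are hand-written vectors rather than neural network outputs, cosine similarity stands in for the "closest weighted match", and the names `cosine`, `centroid`, and `search` are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length profile vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def centroid(vectors):
    """Mean vector, used here as the profile of a whole cluster (an assumption)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def search(query_profile, clusters, top_k=2):
    """Pick the closest cluster first, then rank only that cluster's documents."""
    best = max(
        clusters,
        key=lambda name: cosine(query_profile, centroid(list(clusters[name].values()))),
    )
    ranked = sorted(
        clusters[best].items(),
        key=lambda item: cosine(query_profile, item[1]),
        reverse=True,
    )
    return best, [doc_id for doc_id, _ in ranked[:top_k]]

# Toy profiles; in the abstract these would come from the trained network.
clusters = {
    "sports": {"d1": [0.9, 0.1, 0.0], "d2": [0.8, 0.2, 0.1]},
    "finance": {"d3": [0.1, 0.9, 0.2], "d4": [0.0, 0.7, 0.6]},
}
best_cluster, hits = search([0.1, 0.8, 0.3], clusters)
print(best_cluster, hits)
```

Restricting the second pass to one cluster is what limits the number of documents compared per query.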

Method for presenting search results

Methods and systems are provided to present the search results in response to a search query that is submitted to a document retrieval system, such as a search engine. The search results are presented with a second-retrieval model that constructs multiple derived queries for the search query with a first small-document retrieval process, and then generates and outputs the results based on the retrieval of search results of at least part of the derived queries. One embodiment of the invention provides a method for grouping the search results, which presents ranked derived queries together with their search results to the user, in such a way that derived queries with higher ranks and top-ranked documents of each derived query are preferentially presented, and the grouped results are displayed and navigated in independent framed subareas of an output window. A further embodiment selects the search results from multiple result lists of the derived queries to form the final search results for the user query, wherein the merged results are re-ranked according to pre-determined criteria. The method can also be integrated with the local keyword associated clustering method by rank value adjustment, or result filtering or merging to achieve better technical effects.
Owner:SWEN BING
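
The embodiment above that merges the result lists of several derived queries and re-ranks them can be sketched as follows. The abstract only says the merged results are re-ranked "according to pre-determined criteria"; the weighted reciprocal-rank score used here is an assumption chosen for illustration, and `merge_results` is a hypothetical name.

```python
from collections import defaultdict

def merge_results(derived, top_k=3):
    """Merge ranked lists from derived queries into one final ranking.

    derived: list of (query_weight, [doc ids in rank order]).
    A document earns weight / position from every list it appears in,
    so higher-ranked derived queries and top-ranked documents count more.
    """
    scores = defaultdict(float)
    for weight, docs in derived:
        for position, doc in enumerate(docs, start=1):
            scores[doc] += weight / position
    # Sort by descending score, breaking ties by document id for determinism.
    return sorted(scores, key=lambda doc: (-scores[doc], doc))[:top_k]

derived = [
    (1.0, ["a", "b", "c"]),  # top-ranked derived query
    (0.8, ["b", "d"]),       # lower-ranked derived query
]
merged = merge_results(derived)
print(merged)
```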

Category based, extensible and interactive system for document retrieval

In information retrieval (IR) systems with high-speed access, especially search engines applied to the Internet and/or corporate intranet domains for retrieving accessible documents, automatic text categorization techniques are used to support the presentation of search query results within high-speed network environments. An integrated, automatic and open information retrieval system (100) comprises a hybrid method based on linguistic and mathematical approaches for automatic text categorization. It solves the problems of conventional systems by combining an automatic content recognition technique with a self-learning hierarchical scheme of indexed categories. In response to a word submitted by a requester, said system (100) retrieves documents containing that word, analyzes the documents to determine their word-pair patterns, matches the document patterns to database patterns that are related to topics, and thereby assigns topics to each document. If the retrieved documents are assigned to more than one topic, a list of the document topics is presented to the requester, and the requester designates the relevant topics. The requester is then granted access only to documents assigned to relevant topics. A knowledge database (1408) linking search terms to documents and documents to topics is established and maintained to speed future searches. Additionally, new strategies are presented to deal with different update frequencies of changed Web sites.
Owner:COGISUM INTERMEDIA

Image-based document indexing and retrieval

A system that facilitates document retrieval and/or indexing is provided. A component receives an image of a document, and a search component searches data store(s) for a match to the document image. The match is performed over word-level topological properties of images of documents stored in the data store(s).
Owner:ZHIGU HLDG

Speech translation apparatus and computer program product

A translation direction specifying unit specifies a first language and a second language. A speech recognizing unit recognizes a speech signal of the first language and outputs a first language character string. A first translating unit translates the first language character string into a second language character string that will be displayed on a display device. A keyword extracting unit extracts a keyword for document retrieval from the first language character string or the second language character string, with which a document retrieving unit performs document retrieval. A second translating unit translates a retrieved document into the other language of the pair, which will be displayed on the display device.
Owner:KK TOSHIBA

Document retrieval system with access control

An electronic document retrieval system and method for a collection of information distributed over a network having documents stored in web or document servers, in which an access control list relates user identification to documents to which a user has access. No access control lists are contained in the documents themselves, nor are comparisons made between lists of users, with their access levels, and the classifications of documents. Rather, by the use of URLs or pointers, it is possible to associate every document to which a user has access with the user identification number or code. URLs have a hierarchical format which allows partial URLs to indicate levels of access. HTTP, FTP and CGI protocols employ URL calls for documents and can use the access control method and system of the present invention. When a search query is applied to a query server, a list of hits is returned, together with pertinent URLs. The query server consults each access control list associated with each document server, to present to the user only those URLs for which he has a proper access level. Other URLs for which the user does not have proper access are kept hidden from the user.
Owner:GOOGLE LLC
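
The idea above of using partial URLs to express access levels can be sketched as a prefix check applied to the hit list. This is a minimal sketch under stated assumptions: the access control list is modeled as a plain mapping from a user id to URL prefixes, the `example.com` URLs and the user id `u42` are made up, and `visible_hits` is a hypothetical name.

```python
def visible_hits(user_id, hits, acl):
    """Return only the hit URLs that fall under a prefix granted to the user.

    acl maps a user id to a list of partial URLs. Each prefix ends with "/"
    so that ".../hr/" does not accidentally match ".../hr-archive/".
    """
    prefixes = acl.get(user_id, [])
    return [url for url in hits if any(url.startswith(p) for p in prefixes)]

# Hypothetical access control list kept by the query server.
acl = {
    "u42": [
        "http://docs.example.com/eng/",
        "http://docs.example.com/hr/public/",
    ],
}

hits = [
    "http://docs.example.com/eng/design.html",
    "http://docs.example.com/hr/salaries.html",
    "http://docs.example.com/hr/public/holidays.html",
]

shown = visible_hits("u42", hits, acl)
print(shown)
```

Hits outside every granted prefix are simply omitted, which matches the abstract's statement that inaccessible URLs are kept hidden from the user.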

Document retrieval using internal dictionary-hierarchies to adjust per-subject match results

Techniques for managing big data include retrieval using per-subject dictionaries having multiple levels of sub-classification hierarchy within the subject. Entries may include subject-determining-power (SDP) scores that provide an indication of the descriptive power of the entry term with respect to the subject of the dictionary containing the term. The same term may have entries in multiple dictionaries with different SDP scores in each of the dictionaries. A retrieval request for one or more documents containing search terms descriptive of the one or more documents can be processed by identifying a set of candidate documents tagged with subjects, i.e., identifiers of per-subject dictionaries having entries corresponding to a search term, then using affinity values to adjust the aggregate score for the terms in the dictionaries. Documents are then selected for best match to the subject based on the adjusted scores. Alternatively, the adjustment may be performed after selecting the documents by re-ordering them according to adjusted scores.
Owner:IBM CORP
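
A rough sketch of the per-subject scoring described above: each subject dictionary stores a subject-determining-power (SDP) score per term, and a document's aggregate score for the query terms is adjusted by affinity values. The abstract does not say what the affinity values relate; this sketch assumes an affinity between each document and each of its subject tags. All names and numbers are hypothetical.

```python
# Per-subject dictionaries: term -> SDP score. The same term ("bank")
# appears in two dictionaries with different scores.
dictionaries = {
    "finance": {"bank": 0.9, "interest": 0.6},
    "geography": {"bank": 0.3, "river": 0.8},
}

# Documents tagged with subjects; the value is an assumed affinity
# between the document and that subject.
documents = {
    "doc1": {"finance": 1.0},
    "doc2": {"geography": 1.0, "finance": 0.2},
}

def score(doc_tags, terms):
    """Sum the SDP scores of the query terms per subject, scaled by affinity."""
    total = 0.0
    for subject, affinity in doc_tags.items():
        entries = dictionaries.get(subject, {})
        aggregate = sum(entries.get(term, 0.0) for term in terms)
        total += affinity * aggregate
    return total

def retrieve(terms):
    """Rank all documents by adjusted score, best match first."""
    return sorted(documents, key=lambda d: score(documents[d], terms), reverse=True)

money_ranking = retrieve(["bank", "interest"])
river_ranking = retrieve(["river", "bank"])
print(money_ranking, river_ranking)
```

The ambiguous term "bank" pulls toward different documents depending on which other terms accompany it, which is the effect the per-dictionary SDP scores are meant to produce.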

Document conversion system including data monitoring means that adds tag information to hyperlink information and translates a document when such tag information is included in a document retrieval request

Data monitoring means adds tag information representing a document conversion method to hyperlink information included in a hypertext type document sent to a user-side device from a server-side application, and sends to the user-side application the hyperlink information to which the tag information is added. When the tag information representing the document conversion method is included in a command to request retrieval which is sent from the user-side application, the data monitoring means converts the document sent from the server-side application as the results of the retrieval in accordance with the document conversion method represented by the tag information and sends the converted document to the user-side application.
Owner:OATH INC +1

Document retrieval method and system and computer readable storage medium

A document retrieval system is provided whose document display interface makes the important portions easy to recognize, even when the displayed document was retrieved using a query expression given as a document or a long sentence. When a text is registered, predetermined character strings and location information which are extracted from the text are stored in a location information file. A weight of each character string is calculated by a predetermined method and is stored in a weight file. In retrieving a document, predetermined character strings are extracted from a designated query expression. A similarity is calculated between the query expression and texts in the database by using the location information and the weights acquired from the location information file and the weight file. In displaying the document, character strings having high weights are extracted from the character strings used for the retrieval. Then, the display format of a portion which contains the extracted character strings is changed to display the text.
Owner:HITACHI LTD
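
The weighting and display steps above can be illustrated with a small sketch. It covers only part of the abstract: whole words stand in for the "predetermined character strings", an IDF-style weight stands in for the unspecified "predetermined method", location information and the similarity calculation are omitted, and bracket markers stand in for a changed display format. The function names are hypothetical.

```python
import math
import re

def tokens(text):
    """Lowercased alphabetic words; a stand-in for extracted character strings."""
    return re.findall(r"[a-z]+", text.lower())

def build_weights(texts):
    """Assumed weighting: rarer strings across the registered texts weigh more."""
    n = len(texts)
    doc_freq = {}
    for text in texts.values():
        for tok in set(tokens(text)):
            doc_freq[tok] = doc_freq.get(tok, 0) + 1
    return {tok: math.log(n / count) + 1.0 for tok, count in doc_freq.items()}

def highlight(text, query, weights, top_n=1):
    """Mark the highest-weighted query strings where they occur in the text."""
    candidates = [t for t in set(tokens(query)) if t in weights]
    important = set(sorted(candidates, key=lambda t: (-weights[t], t))[:top_n])

    def mark(match):
        word = match.group(0)
        return f"[{word}]" if word.lower() in important else word

    return re.sub(r"[A-Za-z]+", mark, text)

texts = {
    "t1": "The pump controls coolant flow",
    "t2": "The valve controls steam flow",
    "t3": "The pump housing is steel",
}
weights = build_weights(texts)
shown = highlight(texts["t1"], "coolant pump flow", weights)
print(shown)
```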

Extended functionality for an inverse inference engine based web search

An extension of an inverse inference search engine is disclosed which provides cross-language document retrieval, in which the information matrix used as input to the inverse inference engine is organized into rows of blocks corresponding to languages within a predetermined set of natural languages. The information matrix is further organized into two column-wise partitions. The first partition consists of blocks of entries representing fully translated documents, while the second partition is a matrix of blocks of entries representing documents for which translations are not available in all of the predetermined languages. Further, in the second partition, entries in blocks outside the main diagonal of blocks are zero. Another disclosed extension to the inverse inference document retrieval system supports automatic, knowledge-based training. This approach applies the idea of using a training set to the problem of searching databases where information is diluted or not reliable enough to allow the creation of robust semantic links. To address this situation, the disclosed system loads the left-hand partition of the input matrix for the inverse inference engine with information from reliable sources.
Owner:FIVER LLC

System and method for information retrieval employing a preloading procedure

A document retrieval system having improved response time. During the time the user spends viewing the displayed information, other information that the user is likely to read or study later is preloaded into memory. If the user later requests the preloaded information, it can be written to the display very quickly. As a result, the user's request to view new information can be serviced quickly.
Owner:GOOGLE LLC
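
The preloading procedure above can be sketched as a small cache filled with the documents a user is likely to open next. This is a simplified sketch: a real system would preload in the background while the user reads, whereas here preloading runs synchronously inside `view`, and the prediction rule (follow the current page's links) is an assumption. `PreloadingViewer` and its attributes are hypothetical names.

```python
class PreloadingViewer:
    """Serves documents, preloading likely next documents into memory."""

    def __init__(self, fetch, predict_next):
        self.fetch = fetch                # slow retrieval function
        self.predict_next = predict_next  # doc id -> ids likely to be read next
        self.cache = {}                   # preloaded documents
        self.slow_fetches = 0             # requests not served from the cache

    def view(self, doc_id):
        if doc_id in self.cache:
            content = self.cache.pop(doc_id)  # fast path: already in memory
        else:
            content = self.fetch(doc_id)      # slow path
            self.slow_fetches += 1
        # Preload what the user will probably ask for next.
        for next_id in self.predict_next(doc_id):
            if next_id not in self.cache:
                self.cache[next_id] = self.fetch(next_id)
        return content

store = {"p1": "page one", "p2": "page two", "p3": "page three"}
links = {"p1": ["p2"], "p2": ["p3"], "p3": []}

viewer = PreloadingViewer(store.__getitem__, lambda d: links.get(d, []))
first = viewer.view("p1")   # slow fetch, then p2 is preloaded
second = viewer.view("p2")  # served from the cache
print(first, second, viewer.slow_fetches)
```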

Methods of storing and retrieving information, and methods of document retrieval

In one aspect, the invention encompasses a method of information storage and retrieval. A first communication is stored as data in a database with an identifier code. At least a portion of the data corresponding to the first communication is sent to a printer which prints a portion of the first communication together with the identifier code on a substrate. The first communication printed on the substrate is changed to form a second communication which is different from the first communication. The second communication is scanned with a scanning machine which digitizes the second communication and also digitizes the identifier code that had been printed on the substrate. Information is extracted from the digitized identifier code with a processor. The processor is in data communication with the database and is configured to utilize the extracted information to retrieve the first communication from the database. The digitized second communication is compared with the data of the first communication to identify differences between the second communication and the first communication.
Owner:HEWLETT PACKARD DEV CO LP

Question-answering system and question-answering processing method

A question sentence input part of question-answering system inputs a question sentence presented in a natural language. A document retrieval part of the system extracts a keyword from the question sentence and retrieves and extracts the document data including the keyword from a document database. An answer candidate extracting part of the system extracts a language presentation possibly becoming the answer as an answer candidate from the retrieved and extracted document data. An answer type determination part of the system determines an answer type of the answer candidate. An answer table output part of the system classifies the answer candidates by answer type and outputs an answer table listing all or part of the answer candidates having a predetermined evaluation or greater for each answer type in a table format.
Owner:NAT INST OF INFORMATION & COMM TECH

Method and system for selecting documents by measuring document quality

A system and method for document filtering and selection based on quality automatically operates to make value judgments for document retrieval. Items of data, e.g. documents, are automatically associated with a value. Items of data may then be selected based upon value, so that the selected items not only match the specific subject or topic requested but are also desirable according to certain criteria, including each document's quality. A specific application of the invention is to a filter for computerized bulletin boards. Many of these systems, also known as discussion groups, have thousands of new messages per day. Readers and human editors do not have time to classify new messages by quality quickly. Messages may be ranked by quality automatically, to perform the same function performed by a human editor or moderator. Values and qualities may be assigned by interestingness, appropriateness, timeliness, humor, style of language, obscenity, sentiment, and any combinations thereof, for example.
Owner:RGT UNIV OF CALIFORNIA

Category based, extensible and interactive system for document retrieval

An integrated, automatic and open information retrieval system comprises a hybrid method based on linguistic and mathematical approaches for automatic text categorization. It solves the problems of conventional systems by combining an automatic content recognition technique with a self-learning hierarchical scheme of indexed categories. In response to a word submitted by a requestor, said system retrieves documents containing that word, analyzes the documents to determine their word-pair patterns, matches the document patterns to database patterns that are related to topics, and thereby assigns topics to each document. If the retrieved documents are assigned to more than one topic, a list of the document topics is presented to the requestor, and the requestor designates the relevant topics. The requestor is then granted access only to documents assigned to relevant topics. A knowledge database linking search terms to documents and documents to topics is established and maintained to speed future searches. Additionally, new strategies are presented to deal with different update frequencies of changed Web sites.
Owner:COGISUM INTERMEDIA

Multilayer index voice document searching method and system thereof

The invention discloses a multilayer indexing voice document retrieval method and a system thereof, and belongs to the technical field of information retrieval. The multilayer indexing voice document retrieval method comprises the following steps: (1) feature extraction of a multimedia stream is implemented, thus obtaining a voice feature sequence; (2) a voice identifying decoder is used for searching the voice feature sequences, thus obtaining a word lattice and an optimal identification result; (3) according to the word lattice and the optimal identification result, a word and syllable double-layer indexing database is constructed; and (4) relevant documents of a given query term are searched in the indexing database and returned to users. The multilayer indexing voice document retrieval system comprises an automatic voice identifying module that is used for automatically identifying characters in voice documents; an automatic voice document index constructing module that is used for constructing double indexes of the voice identification result, and a voice document retrieval module that is used for searching the relevant documents of given query terms in the indexing database and returning the documents to users. Compared with the prior art, the multilayer indexing voice document retrieval method and the system can realize quick and accurate searching of multimedia data.
Owner:PEKING UNIV

Method and apparatus using discriminative training in natural language call routing and document retrieval

A method and apparatus for performing discriminative training of, for example, call routing training data (or, alternatively, other classification training data) which improves the subsequent classification of a user's natural language based requests. An initial scoring matrix is generated based on the training data and then the scoring matrix is adjusted so as to improve the discrimination between competing classes (e.g., destinations). In accordance with one illustrative embodiment of the present invention a Generalized Probabilistic Descent (GPD) algorithm may be advantageously employed to provide the improved discrimination. More specifically, the present invention provides a method and apparatus comprising steps or means for generating an initial scoring matrix comprising a numerical value for each of a set of n classes in association with each of a set of m features, the initial scoring matrix based on a set of training data and, for each element of said set of training data, based on a subset of said features which are comprised in the natural language text of said element of said set of training data and on one of said classes which has been identified therefor; and based on the initial scoring matrix and the set of training data, generating a discriminatively trained scoring matrix for use by said classification system by adjusting one or more of said numerical values such that a greater degree of discrimination exists between competing ones of said classes when said classification requests are performed, thereby resulting in a reduced classification error rate.
Owner:LUCENT TECH INC

Differential LSI space-based probabilistic document classifier

A computerized method for automatic document classification based on a combined use of the projection and the distance of the differential document vectors to the differential latent semantics index (DLSI) spaces. The method includes the setting up of a DLSI space-based classifier to be stored in computer storage and the use of such classifier by a computer to evaluate the possibility of a document belonging to a given cluster using a posteriori probability function and to classify the document in the cluster. The classifier is effective in operating on very large numbers of documents such as with document retrieval systems over a distributed computer network.
Owner:SUNFLARE CO LTD

Database and index organization for enhanced document retrieval

A customized, specialty-oriented database and index of a subject matter area and methods for constructing and using such a database are provided. Selection and indexing of articles is done by experts in the topic with which the database is concerned. As a result, articles are indexed in a manner that allows facile, rapid retrieval of highly relevant articles with few or no false positives with much reduced database maintenance cost through frugal limitation of number of documents in the database, number of terms in a Master Index, and number of codes assigned to each document. A thesaurus allows indexing and search in accordance with terminology familiar to different anticipated groups of users (e.g. doctors, patients, nurses, technicians, and the like). Key articles collections and rapid access to documents therein are also provided.
Owner:NELSON INFORMATION SYST

Document management techniques to account for user-specific patterns in document metadata

Document management techniques to account for user-specific patterns in document metadata are disclosed. In one embodiment, a method for facilitating document retrieval may comprise: assigning a first entitlement to a first user for accessing a first plurality of documents; identifying patterns in the first user's creation or modification of metadata related to the first plurality of documents; recording the identified patterns associated with the first user; receiving a document query from a second user who has been assigned a second entitlement to access a second plurality of documents; determining, based on the second entitlement, an access right of the second user with respect to the first plurality of documents; and modifying the document query based on the access right of the second user and the identified patterns, such that the document query returns relevant documents from the first plurality of documents despite the second user's ignorance of the identified patterns.
Owner:JPMORGAN CHASE BANK NA

Apparatus and method for document retrieval

Document retrieval system and method are disclosed which can diminish a gap between the user's retrieval intention in information retrieval and the configuration of a query as well as document representations in database and which permits easy retrieval reflecting the user's retrieval intention. The user enumerates a group of words which the user hits upon, as a primary query. Upon receipt of the primary query, the system estimates relational representations which the words (group) of the primary query can possess, and then makes expansion of the query through a partial coincidence of the relational representations and sample spaces extracted from document data to prepare a query candidate representation group. The expanded query candidate representation group is presented to the user. The user then simply chooses a relational representation candidate in accordance with his or her intention. A retrieval execution query is constituted by the thus-selected representation.
Owner:FUJIFILM BUSINESS INNOVATION CORP

Graph-based judgment document case similarity calculation and retrieval method and system

The invention discloses a graph-based judgment document case similarity calculation and retrieval method. The method includes the following main steps: 1, judgment documents are collected; 2, the reasoning parts of the judgment documents are identified; 3, the case elements of the reasoning parts are analyzed; 4, case reason atlases are generated; 5, a client receives the retrieval information; 6, the case elements are extracted or mapped; 7, the reason atlases of the case elements are matched and calculated; 8, cases similar to the retrieval content are returned. The invention further discloses a graph-based judgment document case similarity calculation and retrieval system, which includes a judgment document case similarity calculation device and a similar case document retrieval device. The method and the system fully consider the professional knowledge of the judgment documents, and the generated case reason atlases show the most critical elements and internal logical relations of the cases in a compressed but intuitive form; therefore, the method and the system make it convenient for relevant persons to review the main points of the cases at a glance, and enable them to accurately retrieve the relevant cases from a document library.
Owner:江西思贤数据科技有限公司

Ciphertext cloud-storage oriented document retrieval method and system

The invention discloses a ciphertext cloud-storage oriented document retrieval method and a ciphertext cloud-storage oriented document retrieval system, belonging to the technical field of information security. The method comprises the following steps: 1) a client generates an index key for a user by a master key imported by the user, and encrypts the index key by the master key, then stores the index key into a server; 2) after receiving the attribute metainformation of a document to be inquired input by a certain user, the client acquires a ciphertext of the index key from the server, then decrypts the ciphertext so as to obtain a decrypted index key; 3) the client encrypts the attribute metainformation of the document to be inquired by the decrypted index key, then sends the encrypted attribute metainformation to the server; and 4) the server carries out ciphertext retrieval on the index table according to the attribute metainformation, returning the retrieval records meeting the conditions to the client so as to obtain documents corresponding to the attribute metainformation. The system comprises a server and a plurality of clients, and the clients are respectively connected with the server through the Internet. The method and the system disclosed in the invention have the advantages that the security and retrieval efficiency of the ciphertext retrieval system are improved, and the expansibility is high.
Owner:INST OF SOFTWARE - CHINESE ACAD OF SCI
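
Steps 3 and 4 of the method above (the client protects the attribute metainformation with the index key, and the server matches it against an index table without seeing the plaintext) can be sketched as follows. This is an illustration with a swapped-in technique: a keyed HMAC-SHA256 token replaces the unspecified encryption scheme, the index key is assumed to have already been fetched and decrypted by the client (steps 1 and 2 are omitted), and `token`, `Server`, and the sample attributes are hypothetical.

```python
import hashlib
import hmac

def token(index_key: bytes, attribute: str) -> str:
    """Deterministic keyed token for an attribute; a stand-in for encryption."""
    return hmac.new(index_key, attribute.encode("utf-8"), hashlib.sha256).hexdigest()

class Server:
    """Stores and matches only tokens, never plaintext attributes."""

    def __init__(self):
        self.index_table = {}  # token -> list of document ids

    def add(self, tok, doc_id):
        self.index_table.setdefault(tok, []).append(doc_id)

    def search(self, tok):
        return list(self.index_table.get(tok, []))

# Demo key only; in the abstract the index key is derived from the user's
# master key and stored on the server in encrypted form.
index_key = b"demo-index-key"

server = Server()
server.add(token(index_key, "author=alice"), "doc-17")
server.add(token(index_key, "author=bob"), "doc-18")

found = server.search(token(index_key, "author=alice"))
missed = server.search(token(b"another-key", "author=alice"))
print(found, missed)
```

Because the token is deterministic under a given key, equal attributes produce equal tokens, which is what lets the server do an exact-match lookup; a query built with a different key finds nothing.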

Recommendatory information provision system

A recommendatory information provision system includes: a management apparatus; and a user terminal device, wherein: the management apparatus includes: a browsed information acquisition unit for acquiring information of a browsed document; a recommended document retrieval unit for retrieving a recommended document relevant to the browsed document; a recommendation history management unit for storing and managing history information which concerns recommendation of the recommended document; a recommendation history retrieval unit for retrieving from the recommendation history management unit, the history information of the recommended document which has been retrieved by the recommended document retrieval unit; and a recommendatory information provision unit for providing to the user terminal device, the information concerning the recommended document and the history information of the recommended document as have been retrieved; and the user terminal device includes: a display process unit for displaying on a screen, the information.
Owner:FUJIFILM BUSINESS INNOVATION CORP