3994 results about "Data ingestion" patented technology

Data ingestion is the process of obtaining and importing data for immediate use or storage in a database. To ingest something is to "take something in or absorb something." Data can be streamed in real time or ingested in batches. When data is ingested in real time, each data item is imported as it is emitted by the source.
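
The difference between real-time and batch ingestion described above can be illustrated with a minimal Python sketch. The function names, the in-memory list standing in for a database, and the fixed batch size are all assumptions made for illustration; they do not come from any of the listed patents.

```python
from typing import Iterable, Iterator, List


def ingest_streaming(source: Iterable[dict], store: List[dict]) -> None:
    """Real-time style: import each item as soon as the source emits it."""
    for item in source:
        store.append(item)


def batches(source: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Group items from the source into batches of at most batch_size."""
    batch: List[dict] = []
    for item in source:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final, possibly smaller, batch
        yield batch


def ingest_batched(source: Iterable[dict], store: List[dict], batch_size: int) -> int:
    """Batch style: import items in groups; returns the number of writes."""
    writes = 0
    for batch in batches(source, batch_size):
        store.extend(batch)
        writes += 1
    return writes
```

Both functions end with the same stored data; the batched variant simply performs fewer, larger writes, which is the usual trade-off against latency.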

Information Infrastructure Management Tools with Extractor, Secure Storage, Content Analysis and Classification and Method Therefor

The present invention is a method of organizing and processing data in a distributed computing system. The invention is also implemented as a computer program on a computer medium and as a distributed computer system. Software modules can be configured as hardware. The method and system organizes select content which is important to an enterprise operating said distributed computing system. The select content is represented by one or more predetermined words, characters, images, data elements or data objects. The computing system has a plurality of select content data stores for respective ones of a plurality of enterprise designated categorical filters which include content-based filters, contextual filters and taxonomic classification filters, all operatively coupled over a communications network. A data input is processed through at least one activated categorical filter to obtain select content, and contextually associated select content and taxonomically associated select content as aggregated select content. The aggregated select content is stored in the corresponding select content data store. A data process from the group of data processes including a copy process, a data extract process, a data archive process, a data distribution process and a data destruction process is associated with the activated categorical filter and the method and system applies the associated data process to a further data input based upon a result of that further data being processed by the activated categorical filter utilizing the aggregated select content data.
Owner:DIGITAL DOORS

Document management system with enhanced intelligent document recognition capabilities

Inactive | US20050289182A1 | Enhances document management quality | Improve efficiency | Character and pattern recognition | Office automation | XML | Data extraction
An intelligent document recognition-based document management system includes modules for image capture, image enhancement, image identification, optical character recognition, data extraction and quality assurance. The system captures data from electronic documents as diverse as facsimile images, scanned images and images from document management systems. It processes these images and presents the data in, for example, a standard XML format. The document management system processes both structured document images (ones which have a standard format) and unstructured document images (ones which do not have a standard format). The system can extract images directly from a facsimile machine, a scanner or a document management system for processing.
Owner:SAND HILL SYST

Device, method and program for detecting unauthorized access

An unauthorized access detection device capable of detecting unauthorized accesses which are made through preparation, in real time. When a packet travels on a network, a key data extractor obtains the packet and obtains key data. Next an ongoing scenario detector searches an ongoing scenario storage unit for an ongoing scenario with the key data as search keys. A check unit determines whether the execution of the process indicated by the packet after the ongoing scenario detected by the ongoing scenario detector follows an unauthorized access scenario being stored in an unauthorized access scenario storage unit. Then a report output unit outputs an unauthorized access report depending on the check result of the check unit.
Owner:FUJITSU LTD
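
The flow in the abstract above (extract key data from a packet, look up an ongoing scenario by that key, check whether the packet continues a known unauthorized access scenario, then report) might be sketched roughly as follows. This is only an illustration under stated assumptions: the scenario contents, the packet fields `src`, `dst` and `event`, and the `Detector` class are hypothetical and are not taken from the patent.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical unauthorized access scenario: an ordered list of event names.
SCENARIOS: Dict[str, List[str]] = {
    "probe-then-exploit": ["port_scan", "login_failure", "admin_login"],
}

Key = Tuple[str, str]  # key data: (source address, destination address)


class Detector:
    def __init__(self, scenarios: Dict[str, List[str]]) -> None:
        self.scenarios = scenarios
        # Ongoing scenario storage: key data -> {scenario name: steps matched}
        self.ongoing: Dict[Key, Dict[str, int]] = {}

    def observe(self, packet: dict) -> Optional[str]:
        """Return a report string when a scenario completes, else None."""
        key = (packet["src"], packet["dst"])          # key data extraction
        progress = self.ongoing.setdefault(key, {})   # ongoing scenario lookup
        for name, steps in self.scenarios.items():
            matched = progress.get(name, 0)
            if packet["event"] == steps[matched]:     # packet continues the scenario
                matched += 1
                if matched == len(steps):
                    progress.pop(name, None)
                    return f"unauthorized access: {name} from {key[0]} to {key[1]}"
                progress[name] = matched
        return None
```

In this sketch, unrelated events leave the recorded progress untouched; whether progress should expire or reset is a design decision the abstract does not specify.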

Estimating Social Interest in Time-based Media

Social media content items are mapped to relevant time-based media events. These mappings may be used as the basis for multiple applications, such as ranking of search results for time-based media, automatic recommendations for time-based media, prediction of audience interest for media purchasing / planning, and estimating social interest in the time-based media. Social interest in time-based media (e.g., video and audio streams and recordings) segments is estimated through a process of data ingestion and integration. The estimation process determines social interest in specific events represented as segments in time-based media, such as particular plays in a sporting event, scenes in a television show, or advertisements in an advertising block. The resulting estimates of social interest also can be graphically displayed.
Owner:BLUEFIN LABS

Method and system for extracting and classifying geolocation information utilizing electronic social media

Methods, systems and processor-readable media for extracting and classifying location information utilizing social media messages and / or data thereof. The social media messages can be sampled from a social media database and the messages filtered based on a heuristic rule. A geolocation entity from the unstructured social media messages can be extracted utilizing a geolocation entity extracting module. The messages with the geoentities can be uploaded onto a crowd sourcing platform to manually annotate the messages with a label. A text classification model can be built and learned from the label utilizing a machine learning algorithm and the messages can be classified by a location classifier in order to extract the user location. The user location can then be transformed into a geocode so that a spatial search can be enabled and the distance between the locations can be easily calculated.
Owner:XEROX CORP

Systems and methods for facilitating data discovery

A system for facilitating data discovery on a network, wherein the network has one or more data storage devices. The system may include a crawler program configured to select at least a first set of files and a second set of files, each of the first set of files and the second set of files being stored in at least one of the one or more data storage devices. The system may also include a data fetcher program configured to obtain a copy of the first set of files, the data fetcher program being further configured to resist against obtaining a copy of the second set of files. The system may also include circuit hardware implementing one or more functions of one or more of the crawler program and the data fetcher program.
Owner:EMC IP HLDG CO LLC

System and method for assessing TV-related information over the internet

The system retrieves information from the internet using multiple search engines that are simultaneously launched by the search engine commander. The commander is responsive to a speech-enabled system including a speech recognizer and natural language parser. The user speaks to the system in natural language requests, and the parser extracts the semantic content from the user's speech, based on a set of goal oriented grammars. The preferred system includes a fixed grammar and an updatable or downloaded grammar, allowing the system to be used without extensive training and yet capable of being customized for a particular user's purposes. Results obtained from the search engines are filtered based on information extracted from an electronic program guide and from prestored user profile data. The results may be displayed on screen or through synthesized speech.
Owner:INTERTRUST TECH CORP

Business methods and systems for providing healthcare management and decision support services using structured clinical information extracted from healthcare provider data

The business methods and systems provided use knowledge-based expert systems for mining (extracting) highly structured clinical information from various structured and unstructured sources of healthcare provider data. In one business method for on-line healthcare management and decision support, a service provider maintains a collection of structured clinical data, the structured clinical data comprising information automatically mined from various structured and unstructured sources of healthcare provider data from one or more different healthcare providers. The service provider provides a customer on-line access to structured clinical data in the collection, or provides an on-line service to the customer using structured clinical data in the collection, based on a service agreement between the customer and the service provider.
Owner:SIEMENS MEDICAL SOLUTIONS USA INC

Displaying Estimated Social Interest in Time-based Media

Social media content items are mapped to relevant time-based media events. These mappings may be used as the basis for multiple applications, such as ranking of search results for time-based media, automatic recommendations for time-based media, prediction of audience interest for media purchasing / planning, and estimating social interest in the time-based media. Social interest in time-based media (e.g., video and audio streams and recordings) segments is estimated through a process of data ingestion and integration. The estimation process determines social interest in specific events represented as segments in time-based media, such as particular plays in a sporting event, scenes in a television show, or advertisements in an advertising block. The resulting estimates of social interest also can be graphically displayed.
Owner:BLUEFIN LABS

Method and system for preventing data leakage from a computer facility

In embodiments of the present invention improved capabilities are described for the steps of identifying, through a monitoring module of a security software component, a data extraction behavior of a software application attempting to extract data from an endpoint computing facility; and in response to a finding that the data extraction behavior is related to extracting sensitive information and that the behavior is a suspicious behavior, causing the endpoint to perform a remedial action. The security software component may be a computer security software program, a sensitive information compliance software program, and the like.
Owner:SOPHOS

Method and system for secure cashless gaming

A secure cashless gaming system comprises a plurality of gaming devices which may or may not be connected to a central host network. Each gaming device includes an intelligent data device reader which is uniquely associated with a security module interposed between the intelligent data device reader and the gaming device processor. A portable data device bearing credits is used to allow players to play the various gaming devices. When a portable data device is presented to the gaming device, it is authenticated before a gaming session is allowed to begin. The intelligent data device reader in each gaming device monitors gaming transactions and stores the results for later readout in a secure format by a portable data extraction unit, or else for transfer to a central host network. Gaming transaction data may be aggregated by the portable data extraction unit from a number of different gaming devices, and may be transferred to a central accounting and processing system for tracking the number of remaining gaming credits for each portable data unit and / or player. Individual player habits can be monitored and tracked using the aggregated data. The intelligent data device reader may be programmed to automatically transfer gaming credits from a portable data device to the gaming device, and continually refresh the credits each time they drop below a certain minimum level, thus alleviating the need for the player to manually enter an amount of gaming credits to transfer to the gaming device.
Owner:SMART CARD INTEGRATORS

Transcription data extraction

A computer program product, for performing data determination from medical record transcriptions, resides on a computer-readable medium and includes computer-readable instructions for causing a computer to obtain a medical transcription of a dictation, the dictation being from medical personnel and concerning a patient, analyze the transcription for an indicating phrase associated with a type of data desired to be determined from the transcription, the type of desired data being relevant to medical records, determine whether data indicated by text disposed proximately to the indicating phrase is of the desired type, and store an indication of the data if the data is of the desired type.
Owner:DELIVERHEALTH SOLUTIONS LLC +1
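
The steps in the abstract above (find an indicating phrase, then check whether the text close to it is of the desired data type) can be approximated with regular expressions. The sketch below is an assumption-laden illustration: the phrase table `INDICATORS`, the 40-character `WINDOW`, and the patterns are hypothetical and say nothing about how the patented product works.

```python
import re
from typing import List, Tuple

# Hypothetical table: indicating phrase -> (data type, pattern the nearby text must match)
INDICATORS = {
    "blood pressure": ("blood_pressure", re.compile(r"\b(\d{2,3})\s*/\s*(\d{2,3})\b")),
    "temperature": ("temperature", re.compile(r"\b(\d{2,3}(?:\.\d)?)\b")),
}
WINDOW = 40  # number of characters after the phrase treated as "proximate"


def extract(transcription: str) -> List[Tuple[str, str]]:
    """Return (data type, value) pairs found near indicating phrases."""
    found = []
    lowered = transcription.lower()  # assumes text where lowercasing keeps offsets
    for phrase, (data_type, pattern) in INDICATORS.items():
        for hit in re.finditer(re.escape(phrase), lowered):
            nearby = transcription[hit.end(): hit.end() + WINDOW]
            match = pattern.search(nearby)
            if match:  # nearby text is of the desired type, so record it
                found.append((data_type, match.group(0)))
    return found
```

For example, `extract("Blood pressure is 120/80 today. Temperature 98.6 degrees.")` yields a blood pressure value and a temperature value, while a sentence such as "Blood pressure was not taken." yields nothing because no nearby text matches the expected type.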

Systems and methods for automatically reducing data search space and improving data extraction accuracy using known constraints in a layout of extracted data elements

Inactive | US20110258195A1 | Reducing data search space | Improving data extraction accuracy | Digital data processing details | Special data processing applications | Electronic document | Data ingestion
A method of automatically narrowing data search space and improving accuracy of data extraction using known constraints in a layout of extracted data elements for classified documents is provided. The method includes: analyzing each document to classify it within a document category, each category having a corresponding set of expected layouts; analyzing each electronic document to automatically extract images and text features; automatically constructing a data structure including a layout of the extracted features and layout relationships amongst the extracted features, wherein each of the extracted features in the layout maintains a reference to neighboring features and wherein closely related features are merged to form a combined feature; automatically narrowing data search space by detecting and removing parts of the layout that are not associated with any data elements using the data structure; and automatically detecting data using the extracted feature layout and the layout relationships amongst the extracted features.
Owner:GRUNTWORX

Scalable data extraction techniques for transforming electronic documents into queriable archives

A method for extracting an attribute occurrence from a template-generated semi-structured document comprising multi-attribute data records comprises identifying a first set of attribute occurrences in the template-generated semi-structured document using an ontology. The method further comprises determining a boundary of each multi-attribute data record in the template-generated semi-structured document, learning a pattern for an attribute corresponding to an identified attribute occurrence of the first set in the template-generated semi-structured document, and applying the pattern within the boundary of each multi-attribute data record in the template-generated semi-structured document to extract a second set of attribute occurrences.
Owner:THE RES FOUND OF STATE UNIV OF NEW YORK

System and method for data extraction and management in multi-relational ontology creation

Inactive | US20060053174A1 | Easy to control | Efficient and precise derivation and loading of relevant information | Metadata text retrieval | Computer security arrangements | Data ingestion | Knowledge Field
The invention relates to a system and method for data extraction and management in multi-relational ontology creation. The system of the invention includes selecting a corpus of documents containing information relevant to a targeted knowledge domain, extracting assertions and their constituent concepts and relationships from the corpus, and storing the assertions, wherein the extraction processes may apply rules and utilize natural language processing.
Owner:BIOWISDOM

Displaying estimated social interest in time-based media

Social media content items are mapped to relevant time-based media events. These mappings may be used as the basis for multiple applications, such as ranking of search results for time-based media, automatic recommendations for time-based media, prediction of audience interest for media purchasing / planning, and estimating social interest in the time-based media. Social interest in time-based media (e.g., video and audio streams and recordings) segments is estimated through a process of data ingestion and integration. The estimation process determines social interest in specific events represented as segments in time-based media, such as particular plays in a sporting event, scenes in a television show, or advertisements in an advertising block. The resulting estimates of social interest also can be graphically displayed.
Owner:BLUEFIN LABS

Prediction method and visualization method of traffic jam

The invention discloses a prediction method and a visualization method for traffic jams. The methods comprise the following steps: GPS data sent back by taxicabs is associated with roads in an electronic map using a map matching method; the road section speed calculated from the matched data is used to judge the traffic state of each road section; the evolution rule of traffic jams, including their formation and dissipation, is extracted from historical data; traffic jam prediction is performed using a sliding time window scheme after a real-time traffic information database is associated; and the congestion intensity of congested road sections is calculated from the prediction result, so that the congestion intensity and influence range can be presented as a heat map. The prediction method provided by the invention can realize highly accurate traffic jam prediction, and the heat map visualization makes traffic jams easy to see and understand, so that their location and influence range can be discerned conveniently.
Owner:SUN YAT SEN UNIV +1

Methods and systems for monitoring transaction entity versions for policy compliance

A system for determining lack of compliance of a transactional entity with an enterprise policy by maintaining an historical record of the entity as changes are made over time. The system allows establishment, codification, and maintenance of enterprise policies, monitors electronic transactions of the enterprise from various data sources, detects exceptions to established policies, reports exceptions to authorized users such as managers and auditors, and / or provides a case management system for tracking exceptions and their underlying transactions. A master data extractor establishes an initial instance of a transactional entity in a monitoring database. A changed data extractor is responsive to changed data for establishing a subsequent instance of the transactional entity in the monitoring database. A transaction analysis engine applies predetermined policy rules to data in the monitoring database to determine lack of compliance of the initial and subsequent instances of the transactional entity with enterprise policies.
Owner:OVERSIGHT SYST INC
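
As a rough illustration of the abstract above, the sketch below keeps every instance of an entity in a monitoring store and applies policy rules to each one. All names here (`MonitoringDatabase`, `load_master`, `apply_change`, `find_exceptions`, and the sample rule in the usage note) are hypothetical and chosen only to mirror the abstract's terms; the patent does not disclose this code.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Rule = Callable[[dict], bool]  # returns True when the instance complies


@dataclass
class MonitoringDatabase:
    # entity id -> list of instances, oldest first (the historical record)
    history: Dict[str, List[dict]] = field(default_factory=dict)

    def load_master(self, entity_id: str, data: dict) -> None:
        """Master data extractor: establish the initial instance."""
        self.history[entity_id] = [dict(data)]

    def apply_change(self, entity_id: str, changes: dict) -> None:
        """Changed data extractor: establish a subsequent instance."""
        latest = self.history[entity_id][-1]
        self.history[entity_id].append({**latest, **changes})


def find_exceptions(db: MonitoringDatabase, rules: Dict[str, Rule]) -> List[Tuple[str, int, str]]:
    """Transaction analysis: list (entity id, version, rule name) violations."""
    exceptions = []
    for entity_id, versions in db.history.items():
        for version, instance in enumerate(versions):
            for name, rule in rules.items():
                if not rule(instance):
                    exceptions.append((entity_id, version, name))
    return exceptions
```

A usage example under the same assumptions: load a vendor record with `payment_terms_days` of 30, apply a change setting it to 0, and check a rule requiring at least 15 days. Only the second instance is reported, and the first instance remains available for audit.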

Method for extracting network data and web crawler system

A web crawler system for extracting webpage data comprises two components. The first component provides a data extraction task to the second component and receives the execution result of that task from it. The second component communicates with the webpage server to obtain webpage data, operates on the DOM model to extract data, describes the extracted data, and sends the extracted data and its description to the first component.
Owner:李沫南

Machine learning of document templates for data extraction

Inactive | US7561734B1 | Speed up template development | Assures template quality | Natural language data processing | Special data processing applications | Data ingestion | Graphics
The present system can perform machine learning of prototypical descriptions of data elements for extraction from machine-readable documents. Document templates are created from sets of training documents that can be used to extract data from form documents, such as: fill-in forms used for taxes; flex-form documents having many variants, such as bills of lading or insurance notifications; and some context-form documents having a description or graphic indicator in proximity to a data element. In response to training documents, the system performs an inductive reasoning process to generalize a document template so that the location of data elements can be predicted for the training examples. The automatically generated document template can then be used to extract data elements from a wide variety of form documents.
Owner:LEIDOS

Method and system of data automatic conversion and storage

The invention discloses a system for automatic data conversion and storage. The system comprises a data extractor, a data converter, a data register and a data storage unit. The data extractor extracts original data from different data sources and transmits the original data to the data converter; the data converter converts the original data from the different data sources into specific data formats; the data register is connected with the data converter and organizes the results obtained from the data converter into uniform data structures; and the data storage unit stores the data from the data register in a target database. According to the method and system for automatic data conversion and storage, different system information configurations are open, system customization according to field needs is supported, the purpose of integrating heterogeneous data from the existing different business systems of enterprises is achieved, and data are integrated effectively.
Owner:INST OF AUTOMATION CHINESE ACAD OF SCI +1
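
The four stages named in the abstract above (extractor, converter, register, storage unit) map naturally onto a small pipeline. The following Python sketch is only an illustration under assumptions: the two source formats, the `FIELD_ALIASES` mapping, and the list standing in for the target database are all hypothetical.

```python
import csv
import io
import json
from typing import Dict, Iterator, List, Tuple

# Hypothetical mapping from uniform field names to the names used by each source.
FIELD_ALIASES = {"id": ("id", "customer_id"), "name": ("name", "full_name")}


def extract(sources: List[Tuple[str, str]]) -> Iterator[Tuple[str, str]]:
    """Data extractor: yield (format, raw text) from each data source."""
    for fmt, raw in sources:
        yield fmt, raw


def convert(fmt: str, raw: str) -> List[dict]:
    """Data converter: parse raw text in a known format into records."""
    if fmt == "json":
        return json.loads(raw)
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(raw)))
    raise ValueError(f"unsupported format: {fmt}")


def register(record: dict) -> Dict[str, str]:
    """Data register: organize a record into the uniform structure."""
    uniform = {}
    for target, aliases in FIELD_ALIASES.items():
        for alias in aliases:
            if alias in record:
                uniform[target] = str(record[alias])
                break
    return uniform


def run_pipeline(sources: List[Tuple[str, str]]) -> List[Dict[str, str]]:
    """Run all stages; the returned list stands in for the target database."""
    target_db: List[Dict[str, str]] = []
    for fmt, raw in extract(sources):
        for record in convert(fmt, raw):
            target_db.append(register(record))  # data storage unit
    return target_db
```

Keeping the field mapping in configuration rather than in code is one way to support the per-site customization the abstract mentions, although the abstract does not say how that customization is implemented.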

Remote data collection systems and methods using read only data extraction and dynamic data handling

Remote data collection systems and methods retrieve data including financial, sales, marketing, operational and the like data from a plurality of databases and database types remotely over a network in an automated, platform-agnostic manner. A remote data collection system includes a network interface, a connection to a data source, a processor communicatively coupled to the network interface and the connection, and memory storing instructions for remote data collection that, when executed, cause the processor to: receive a request to extract data from the data source; extract the data in a non-intrusive manner from the data source using a two phase process comprising a reconciliation phase and a collection phase; and transmit one of an entire set and a subset of the extracted data based on the request.
Owner:ZEEWISE

Data extraction and testing method and system

The present method and apparatus provides for automated testing of data integration and business intelligence projects using Extract, Load and Validate (ELV) architecture. The method and computer program product provides a testing framework that automates the querying, extraction and loading of test data into a test result database from a plurality of data sources and application interfaces using source specific adaptors. The test data available for extraction using the adaptors include metadata such as the database query generated by the OLAP Tools that are critical to validate the changes in business intelligence systems. A validation module helps define validation rules for verifying the test data loaded into the test result database. The validation module further provides a framework for comparing the test data with previously archived test data as well as benchmark test data.
Owner:YALAMANCHILLI NARENDAR

Portable data reading device with integrated web server for configuration and data extraction

A portable data reading device, such as a barcode scanner or RFID reader, includes a Web server and a first server-side application to modify one or more settings of the portable data reading device. The settings may include, for example, symbology settings, device settings, and network settings. The Web server may receive formatted data from a client browser representing a requested modification of at least one setting of the portable data reading device. Upon receiving the formatted data, the Web server may automatically invoke the first server-side application to modify the at least one setting responsive to the formatted data.
Owner:DATALOGIC MOBILE

Information infrastructure management tools with extractor, secure storage, content analysis and classification and method therefor

The present invention is a method of organizing and processing data in a distributed computing system. The invention is also implemented as a computer program on a computer medium and as a distributed computer system. Software modules can be configured as hardware. The method and system organizes select content which is important to an enterprise operating said distributed computing system. The select content is represented by one or more predetermined words, characters, images, data elements or data objects. The computing system has a plurality of select content data stores for respective ones of a plurality of enterprise designated categorical filters which include content-based filters, contextual filters and taxonomic classification filters, all operatively coupled over a communications network. A data input is processed through at least one activated categorical filter to obtain select content, and contextually associated select content and taxonomically associated select content as aggregated select content. The aggregated select content is stored in the corresponding select content data store. A data process from the group of data processes including a copy process, a data extract process, a data archive process, a data distribution process and a data destruction process is associated with the activated categorical filter and the method and system applies the associated data process to a further data input based upon a result of that further data being processed by the activated categorical filter utilizing the aggregated select content data.
Owner:DIGITAL DOORS

Apparatus, system and method incorporating virtualization for data storage

For long-term data preservation, a storage virtualization system contains a metadata extraction module, an indexing module, a search module, and a virtualization module. The system utilizes two types of virtual volumes: unmarked volumes and marked volumes. The metadata extraction module extracts metadata that describes the data stored in logical volumes located in external storage. The indexing module scans the data and creates an index, and the index and metadata are stored in a local storage. After metadata is extracted for all data in a volume, and all data in the volume are indexed, the virtual volume corresponding to that volume is marked and the volume is ready to be made inactive. The search module allows a user to search for desired data using the metadata and the index stored in the local storage instead of having to access the external storage systems where the data is actually stored.
Owner:HITACHI LTD