98 results about "Extract, transform, load" patented technology

In computing, extract, transform, load (ETL) is the general procedure of copying data from one or more sources into a destination system which represents the data differently from the source(s) or in a different context than the source(s). The ETL process became a popular concept in the 1970s and is often used in data warehousing.
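As a rough illustration of the definition above, the following minimal sketch copies rows from a CSV source into a SQLite destination that represents the data differently (typed columns, amounts stored in cents). The table name, column names and sample data are hypothetical and are not taken from any of the listed patents.

```python
import csv
import io
import sqlite3

# Hypothetical source: a CSV export in which every value is a string.
SOURCE_CSV = "id,name,amount\n1,alice,10.50\n2,bob,3.25\n"

def extract(text):
    """Read rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Convert types and normalize names into the destination's representation."""
    return [
        (int(r["id"]), r["name"].title(), round(float(r["amount"]) * 100))
        for r in rows
    ]

def load(conn, records):
    """Write the transformed records into the destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS payments "
        "(id INTEGER PRIMARY KEY, name TEXT, amount_cents INTEGER)"
    )
    conn.executemany("INSERT INTO payments VALUES (?, ?, ?)", records)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))
rows = conn.execute("SELECT id, name, amount_cents FROM payments ORDER BY id").fetchall()
```

Keeping the three steps as separate functions makes each one replaceable, for example swapping the CSV reader for a database query without touching the load step.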

Method and system for developing extract transform load systems for data warehouses

Inactive | US7139779B1 | Minimal and reduced cost and time | Minimal and reduced and design time | Data processing applications | Multi-dimensional databases | Flow process chart | Data flow diagram
Developing an ETL system for transforming data prior to loading the data into a data warehouse. An embodiment of the invention automatically generates configuration files from an input data flow diagram defining the ETL system. The configuration files or other metafiles control execution of the processes illustrated in the data flow diagram. The invention includes a notation for use in the data flow diagram.
Owner:MICROSOFT TECH LICENSING LLC

Virtual data warehousing

A virtual data warehouse (the functional equivalent of a conventional data warehouse) that provides aggregated views of the complete data inventory. The virtual data warehouse contains metadata, which is used to form a logical enterprise data model that is part of the database of record (DBOR) infrastructure. Each legacy back-end database system is published on the infrastructure, with its metadata extracted and used as noted above. The infrastructure software uses standard J2EE, JMS and reusable EJBs, for transactional unit requests, and ETL (extract-transform-load) tools for real-time bulk loading of data.
Owner:TRANSPACIFIC DELTA SCI

Data synchronism ETL (Extract Transform Load) system

The invention belongs to the technical field of data synchronization and discloses a data synchronization ETL (Extract Transform Load) system. The system comprises the following functional modules: a context parameter configuration module, a synchronization interface definition module, a synchronization script generation module, a workflow configuration module, a data synchronization core module, and a logging and early-warning module. The context parameter configuration, synchronization interface definition and synchronization script generation modules are basic modules that run in that order to generate a script. The workflow configuration module acts as a dispatching center: it places the script in a workflow and configures parallel or serial execution, the degree of parallelism, and the time or condition that triggers a task. When an anomaly or a failed task occurs during execution, the logging and early-warning module captures the cause of the error and notifies the system administrators. The system provides a bidirectional synchronization mechanism, supports data sources configured with multiple context names, and can access multiple environments. The synchronization method has wide coverage, supports complex user-defined synchronization methods, and is highly extensible.
Owner:SHANGHAI HANDPAL INFORMATION TECH SERVICE

Post-migration validation of etl jobs and exception management

Handling extract-transform-load (ETL) job mismatches as “exceptions.” Exception handling may include the following steps: (i) determining a mismatch while running an extract-transform-load job with the mismatch being a mismatch of at least one of the following types: design time information mismatch, and / or operational metadata mismatch; and (ii) responsive to determining the mismatch, handling the mismatch as an exception.
Owner:IBM CORP
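The idea of treating job mismatches as exceptions can be sketched as follows. This is only an illustration under assumptions: the exception class names, the `columns` and `rows_written` fields, and the list used as an exception queue are all hypothetical and are not described in the abstract.

```python
class EtlMismatch(Exception):
    """Base class for mismatches detected while running an ETL job."""

class DesignTimeMismatch(EtlMismatch):
    """The job's design (here: its output columns) differs from the baseline."""

class OperationalMetadataMismatch(EtlMismatch):
    """Run-time metadata (here: rows written) differs from the baseline."""

def check_job(expected, observed):
    """Raise the matching exception type when the observed run deviates."""
    if expected["columns"] != observed["columns"]:
        raise DesignTimeMismatch(
            f"columns {observed['columns']} != {expected['columns']}"
        )
    if expected["rows_written"] != observed["rows_written"]:
        raise OperationalMetadataMismatch(
            f"rows_written {observed['rows_written']} != {expected['rows_written']}"
        )

def run_checked(job_name, expected, observed, exception_queue):
    """Return True if the job matches; otherwise record the mismatch."""
    try:
        check_job(expected, observed)
    except EtlMismatch as exc:
        exception_queue.append((job_name, type(exc).__name__, str(exc)))
        return False
    return True

# Hypothetical baseline captured before migration, compared with two runs after it.
exceptions = []
baseline = {"columns": ["id", "name"], "rows_written": 100}
ok = run_checked("load_customers", baseline,
                 {"columns": ["id", "name"], "rows_written": 100}, exceptions)
bad = run_checked("load_orders", baseline,
                  {"columns": ["id", "name"], "rows_written": 97}, exceptions)
```

Using one base class lets a caller handle every mismatch type uniformly while still distinguishing the two categories named in the abstract.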

Multidimensional model based oil and gas resource data key system implementation method and system

The invention discloses a method and system for implementing a multidimensional-model-based key data system for oil and gas resources. The system sorts, cleans, classifies and systematizes the source data of oil and gas resources; performs cleaning, transformation and integration based on an ETL (Extract-Transform-Load) framework; and loads the source data into a data warehouse or data mart to form a core database. Establishing a multidimensional model improves data processing efficiency and supports data mining. With the method, different dimensions of the data can be flexibly combined; custom analysis of core data such as oil price, yield, reserves, resource quantity, imports and exports, and mining rights is achieved; and integrated management and visual display of statements, graphics, texts and maps are supported. The interfaces and function modules of the system are based on a configuration scheme, so the system is flexible to configure, extensible and easy to use; the display forms of the data are enriched and retrieval efficiency is improved.
Owner:CHINA UNIV OF GEOSCIENCES (BEIJING)

Processing data in data migration

A computer-implemented method for processing information related to an extract-transform-load (ETL) data migration, including aggregating operational metadata and determining: a plurality of metrics, organized by business object, corresponding to the migration; a number of business object instances not successfully loaded; a first end-to-end execution time for at least one business object; relevant input metadata; load readiness status per business object; impact of a business object that is not load ready by analyzing business process hierarchies; business object load readiness by reference to incomplete development status or data defects; scope per test cycle based, at least in part, upon business object load readiness; and high-priority defects of business objects that stop testing based, at least in part, upon analysis of business process hierarchies.
Owner:IBM CORP

Parallel processing for etl processes

A technique for parallel processing of data from a plurality of data sources in conjunction with an Extract-Transform-Load (ETL) process, the data being part of a related data set, which comprises the following: staging a unit of extracted data from each of the plurality of data sources, thereby generating a plurality of units of staged data; identifying a plurality of tasks relating to transforming the staged data; assigning a subset of the tasks to each of a plurality of child processes being managed by a master process, such that dependent tasks are assigned to a same child process; concurrently executing the subsets of tasks assigned to the child processes, thereby generating a plurality of units of transformed data from the plurality of units of staged data; and publishing the transformed data after all tasks are completely executed, thereby ensuring that the published data represent the related data set.
Owner:OATH INC
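The assignment rule in this abstract (dependent tasks go to the same child, and results are published only after every task has finished) can be sketched roughly as below. This is not the patented implementation: threads stand in for child processes, tasks are assumed to be listed in dependency order, and all task names and data are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def group_dependent_tasks(tasks, depends_on):
    """Merge tasks linked by a dependency so each group runs on one worker."""
    parent = {t: t for t in tasks}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path halving
            t = parent[t]
        return t

    for task, deps in depends_on.items():
        for dep in deps:
            parent[find(task)] = find(dep)

    groups = {}
    for t in tasks:  # input order is preserved inside each group
        groups.setdefault(find(t), []).append(t)
    return list(groups.values())

def run_group(group, staged, transforms):
    """Run one group's tasks sequentially, as a single child worker would."""
    done = {}
    for t in group:
        done[t] = transforms[t](staged, done)
    return done

def run_etl(tasks, depends_on, staged, transforms, workers=2):
    groups = group_dependent_tasks(tasks, depends_on)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda g: run_group(g, staged, transforms), groups))
    published = {}  # filled only after every group has completed
    for partial in partials:
        published.update(partial)
    return groups, published

tasks = ["clean_users", "count_users", "clean_orders"]
depends_on = {"count_users": ["clean_users"]}
staged = {"users": [" a ", "b"], "orders": [1, 2, 3]}
transforms = {
    "clean_users": lambda s, done: [u.strip() for u in s["users"]],
    "count_users": lambda s, done: len(done["clean_users"]),
    "clean_orders": lambda s, done: [o for o in s["orders"] if o > 1],
}
groups, published = run_etl(tasks, depends_on, staged, transforms)
```

Because a dependent task only reads results produced earlier in its own group, no coordination between workers is needed while the groups run.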

Data quality measurement for etl processes

Techniques for maintaining data quality of transformed data generated using an Extract-Transform-Load (ETL) process and stored in at least one data warehouse, the method comprising generating a quality metric for each of a plurality of units of the transformed data with reference to at least one data quality measurement rule, the quality metric for each unit of the transformed data representing a validity measure defined by the corresponding data quality measurement rule; and generating a report organizing the quality metrics for selected units of the transformed data.
Owner:OATH INC
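One simple way to read "a quality metric for each unit with reference to a data quality measurement rule" is the fraction of records in a unit that satisfy the rule. The sketch below assumes that reading; the rule names, the unit name and the sample records are hypothetical.

```python
def quality_metric(records, rule):
    """Fraction of records that satisfy the rule (1.0 for an empty unit)."""
    if not records:
        return 1.0
    return sum(1 for rec in records if rule(rec)) / len(records)

def quality_report(units, rules):
    """Organize one metric per (unit, rule) pair into a nested report."""
    return {
        unit_name: {
            rule_name: quality_metric(records, rule)
            for rule_name, rule in rules.items()
        }
        for unit_name, records in units.items()
    }

rules = {
    "amount_present": lambda r: r["amount"] is not None,
    "amount_non_negative": lambda r: r["amount"] is not None and r["amount"] >= 0,
}
units = {
    "orders": [{"amount": 5}, {"amount": -1}, {"amount": None}, {"amount": 2}],
}
report = quality_report(units, rules)
```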

Method and system to load information in a general purpose data warehouse database

A method and a system for loading information from a production database into a general purpose data warehouse (DW) database comprising at least one table for each type of object defined in its schema. The method first collects information from the production database and, for each object to be checked that is identified in the collected information, builds an indexed temporary database whose primary keys are the primary keys of the object to be checked. The check for the existence of the object, performed before any insertion or modification of an existing object, is done in the temporary database instead of the general purpose DW database. This greatly improves the performance of the loading operations executed during the DW ETL (Extract, Transform and Load) procedures. If the temporary table is transformed from vertical to horizontal, the steps for executing the existence check are further reduced and ETL performance improves further.
Owner:IBM CORP
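A loose sketch of the existence-check idea, using SQLite: the keys of the incoming objects are placed in an indexed temporary table, and the per-row "does it already exist?" lookup is answered from that table rather than from the warehouse table. The table names (`dw_customer`, `tmp_keys`), the `in_dw` flag and the single set-based marking pass are assumptions made for illustration, not details from the patent.

```python
import sqlite3

def load_with_temp_index(conn, incoming):
    """Insert or update warehouse rows, checking existence in a temp key table."""
    conn.execute(
        "CREATE TEMP TABLE tmp_keys "
        "(id INTEGER PRIMARY KEY, in_dw INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO tmp_keys (id) VALUES (?)",
        [(key,) for key, _ in incoming],
    )
    # One set-based pass marks the keys that are already in the warehouse.
    conn.execute(
        "UPDATE tmp_keys SET in_dw = 1 WHERE id IN (SELECT id FROM dw_customer)"
    )
    inserted = updated = 0
    for key, name in incoming:
        # The per-row existence check reads the temporary table only.
        (exists,) = conn.execute(
            "SELECT in_dw FROM tmp_keys WHERE id = ?", (key,)
        ).fetchone()
        if exists:
            conn.execute("UPDATE dw_customer SET name = ? WHERE id = ?", (name, key))
            updated += 1
        else:
            conn.execute("INSERT INTO dw_customer (id, name) VALUES (?, ?)", (key, name))
            conn.execute("UPDATE tmp_keys SET in_dw = 1 WHERE id = ?", (key,))
            inserted += 1
    conn.execute("DROP TABLE tmp_keys")
    conn.commit()
    return inserted, updated

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dw_customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO dw_customer VALUES (1, 'old')")
counts = load_with_temp_index(conn, [(1, "new"), (2, "two"), (2, "two-b")])
final_rows = conn.execute("SELECT id, name FROM dw_customer ORDER BY id").fetchall()
```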

Parallel Processing of ETL Jobs Involving Extensible Markup Language Documents

Techniques for running an Extract Transform Load (ETL) job in parallel on one or more processors wherein the ETL job comprises use of an extensible markup language (XML) document are provided. The techniques include receiving an XML document input, identifying a node in the XML document at which partitioning of the XML document is to begin, sending partition information to each respective processor, performing a shallow parsing of the XML document in parallel on the one or more processors, wherein each processor performs shallow parsing using the identified partition node until it reaches its identified partition, using the shallow parsing to generate the partition of the input XML document, wherein each processor generates a different partition of the same XML document, and sending each partition in streaming format to an ETL job instance.
Owner:GLOBALFOUNDRIES INC

Method and system for monitoring ETL (extract-transform-load) data processing process

The application discloses a method and system for monitoring the ETL (extract-transform-load) data processing process. The method comprises the following steps: determining the field type of the output data of the ETL data processing process according to ETL data processing task information; generating the monitoring indexes of the ETL data processing process according to the field type of the output data; and counting or calculating the corresponding fields in the output data of the ETL data processing process according to the generated monitoring indexes to obtain the result values of the monitoring indexes, wherein the monitoring indexes indicate a process for counting or calculating the designated fields in the output data of the ETL data processing process in a specified manner. The method provided in the application can solve the problems of prior-art methods for monitoring the ETL data processing process, such as low efficiency and poor accuracy.
Owner:TAOBAO CHINA SOFTWARE
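The steps above (derive monitoring indexes from field types, then count or calculate over the output data) might look like the following sketch. The mapping from field type to index kind (`null_count`, `sum`, `max`, `distinct_count`) is an assumption chosen for illustration; the abstract does not specify which indexes are generated.

```python
def generate_monitoring_indexes(field_types):
    """Choose checks for each output field based on its declared type."""
    indexes = []
    for field, ftype in field_types.items():
        indexes.append((field, "null_count"))
        if ftype in ("int", "float"):
            indexes += [(field, "sum"), (field, "max")]
        elif ftype == "str":
            indexes.append((field, "distinct_count"))
    return indexes

def evaluate(indexes, rows):
    """Count or calculate each index over the ETL output rows."""
    results = {}
    for field, kind in indexes:
        values = [r[field] for r in rows if r[field] is not None]
        if kind == "null_count":
            results[(field, kind)] = len(rows) - len(values)
        elif kind == "sum":
            results[(field, kind)] = sum(values)
        elif kind == "max":
            results[(field, kind)] = max(values) if values else None
        elif kind == "distinct_count":
            results[(field, kind)] = len(set(values))
    return results

# Hypothetical task information and output data.
field_types = {"city": "str", "amount": "int"}
output_rows = [
    {"city": "Oslo", "amount": 3},
    {"city": "Oslo", "amount": None},
    {"city": None, "amount": 7},
]
indexes = generate_monitoring_indexes(field_types)
results = evaluate(indexes, output_rows)
```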

Method and system for dynamically converting telecommunications service data

The invention discloses a method and a system for dynamically converting telecommunications service data. The method comprises the following steps: obtaining a preset telecommunications service data conversion rule; dynamically generating JAVA source code for a service implementation class according to the data conversion rule and a preset service implementation class template; compiling it to generate JAVA program code; dynamically loading and executing the service implementation class; and converting a source data file into a target data file. The invention targets the characteristics of telecommunications service data to establish a rich function set, and also provides an interface for user-defined functions, which makes user customization and extension of the function set convenient. Unlike current script-interpreting ETL (Extract-Transform-Load) tools, the invention adopts dynamic compiling, dynamic loading and JAVA reflection, which increases data conversion efficiency and gives better cross-platform support and application flexibility.
Owner:CHINA TELECOM CORP LTD

System and methods for integrated performance measurement environment

A system for integrated performance measurement environment comprises a plurality of distributed file systems; a database, wherein the database further comprises a layout details table, a layout definition table, and at least one raw data master file; at least one network; an extract-transform-load module in a memory; and a business logic module in the memory, wherein the business logic module further comprises metrics catalog and business intelligent plug-in blocks; and a collaboration module in the memory; and methods performing the same including extracting, transforming, and loading data into the system, retrieving data in accordance with rules in metrics catalog and business intelligent plug-in blocks, drilling down the metrics catalog and business intelligent plug-in blocks, and collaborating over the system.
Owner:OPENMETRIK INC

Bi-directional replication between web services and relational databases

Active | US20090063504A1 | Efficiently and accurately transferring | Likelihood of out and unsuccessful | Digital data processing details | Website content management | Enterprise application integration | Relational database
A method for bi-directional data replication between a Web Service application and a relational database is provided. Techniques of Enterprise Application Integration (EAI) and Extract Transform Load (ETL) technology are employed to create a relational database schema, load the schema, synchronize the structure and the content of the schema, and replicate changes in the content of the schema to the web services application. Optional advanced techniques to support reporting, legacy data migration, and integration with other applications are also provided.
Owner:SESAME SOFTWARE

Resilient data processing pipeline architecture

A private cloud-based processing pipeline apparatus and method for its use is disclosed. A first load balancer directs data packets to one of a plurality of collectors. The collectors authenticate the data packet. Then a second load balancer receives the data packet from the collector and directs the data packet to one of a plurality of extract transform load (ETL) frontends, where the data packet is converted from a platform dependent form into a platform independent form and loaded into a queue. Handlers then process the converted data packets. It is emphasized that this abstract is provided to comply with the rules requiring an abstract that will allow a searcher or other reader to quickly ascertain the subject matter of the technical disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims.
Owner:SONY COMPUTER ENTERTAINMENT INC

User portraying method based on mass cross-screen behavior data

The invention discloses a user portraying method based on mass cross-screen behavior data. The method mainly comprises the following steps: to cope with increasingly flexible bidirectional new multimedia services and user behavior data on the scale of millions or even tens of millions of records, the collected user behavior data is placed in HDFS distributed storage; after the data is extracted, converted and loaded through an ETL module, an optimal combinational algorithm suited to the characteristics of the media industry is used to efficiently preprocess the mass user behavior data, which is fused with content tags, user tags, consumption tags, geotags, equipment tags, user properties and so on; finally, user portraits are formed, and the relevant user portraits are then called through a WEB application. The method provides precise data support for business operation by a broadcast television network operator.
Owner:SHANGHAI STAR V DATA TECH

Data ETL (Extract Transform Load) system based on storm and treatment method based on storm

The invention discloses a Storm-based data ETL (Extract Transform Load) system and a Storm-based processing method, and belongs to the technical field of data ETL management. The system is divided into a controller module, a connector module and a distributed computation engine. The controller module receives a user command, parses the command settings and starts a data ETL task. The connector module contains connection drivers for a relational database, an HBase database and HDFS (Hadoop Distributed File System), which can be called when the distributed computation engine connects to a data source and a target data store. Storm is used as the distributed computation engine and receives the parameters set by the controller module to carry out a data ETL task. A user does not need to write Storm code and only needs to enter the command: the controller module parses the user command, Storm is configured automatically, and the ETL task is submitted. All supported connection drivers for the data source and the target data store are packaged in a connector and are automatically selected and called by the controller.
Owner:SHANDONG INSPUR SCI RES INST CO LTD

Methods, systems, and computer program products for managing batch operations in an enterprise data integration platform environment

Methods, systems, and computer program products for managing batch operations are provided. A method includes defining a window of time in which a batch will run by entering a batch identifier into a batch table, the batch identifier specifying a primary key of the batch table and being configured as a foreign key to a batch schedule table. The time is entered into the batch schedule table. The method further includes entering extract-transform-load (ETL) information into the batch table. The ETL information includes a workflow identifier, a parameter file identifier, and a location in which the workflow resides. The method includes retrieving the workflow from memory via the workflow identifier and location, retrieving the parameter file, and processing the batch, according to the process, workflow, and parameter file.
Owner:AT&T INTPROP I L P

ETL (extract-transform-load) data processing method, device and system

The invention discloses an ETL (extract-transform-load) data processing method, device and system. The method includes: acquiring source data from a data source and converting it into the CSV (comma-separated values) format; establishing an ETL task according to the CSV-formatted source data and triggering an ETL tool to run the ETL task; and transmitting the results of the ETL task to a target data warehouse. The data to be processed is uniformly converted into the CSV format before being extracted, transformed and loaded, and the loading results are placed in the target data warehouse. When there are many kinds of data sources, only a conversion method to CSV-formatted data needs to be added. This avoids the reduced ETL processing speed that results in the prior art from configuring various driver / import / export tools, so ETL data can be processed faster and more efficiently.
Owner:杭州勒卡斯广告策划有限公司

Multithreading data processing method based on ETL (Extract Transform Loading)

The invention discloses a multithreading data processing method based on ETL (Extract Transform Loading), comprising the following steps: dividing the data extraction process of ETL into three distinct stages (extraction, sending and synchronization); executing the extraction, sending and synchronization of data in parallel, each in its own independent thread; and persisting error data. By using a multithreaded processing framework, the invention parallelizes the ETL data extraction process and greatly improves throughput, extraction rate and hardware resource utilization. By handling the error data generated during extraction, sending and synchronization, it also improves fault tolerance and reduces the probability that an error in the data extraction process brings down the whole ETL run.
Owner:山东中创软件商用中间件股份有限公司
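The three-stage arrangement described above can be sketched with one thread per stage connected by queues. This is a minimal illustration, not the patented method: the stage functions are hypothetical placeholders, and an in-memory list stands in for the persistent store of error data.

```python
import queue
import threading

STOP = object()  # sentinel that tells a stage no more items will arrive

def stage(name, work, inbox, outbox, errors):
    """Apply `work` to each item; forward results and set failed items aside."""
    while True:
        item = inbox.get()
        if item is STOP:
            if outbox is not None:
                outbox.put(STOP)
            return
        try:
            result = work(item)
        except Exception as exc:
            # A failing item is recorded instead of stopping the whole run.
            errors.append((name, item, str(exc)))
            continue
        if outbox is not None:
            outbox.put(result)

def run_pipeline(raw_items, extract, send, synchronize):
    q_raw, q_extracted, q_sent = queue.Queue(), queue.Queue(), queue.Queue()
    errors, synced = [], []

    def sync_and_record(item):
        synced.append(synchronize(item))

    threads = [
        threading.Thread(target=stage, args=("extract", extract, q_raw, q_extracted, errors)),
        threading.Thread(target=stage, args=("send", send, q_extracted, q_sent, errors)),
        threading.Thread(target=stage, args=("sync", sync_and_record, q_sent, None, errors)),
    ]
    for t in threads:
        t.start()
    for item in raw_items:
        q_raw.put(item)
    q_raw.put(STOP)
    for t in threads:
        t.join()
    return synced, errors

# Hypothetical stage functions: parsing fails on "x", which becomes error data.
synced, errors = run_pipeline(["1", "x", "3"], int, lambda n: n * 2, lambda n: n + 1)
```

With a single thread per stage and FIFO queues, item order is preserved end to end, while the three stages still overlap in time on different items.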

Systems and methods for detecting and preventing cyber-threats

A system (100) for detecting and preventing cyber-threats is disclosed. The system (100) can include an online-analytical-processing (OLAP) resource (102) coupled to a data mining engine (104), a reporting resource (106) and a processor (108). The processor (108) can run instructions stored within an extract-transform-load (ETL) module (112). The ETL module (112) can enable the processor (108) to extract one or more data tuples from various data sources (110). The ETL module (112) can enable the processor to transform the extracted tuple(s).
Owner:ZENOPZ LLC

Managing software asset environment using cognitive distributed cloud infrastructure

A method and system are provided for performing an extract-transform-load (ETL) process. The method includes collecting load information about a volume and a complexity of raw data to be processed during the ETL process. The method further includes receiving an expected completion time of the ETL process and execution information about (i) hardware resources and (ii) an influence of the hardware resource on an execution time of the ETL process. The method also includes calculating resources for a distributed processing software infrastructure to be used to perform the ETL process, by applying a statistical method to the load information, expected completion time, and execution information. The method additionally includes dynamically assigning cloud resources corresponding to and based on the calculated resources, in accordance with the expected completion time. The method further includes performing the ETL process on the raw data using the assigned cloud resources and storing ETL process results.
Owner:IBM CORP

ETL (Extract-Transform-Load) realization method and system based on metadata

The invention provides an ETL (Extract-Transform-Load) realization method based on metadata. The ETL realization method comprises the following steps: obtaining a pre-programmed ETL request; converting the ETL request into ETL metadata according to a data conversion rule; importing the ETL metadata into a pre-established ETL metadata model to store the ETL metadata; generating an ETL configuration file according to the ETL metadata; and compiling the generated ETL configuration file into an executable ETL JOB. The method greatly improves ETL management efficiency and makes a data warehouse more convenient to develop and use. It effectively overcomes the technical defect that, when an ETL tool is used, faults can be checked only once development is complete. The standardized metadata model shows the ETL process clearly, so a developer can conveniently sort out and check the currently simulated ETL flow; the developer's understanding of the ETL requirements is deepened, which helps in finding an optimal ETL design scheme; and the resulting ETL JOB can be executed accurately and efficiently.
Owner:SHANGHAI TOBACCO GRP CO LTD