Data synchronization method, system and device based on distributed system and storage medium

Distributed-system data synchronization technology, applied in the field of data processing, that addresses problems such as timeouts and low efficiency, and achieves convenient debugging, reduced code configuration complexity, and improved data synchronization efficiency.

Pending Publication Date: 2020-12-11
CTRIP COMP TECH SHANGHAI

AI Technical Summary

Problems solved by technology

[0003] However, the existing synchronization methods suffer from problems such as low efficiency and timeouts, and these problems become more pronounced as the amount of data grows.

Embodiment Construction

[0044] Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete and will fully convey the concept of example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.

[0045] Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus repeated descriptions thereof will be omitted. Some of the block diagrams shown in the drawings are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities ...

Abstract

The invention provides a data synchronization method, system and device based on a distributed system, and a storage medium. The method runs a Shell script to synchronize data from a Hive database into a ClickHouse database. Running the Shell script comprises the following steps: obtaining the file paths of the data to be synchronized from a source Hive table of the Hive database; storing those file paths in a Shell array; and synchronizing the data to be synchronized from the source Hive table to a target ClickHouse table of the ClickHouse database according to the file paths in the Shell array. The method simplifies the data synchronization process, reduces code configuration complexity, makes debugging convenient, and improves both data synchronization efficiency and data use efficiency, thereby achieving automatic, efficient data synchronization.
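
The three steps in the abstract can be illustrated with a minimal Bash sketch. It is not the patented script: the helper names `collect_paths` and `sync_paths`, the `HDFS_CMD` and `CLICKHOUSE_CMD` variables, and the choice to pipe each file through `hdfs dfs -cat` into `clickhouse-client` with `FORMAT ORC` are all assumptions made for illustration. The source only states that file paths are obtained, stored in a Shell array, and then used to drive the synchronization.

```shell
#!/usr/bin/env bash
# Sketch only. Helper names, the table directory, and the target table are
# hypothetical; the source does not specify how files are read or loaded.

# Commands are configurable so the flow can be dry-run without a cluster.
HDFS_CMD=${HDFS_CMD:-hdfs}
CLICKHOUSE_CMD=${CLICKHOUSE_CMD:-clickhouse-client}

# Steps 1 and 2: list the files under the source Hive table's storage
# directory and keep their paths in the Shell array FILE_PATHS.
collect_paths() {
  local table_dir=$1 line
  FILE_PATHS=()
  while IFS= read -r line; do
    # Keep regular files only: their `hdfs dfs -ls` lines start with "-",
    # which skips the "Found N items" header and directories ("d...").
    [[ $line == -* ]] || continue
    FILE_PATHS+=("${line##* }")   # last field is the path (no spaces assumed)
  done < <("$HDFS_CMD" dfs -ls "$table_dir")
}

# Step 3: stream each file in the array into the target ClickHouse table.
# Assumes ORC files and a target table whose columns match the file schema.
sync_paths() {
  local target=$1 path
  for path in "${FILE_PATHS[@]}"; do
    "$HDFS_CMD" dfs -cat "$path" |
      "$CLICKHOUSE_CMD" --query "INSERT INTO $target FORMAT ORC" || return 1
  done
}
```

A call might look like `collect_paths /user/hive/warehouse/demo.db/orders && sync_paths demo.orders`, where both the directory and the table name are placeholders. Keeping the paths in an array lets each file be loaded and retried independently, which may be what makes this approach easier to debug than a single bulk transfer.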

Description

Technical field

[0001] The present invention relates to the technical field of data processing, and in particular to a data synchronization method, system, device and storage medium based on a distributed system.

Background

[0002] At present, the Internet industry largely uses the Hadoop framework (Hadoop is a distributed system infrastructure developed by the Apache Foundation), and Hive is a data warehouse tool built on this framework. A Hive data table corresponds to underlying ORC (Optimized Row Columnar) files. Synchronizing Hive data to the ClickHouse database (ClickHouse is a columnar database management system for online analytical processing) is generally implemented through an intermediate transfer: the data is first transferred from Hive to HBase (a distributed, column-oriented open source database) and then to ClickHouse, or the Hive data is read through the JDBC (Java Database Connectivity) engine, ...

Application Information

IPC(8): G06F16/27
CPC: G06F16/27, Y02D10/00
Inventor: 叶小琴, 吉聪睿
Owner CTRIP COMP TECH SHANGHAI