Data processing method and data processing device

A data processing method and device technology, applied in electrical digital data processing, special data processing applications, instruments, etc. It addresses the high learning cost, inconvenient use, and reduced processing efficiency for massive unstructured data of existing methods, with the effect of reducing repetition and complexity.

Inactive Publication Date: 2013-12-04
SUGON INFORMATION IND

AI Technical Summary

Problems solved by technology

The existing data processing method has high learning costs and is inconvenient to use. Each piece of application logic requires its own job code, which is hard to reuse and lowers the processing efficiency for massive unstructured data.


Examples

Embodiment 1

[0025] Referring to Figure 1, an embodiment of the present invention provides a data processing method, including:

[0026] 101. Obtain the parsing method for the unstructured data;

[0027] 102. Parse the unstructured data according to the parsing method to obtain schematized data;

[0028] 103. Call the pre-packaged SQL-like function classes for processing unstructured data;

[0029] 104. Process the schematized data according to the SQL-like function classes.

[0030] Here, schematized data refers to data output according to a specified output schema.

[0031] Optionally, the pre-packaged SQL-like function classes for processing unstructured data include one or more of the following: statistics, filtering, data migration and conversion, and sorting.
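Steps 101 to 104 can be illustrated with a minimal Python sketch. It is not the patent's implementation: the regular expression used as the parsing method, the field names `host`, `status` and `size`, and the `SqlLikeFunctions` class with its `filter`, `sort` and `count_by` methods are all hypothetical stand-ins chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical parsing method (step 101): a regex describing how one raw
# text line maps onto named fields. The field names are illustrative only.
LINE_PATTERN = re.compile(r"(?P<host>\S+) (?P<status>\d{3}) (?P<size>\d+)")


def parse(lines, pattern):
    """Step 102: turn raw text into schematized rows (dicts with fixed keys)."""
    rows = []
    for line in lines:
        match = pattern.fullmatch(line.strip())
        if match:  # lines that do not fit the schema are skipped
            row = match.groupdict()
            row["status"] = int(row["status"])
            row["size"] = int(row["size"])
            rows.append(row)
    return rows


class SqlLikeFunctions:
    """Step 103: hypothetical stand-in for pre-packaged SQL-like function classes."""

    @staticmethod
    def filter(rows, predicate):
        return [row for row in rows if predicate(row)]

    @staticmethod
    def sort(rows, key, descending=False):
        return sorted(rows, key=lambda row: row[key], reverse=descending)

    @staticmethod
    def count_by(rows, key):
        return dict(Counter(row[key] for row in rows))


raw = ["a.example 200 512", "b.example 404 0", "garbage", "a.example 200 2048"]
rows = parse(raw, LINE_PATTERN)

# Step 104: process the schematized data with the function classes.
ok_rows = SqlLikeFunctions.filter(rows, lambda row: row["status"] == 200)
largest_first = SqlLikeFunctions.sort(ok_rows, "size", descending=True)
hits_per_host = SqlLikeFunctions.count_by(rows, "host")
```

In this sketch the parsing method is the only part that knows about the raw format, so the same function classes could be reused on any input once it has been schematized.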

[0032] In another embodiment, before obtaining the parsing method for the unstructured data, the method further includes:

[0033] Performing SQL-like processing on unstructured data, wherein the SQL-like processing includes: creating the...

Embodiment 2

[0039] This embodiment provides a data processing method that decomposes the basic operations of unstructured data processing into basic operators and provides a standard SQL-like syntax with which users assemble those operators. Because unstructured data processing is hard to predict in advance, users are also allowed to embed custom calculations. Combining dynamic processing with static assembly in this way reduces the complexity of implementing unstructured data processing without restricting its range of application. This embodiment uses a large amount of unstructured data stored in a distributed system as an example. Other storage methods, such as NoSQL and DB, do not affect the implementation of this technology as long as the corresponding data reading driver is implemented. This is not specifically limited in this embodiment.
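The idea of assembling basic operators and embedding a custom calculation can be sketched in Python as follows. This is an illustration under stated assumptions, not the embodiment's actual syntax or code: `local_lines_driver`, `op_map`, `op_filter`, `op_sort`, `assemble` and `word_count` are hypothetical names, and the SQL-like front end that would generate such a pipeline is left out.

```python
from functools import reduce


def local_lines_driver(text):
    """Hypothetical read driver; a NoSQL or DB driver would yield records the same way."""
    yield from text.splitlines()


# Basic operators: each returns a function from an iterable of records
# to another iterable of records.
def op_map(fn):
    return lambda records: (fn(r) for r in records)


def op_filter(predicate):
    return lambda records: (r for r in records if predicate(r))


def op_sort(key):
    return lambda records: iter(sorted(records, key=key))


def assemble(*operators):
    """Static assembly: chain operators left to right into one pipeline."""
    return lambda records: reduce(lambda acc, op: op(acc), operators, records)


def word_count(line):
    """User-defined calculation embedded in the pipeline (the dynamic part)."""
    return {"text": line, "words": len(line.split())}


pipeline = assemble(
    op_map(word_count),
    op_filter(lambda rec: rec["words"] >= 2),
    op_sort(key=lambda rec: rec["words"]),
)

result = list(pipeline(local_lines_driver("one\nthree short words\ntwo words")))
```

Swapping `local_lines_driver` for another generator leaves the assembled pipeline unchanged, which mirrors the point that the storage method only affects the read driver.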

[0040] Referring to Figure 2, the specific method flow includes:

[0041] 201. Perform SQL-like processing on unstructured data....

Embodiment 3

[0072] Referring to Figure 3, this embodiment provides a data processing device, including: an acquisition module 301, a parsing module 302, an SQL-like calling module 303 and a data processing module 304.

[0073] The acquisition module 301 is configured to obtain the parsing method for the unstructured data;

[0074] The parsing module 302 is configured to parse the unstructured data according to the parsing method to obtain schematized data;

[0075] The SQL-like calling module 303 is configured to call the pre-packaged SQL-like function classes for processing unstructured data;

[0076] The data processing module 304 is configured to process the schematized data according to the SQL-like function classes.

[0077] Optionally, the pre-packaged SQL-like function classes for processing unstructured data include one or more of the following: statistics, filtering, data migration and conversion, and sorting.

[0078] Referring to Figure 4, in another embodiment, the device further includes:

[007...


Abstract

The invention provides a data processing method and a data processing device, belonging to the technical field of massive data processing. The method comprises the following steps: obtaining a parsing method for unstructured data; parsing the unstructured data according to the parsing method to obtain schematized data; calling pre-packaged SQL-like (Structured Query Language) function classes for processing unstructured data; and processing the schematized data according to the SQL-like function classes. The method classifies unstructured data processing and packages it into unified operators. By designing an SQL-like syntax, data processing is carried out by invoking SQL-like statements, which applies the SQL approach to data processing to unstructured data. Basic operations are packaged and combined to implement various complex operations and reduce repeated work, so the complexity of implementing unstructured data processing is greatly reduced without limiting the range of application.

Description

Technical field

[0001] The invention relates to the technical field of massive data processing, in particular to a data processing method and device.

Background technique

[0002] With the development of data services, massive amounts of unstructured data have emerged. Unstructured data is data without a clear type and format, including all formats of office documents, text, pictures, XML, HTML, various reports, images and audio/video information, etc.

[0003] In the prior art, there is a method for processing massive unstructured data, including: writing a MapReduce job and submitting it to the MapReduce framework for execution. During data processing, users need to manage details such as data input parsing, output format setting, definition of intermediate calculations, and consistency of data types. This method has relatively high requirements for users. Users need to understand the working principle of MapReduce and be able to write correct MapReduce jobs based on thi...

Application Information

IPC(8): G06F17/30
Inventor: 王颖, 李晋钢, 宋怀明, 苗艳超, 刘新春, 邵宗有
Owner: SUGON INFORMATION IND