A TCP stream reassembly method based on the Hadoop platform and a distributed processing programming model

A distributed processing and programming model technology, applied in the fields of error prevention/detection using a return channel, digital transmission systems, electrical components, etc., to reduce overhead and improve operating efficiency.

Active Publication Date: 2017-07-28
CHONGQING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

At present, algorithms for TCP stream reassembly on the Hadoop platform are still lacking.



Examples


Embodiment Construction

[0025] A non-limiting embodiment is given below in conjunction with the accompanying drawings to further illustrate the present invention. It should be understood, however, that these descriptions are exemplary only, and are not intended to limit the scope of the invention. Also, in the following description, descriptions of well-known structures and techniques are omitted to avoid unnecessarily obscuring the concept of the present invention.

[0026] As shown in Figure 1, the present invention requires a single MapReduce job. The massive packet data is stored in HDFS in the form of blocks (64 MB by default). The InputFormat is modified to map input splits to key-value pairs. The input key-value pair of Map is <offset, binary packet>, and the output key-value pair is <five-tuple, timestamp + sequence number + packet payload>. The Map output passes through the intermediate Shuffle phase, which partitions, sorts, and merges the output key-value pairs. Gather the "timestamp + sequence number + packet payload" of the same five-tuple in the Map...
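The Map and Shuffle steps described above can be sketched in plain Python, outside of Hadoop. This is only an illustration under stated assumptions, not the patent's implementation: the record layout (an 8-byte timestamp followed by a raw IPv4/TCP packet), the function names `map_packet` and `shuffle`, and the test helper `build_record` are all hypothetical, and the in-memory `shuffle` merely imitates the grouping that Hadoop performs between Map and Reduce.

```python
import socket
import struct
from collections import defaultdict


def build_record(ts, src, sport, dst, dport, seq, payload):
    """Hypothetical helper: build one record (timestamp + IPv4/TCP packet)."""
    total_len = 20 + 20 + len(payload)
    ip_hdr = struct.pack("!BBHHHBBH4s4s", 0x45, 0, total_len, 0, 0, 64, 6, 0,
                         socket.inet_aton(src), socket.inet_aton(dst))
    tcp_hdr = struct.pack("!HHIIBBHHH", sport, dport, seq, 0, 5 << 4, 0x18,
                          65535, 0, 0)
    return struct.pack("!d", ts) + ip_hdr + tcp_hdr + payload


def map_packet(offset, record):
    """Map: <offset, binary packet> -> <five-tuple, (timestamp, seq, payload)>.

    Assumed record layout: 8-byte big-endian float timestamp, then a raw
    IPv4 packet carrying TCP. The offset key is not needed for the output.
    """
    (timestamp,) = struct.unpack_from("!d", record, 0)
    ip = record[8:]
    ihl = (ip[0] & 0x0F) * 4                       # IPv4 header length in bytes
    (total_len,) = struct.unpack_from("!H", ip, 2)
    proto = ip[9]
    src_ip = socket.inet_ntoa(ip[12:16])
    dst_ip = socket.inet_ntoa(ip[16:20])
    src_port, dst_port, seq = struct.unpack_from("!HHI", ip, ihl)
    data_off = (ip[ihl + 12] >> 4) * 4             # TCP header length in bytes
    payload = ip[ihl + data_off:total_len]
    return (src_ip, src_port, dst_ip, dst_port, proto), (timestamp, seq, payload)


def shuffle(pairs):
    """Imitate Shuffle: group values by key and order the keys."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))
```

Applying `map_packet` to every record and passing the results through `shuffle` yields one list of (timestamp, sequence number, payload) values per five-tuple, which is the shape of input that the Reduce step expects.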


Abstract

The invention discloses a TCP stream reassembly method based on the Hadoop platform and a distributed processing programming model. The input key-value pair of Map is <offset, binary packet>, and the output key-value pair is <five-tuple, timestamp + sequence number + packet payload>. The "+" operation means that the timestamp, sequence number, and payload of the packet are concatenated into one large byte array; the resulting "timestamp + sequence number + packet payload" is stored as Hadoop's built-in BytesWritable type. The Map output passes through the intermediate Shuffle phase, which partitions, sorts, and merges the output key-value pairs. The "timestamp + sequence number + packet payload" values of the same five-tuple in the Map output are gathered into a key-value pair <five-tuple, list(timestamp + sequence number + packet payload)>, which serves as the input of Reduce. The final output key-value pair of Reduce is <five-tuple, reassembled data>. The invention improves operating efficiency and reduces overhead.
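As a minimal sketch of the "+" operation and of a Reduce step, the following plain-Python code packs the three fields into one byte array and rebuilds a stream from a list of such arrays. Several points are assumptions rather than details from the abstract: the 8-byte timestamp and 4-byte sequence number layout, the names `pack_value`, `unpack_value`, and `reduce_flow`, and the reassembly policy (sort by sequence number, drop retransmitted bytes, ignore gaps and sequence-number wraparound). In Hadoop the packed bytes would be wrapped in a BytesWritable; that wrapping is omitted here.

```python
import struct

# Assumed layout: 8-byte big-endian float timestamp + 4-byte sequence number.
HEADER = struct.Struct("!dI")


def pack_value(timestamp, seq, payload):
    """The "+" operation: concatenate the three fields into one byte array."""
    return HEADER.pack(timestamp, seq) + payload


def unpack_value(blob):
    """Split a packed byte array back into (timestamp, seq, payload)."""
    timestamp, seq = HEADER.unpack_from(blob, 0)
    return timestamp, seq, blob[HEADER.size:]


def reduce_flow(five_tuple, blobs):
    """Reduce: <five-tuple, list(packed values)> -> <five-tuple, reassembled data>.

    Assumed policy: order segments by sequence number (timestamp breaks
    ties), skip bytes that were already emitted, and append the rest.
    """
    segments = sorted((unpack_value(b) for b in blobs),
                      key=lambda s: (s[1], s[0]))
    out = bytearray()
    next_seq = None                      # sequence number of the next new byte
    for _, seq, payload in segments:
        if not payload:
            continue                     # pure ACK or other empty segment
        end = seq + len(payload)
        if next_seq is not None and end <= next_seq:
            continue                     # full retransmission
        if next_seq is not None and seq < next_seq:
            payload = payload[next_seq - seq:]   # partial overlap
        out += payload
        next_seq = end
    return five_tuple, bytes(out)
```

For example, values carrying "hello" at sequence number 1000 and "world" at 1005 are reassembled in sequence order even if they arrive at Reduce in reverse, and a retransmitted copy of "hello" is dropped.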

Description

Technical field

[0001] The invention relates to the field of network big-data traffic analysis, and specifically to a TCP stream reassembly method based on the Hadoop platform and a distributed processing programming model.

Background

[0002] TCP is a connection-oriented, reliable transport-layer protocol that is widely used on the Internet and in networks requiring high transmission reliability. Because the Internet protocol stack is layered and the length of a single packet is limited, application-layer data is often split into multiple segments that are carried by multiple packets. Reassembling the TCP session is therefore a necessary prerequisite for analyzing application-layer data.

[0003] Traditional TCP reassembly techniques use data structures such as linked lists and hash tables, combined with TCP five-tuples, acknowledgment numbers, sequence numbers, and various flag bits (...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): H04L29/08, H04L12/861, H04L1/16
Inventors: 雒江涛, 高伟, 杨军超, 王小平, 邓生雄, 申健, 刘勇
Owner: CHONGQING UNIV OF POSTS & TELECOMM