The invention discloses a data flow classification method based on machine learning, which comprises the following steps: 1) capturing and filtering data flows on a network according to an input rule to obtain data packets that meet the conditions; 2) establishing a data flow according to the five-tuple information of each data packet, establishing an application flow by combining it with the reverse data flow, extracting specified application flow feature information, and recording the application flow feature information in an application flow table; 3) detecting whether the application flow has completed an interaction process; if so, packaging the application flow feature information into a feature vector and calling a machine learning classifier to classify it and obtain a label La, then entering step 4); otherwise, identifying the classification result of the application flow as an unknown application; and 4) searching the association information table to which the current application flow belongs, and determining the final classification result of the current application flow by combining the machine learning classification information of the historical application flows in the table. The method provided by the invention can effectively improve the classification accuracy of the current data flow.
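
By way of illustration only, steps 2) and 4) may be sketched as follows. The sketch is not the claimed implementation and rests on several assumptions: the names `FiveTuple`, `app_flow_key`, `record_packet` and `final_label` are hypothetical, the recorded features (packet and byte counts) are placeholders for the unspecified "specified" features, and a majority vote with ties resolved in favor of La is only one possible way of combining La with the historical labels, which the abstract does not define.

```python
from collections import Counter
from typing import List, NamedTuple


class FiveTuple(NamedTuple):
    # The usual five-tuple of a packet (assumed layout).
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: str


def app_flow_key(t: FiveTuple) -> tuple:
    """Direction-independent key: a data flow and its reverse share one key."""
    a = (t.src_ip, t.src_port)
    b = (t.dst_ip, t.dst_port)
    return (min(a, b), max(a, b), t.proto)


def record_packet(table: dict, t: FiveTuple, size: int) -> None:
    """Update placeholder features of the application flow in the flow table."""
    entry = table.setdefault(app_flow_key(t), {"packets": 0, "bytes": 0})
    entry["packets"] += 1
    entry["bytes"] += size


def final_label(la: str, history: List[str]) -> str:
    """Majority vote over La and the historical labels; ties favor La (assumption)."""
    votes = Counter(history)
    votes[la] += 1
    top = max(votes.values())
    winners = [label for label, n in votes.items() if n == top]
    return la if la in winners else winners[0]


# Usage: a forward packet and its reverse land in the same application flow.
table = {}
record_packet(table, FiveTuple("10.0.0.1", "10.0.0.2", 5000, 80, "tcp"), 100)
record_packet(table, FiveTuple("10.0.0.2", "10.0.0.1", 80, 5000, "tcp"), 400)
label = final_label("p2p", ["http", "http"])  # history outvotes La here
```

Normalizing the key by ordering the two endpoints lets both directions update a single table entry without a separate lookup for the reverse data flow.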