The invention provides a method for forming unmanned vehicle formations on expressways based on multi-agent reinforcement learning. The method treats vehicle formation as a multi-agent cooperation problem in which each vehicle has an independent decision-making capability. It achieves flexible formation while keeping driving safe and fast: when traffic flow is heavy, the vehicles avoid obstacles safely without having to keep the formation, and when traffic flow is light, the formation is restored. An end-to-end approach that maps image input directly to vehicle control quantities is difficult to train because its action search space is large. Therefore, multi-agent reinforcement learning is used only to learn the lane-changing strategy, and the precise control quantities are calculated with an S-T graph trajectory optimization method. This adds control constraints, respects vehicle kinematics, provides a safety guarantee, and conforms to human driving habits.
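The split between a learned lane-changing decision and a separately computed longitudinal control can be illustrated with a minimal sketch. It is not the claimed implementation: the names `Limits`, `target_lane`, `rollout`, and `plan_speed`, the numeric limits, and the brute-force search over a few acceleration levels are all assumptions made for illustration. The learned policy itself is not shown; only the discrete action it would output is applied. The S-T step searches for a speed profile that stays a safe gap behind the predicted positions of a lead vehicle.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class Limits:
    """Hypothetical control constraints (illustrative values only)."""
    a_min: float = -3.0   # strongest braking, m/s^2
    a_max: float = 2.0    # strongest acceleration, m/s^2
    v_max: float = 33.0   # speed limit, m/s
    gap: float = 10.0     # minimum distance kept behind the lead vehicle, m


def target_lane(lane, action, n_lanes):
    """Apply a discrete lane action ("keep", "left", "right"), clamped to the road.

    In the described method the action would come from the learned policy.
    """
    delta = {"keep": 0, "left": -1, "right": 1}[action]
    return min(max(lane + delta, 0), n_lanes - 1)


def rollout(s0, v0, accels, dt, lim):
    """Integrate an acceleration sequence into (position, speed) points."""
    s, v, points = s0, v0, []
    for a in accels:
        v_next = min(max(v + a * dt, 0.0), lim.v_max)  # keep speed in [0, v_max]
        s += 0.5 * (v + v_next) * dt                   # trapezoidal position update
        v = v_next
        points.append((s, v))
    return points


def plan_speed(s0, v0, lead_s, dt, lim, n_levels=6):
    """Brute-force S-T search.

    lead_s[k] is the lead vehicle's predicted position at time (k + 1) * dt.
    Returns the feasible profile that travels farthest, or None if every
    candidate violates the gap at some step.
    """
    step = (lim.a_max - lim.a_min) / (n_levels - 1)
    levels = [lim.a_min + i * step for i in range(n_levels)]
    best = None
    for accels in product(levels, repeat=len(lead_s)):
        points = rollout(s0, v0, accels, dt, lim)
        safe = all(s <= s_lead - lim.gap for (s, _), s_lead in zip(points, lead_s))
        if safe and (best is None or points[-1][0] > best[-1][0]):
            best = points
    return best
```

For example, `plan_speed(0.0, 15.0, [45.0, 60.0, 75.0, 90.0, 105.0], 1.0, Limits())` searches a five-second horizon behind a lead vehicle travelling at 15 m/s. Exhaustive enumeration is only practical for short horizons; an actual trajectory optimizer would replace it with a proper optimization over the S-T graph.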