This application discloses a decentralized self-discipline optimization method for multiple virtual power plants, comprising: S1, parameter initialization; S2, task classification and formation of an initial knowledge matrix; S3, information acquisition; S4, determination of the optimal individual action; S5, calculation for each agent; S6, calculation of the reward function; S7, updating of the knowledge matrix; S8, information feedback, in which each agent returns its current optimal solution to the information center; and S9, judging whether the maximum number of iterations has been reached and, if so, outputting the optimal knowledge matrix for the corresponding task, otherwise returning to S3. The method addresses the technical problem that existing distribution network regulation struggles both to let multiple virtual power plants participate in the power market in real time for profit and to effectively control the grid-connected behavior of distributed equipment so as to support safe and efficient operation of the distribution network.
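The S1 to S9 loop can be illustrated with a minimal sketch. The abstract does not specify the state and action spaces, the per-agent calculation in S5, the reward function, or the update rule, so everything concrete below is an assumption made for illustration only: a tabular Q-learning-style update stands in for the knowledge matrix update, `toy_cost` is a hypothetical placeholder for the S5 calculation, a plain list plays the role of the information center, and a single task is assumed so that task classification in S2 is trivial.

```python
import random


def toy_cost(agent, action, shared, n_actions):
    """Hypothetical S5 calculation (not from the source): distance from a
    per-agent target action, plus a small penalty for each peer whose last
    reported action equals this one."""
    target = (agent + 1) % n_actions
    clash = sum(1 for j, b in enumerate(shared) if j != agent and b == action)
    return abs(action - target) + 0.1 * clash


def optimize(n_agents=3, n_states=4, n_actions=5, max_iter=200,
             alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    # S1: parameter initialization (all values here are assumed defaults)
    rng = random.Random(seed)
    # S2: one zero-initialized knowledge matrix (state x action) per agent
    knowledge = [[[0.0] * n_actions for _ in range(n_states)]
                 for _ in range(n_agents)]
    center = [None] * n_agents      # stand-in for the information center
    state = [0] * n_agents

    for _ in range(max_iter):       # S9: stop after the maximum iterations
        shared = list(center)       # S3: each agent reads the shared information
        for i in range(n_agents):
            row = knowledge[i][state[i]]
            # S4: pick an action (epsilon-greedy is an assumed strategy)
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)
            else:
                action = max(range(n_actions), key=row.__getitem__)
            # S5: per-agent calculation (placeholder cost)
            cost = toy_cost(i, action, shared, n_actions)
            # S6: reward function (assumed to be the negative cost)
            reward = -cost
            # S7: update the knowledge matrix (Q-learning-style rule, assumed)
            nxt = (state[i] + 1) % n_states
            best_next = max(knowledge[i][nxt])
            row[action] += alpha * (reward + gamma * best_next - row[action])
            # S8: report the current best action back to the information center
            center[i] = max(range(n_actions), key=row.__getitem__)
            state[i] = nxt

    return knowledge, center


if __name__ == "__main__":
    matrices, best_actions = optimize()
    print(best_actions)
```

In this sketch each agent learns only from its own matrix and the actions its peers last reported, which is one simple way to read "decentralized"; the disclosed method may exchange different information or use a different update.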