The invention provides a multi-energy park scheduling method and
system based on double-layer
reinforcement learning. The method comprises the steps of: obtaining the controllable scheduling objects, namely a source-side unit, a load-side unit, an energy conversion unit and a storage unit, in an
integrated energy system, constructing a double-layer optimization
decision model which comprises an upper-layer
reinforcement learning sub-model and a lower-layer mixed integer
linear programming sub-model, enabling the upper-layer
reinforcement learning sub-model to acquire action variable information of the storage unit under the
state variable information at the current moment and transmit the action variable information to the lower-layer mixed integer
linear programming sub-model, enabling the lower-layer mixed integer
linear programming sub-model to acquire a corresponding reward variable and
state variable information of the storage unit at the next moment, and feed back the reward variable and the
state variable information to the upper-layer reinforcement learning sub-model, and iteratively executing the above steps until the scheduling is finished. According to the embodiment of the invention, the data-driven reinforcement learning method makes each decision from the current state alone, without predicting future information, so that decision timeliness is high, the decision effect is excellent, and real-time optimization decisions can be realized.
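
The interaction between the two layers can be illustrated with a minimal sketch; it is not the claimed implementation. It assumes tabular Q-learning for the upper layer, and it replaces the lower-layer mixed integer linear program with a toy single-step problem that is solved by enumerating one binary on/off variable. All data, parameter values and function names (PRICE, LOAD, GAS_MAX, lower_layer, train, dispatch) are hypothetical.

```python
import random

# Hypothetical toy data: electricity price and load over a 6-step horizon.
PRICE = [0.2, 0.2, 0.8, 0.9, 0.3, 0.8]
LOAD = [2, 2, 3, 3, 2, 3]
CAPACITY = 3            # storage capacity, in energy units (assumed)
ACTIONS = (-1, 0, 1)    # discharge, idle, charge: one unit per step
GAS_MAX, GAS_FUEL, GAS_FIXED = 2, 0.5, 0.1  # assumed conversion-unit data


def lower_layer(t, soc, action):
    """Stand-in for the lower-layer MILP: with the storage action fixed by
    the upper layer, pick the gas unit's on/off state (binary) and output
    (continuous) that minimise the cost of step t."""
    # Project the requested action onto the storage unit's feasible range.
    action = max(-soc, min(CAPACITY - soc, action))
    demand = LOAD[t] + action  # charging adds demand, discharging removes it
    best_cost = None
    for on in (0, 1):  # enumerate the binary variable
        # With `on` fixed the remaining problem is one-dimensional: use the
        # gas unit fully only when its fuel is cheaper than grid power.
        gas = min(GAS_MAX, demand) if on and GAS_FUEL < PRICE[t] else 0
        grid = demand - gas
        cost = PRICE[t] * grid + GAS_FUEL * gas + GAS_FIXED * on
        if best_cost is None or cost < best_cost:
            best_cost = cost
    # Reward variable and next-moment state, fed back to the upper layer.
    return -best_cost, (t + 1, soc + action)


def train(episodes=2000, alpha=0.1, gamma=1.0, epsilon=0.1, seed=0):
    """Upper layer: epsilon-greedy tabular Q-learning over (step, soc)."""
    rng = random.Random(seed)
    q = {}  # maps (state, action) to an estimated value
    for _ in range(episodes):
        state = (0, 0)
        while state[0] < len(LOAD):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
            reward, nxt = lower_layer(state[0], state[1], action)
            future = 0.0
            if nxt[0] < len(LOAD):
                future = max(q.get((nxt, a), 0.0) for a in ACTIONS)
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * future - old)
            state = nxt
    return q


def dispatch(q):
    """Greedy rollout: each decision uses only the current state."""
    state, total_cost, schedule = (0, 0), 0.0, []
    while state[0] < len(LOAD):
        action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
        reward, nxt = lower_layer(state[0], state[1], action)
        schedule.append(nxt[1] - state[1])  # storage power actually applied
        total_cost -= reward
        state = nxt
    return schedule, total_cost


q_table = train()
schedule, total_cost = dispatch(q_table)
print(schedule, round(total_cost, 2))
```

In a fuller implementation the body of lower_layer would presumably be replaced by a call to a mixed integer linear programming solver covering the source-side, load-side and energy conversion units, and the Q-table by whatever reinforcement learning model the embodiment specifies; the loop structure (action down, reward and next state up) would stay the same.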