The invention relates to the field of electronic communications and video image processing, and provides a video summary generation method, system and device. The method comprises the following steps: A, receiving the input video and dividing it to obtain a candidate time point sequence; B, applying a scene segmentation algorithm to select a jumping time point sequence from the candidate time point sequence; C, extracting the video segment corresponding to each jumping time point in the jumping time point sequence, combining all the extracted video segments into a video summary, and outputting the summary. During video summary generation, the invention first acquires the eigenvector of each video frame, then selects the jumping time point sequence through hierarchical clustering, and finally extracts the corresponding video frames according to the jumping time point sequence to form the video summary. The summary thereby covers as many scenes as possible and maximizes the difference among the selected video frames, which improves the completeness of the information it conveys. In addition, the invention imposes no requirements on video type, which improves its general applicability.
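Steps B and C can be illustrated with a minimal sketch. The abstract names hierarchical clustering but does not specify the variant, so everything below is an assumption made for illustration: the function names `select_jump_points` and `segments_for`, the Euclidean distance between cluster centroids, the restriction that only temporally adjacent clusters may merge, the `num_scenes` stopping rule, and the fixed `clip_length` per segment. Feature extraction (the per-frame eigenvector) is taken as given.

```python
from math import dist


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def select_jump_points(times, features, num_scenes):
    """Pick one time point per scene by agglomerative clustering.

    times      -- candidate time points in seconds, in ascending order
    features   -- one feature vector per candidate time point
    num_scenes -- number of clusters to stop at (hypothetical stopping rule)
    """
    if len(times) != len(features):
        raise ValueError("times and features must have the same length")
    if not 1 <= num_scenes <= len(times):
        raise ValueError("num_scenes must be between 1 and len(times)")

    # Each cluster is a list of candidate indices; start from singletons.
    clusters = [[i] for i in range(len(times))]

    while len(clusters) > num_scenes:
        cents = [centroid([features[i] for i in c]) for c in clusters]
        # Assumption: only temporally adjacent clusters may merge, so every
        # cluster stays a contiguous run of candidates (one scene).
        j = min(range(len(clusters) - 1),
                key=lambda k: dist(cents[k], cents[k + 1]))
        clusters[j:j + 2] = [clusters[j] + clusters[j + 1]]

    # The first candidate in each cluster marks where that scene begins.
    return [times[c[0]] for c in clusters]


def segments_for(jump_points, clip_length):
    """Map each jump point to a (start, end) segment of fixed length."""
    return [(t, t + clip_length) for t in jump_points]
```

For example, with candidate times `[0.0, 2.0, 4.0, 6.0, 8.0, 10.0]`, feature vectors `[[1, 0], [0.9, 0.1], [0, 1], [0.1, 0.9], [0.5, 0.5], [0.5, 0.5]]` and `num_scenes=3`, the sketch groups the candidates into three pairs and returns `[0.0, 4.0, 8.0]`. Passing that result to `segments_for` with `clip_length=1.5` gives the segments that would be concatenated into the summary. A production system would more likely stop merging at a distance threshold than at a fixed cluster count; the fixed count is used here only to keep the example short.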