A method for video compression through
image processing and
object detection, to be carried out by an electronic
processing unit, based either on images or on a
digital video stream of images, the images being defined by a
single frame or by sequences of frames of said video
stream, with the aim of enhancing and then isolating the
frequency domain signals representing a content to be identified, and decreasing or ignoring the
frequency domain noise with respect to the content within the images or the video
stream, comprises the steps of: obtaining a
digital image or a sequence of digital images from either a corresponding
single frame or a corresponding sequence of frames of said video stream, all the digital images being defined in a
spatial domain; selecting one or more pairs of sparse zones, each covering at least a portion of said
single frame or at least two frames of said sequence of frames, each pair of sparse zones generating a selected feature, each zone being defined by two sequences of spatial data; transforming the selected features into
frequency domain data by combining, for each zone, said two sequences of spatial data through a 2D variation of an L-transformation, varying the
transfer function, shape and direction of the frequency domain data for each zone, thus generating a normalized complex vector for each of said selected features; combining all said normalized complex vectors to define a model of the content to be identified; and inputting said model, built from said selected features, into a classifier, thereby obtaining the data for
object detection or
visual saliency to be used for video compression.
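
The steps recited above can be illustrated with a minimal sketch in Python with NumPy. It is not the claimed implementation: the abstract does not define the L-transformation, so a single-bin discrete Fourier evaluation (the quantity a Goertzel filter computes) is used here as a stand-in. The choice of row means and column means as the two sequences of spatial data for a zone, the multiplication used to combine them, the zone layout `(y, x, h, w)`, and the names `single_bin_dft`, `zone_coefficient`, `pair_feature`, `build_model` and `to_classifier_input` are all assumptions made for illustration.

```python
import numpy as np

def single_bin_dft(seq, k):
    """Evaluate DFT bin k of a 1D sequence (stand-in for the L-transformation)."""
    n = np.arange(len(seq))
    return np.sum(seq * np.exp(-2j * np.pi * k * n / len(seq)))

def zone_coefficient(frame, zone, k_row, k_col):
    """Complex frequency-domain value for one zone given as (y, x, h, w).

    Assumption: the zone's two spatial sequences are its row means and its
    column means, combined by multiplying their single-bin DFT values.
    """
    y, x, h, w = zone
    patch = frame[y:y + h, x:x + w].astype(float)
    rows = patch.mean(axis=1)  # one value per row of the zone
    cols = patch.mean(axis=0)  # one value per column of the zone
    return single_bin_dft(rows, k_row) * single_bin_dft(cols, k_col)

def pair_feature(frame, zone_a, zone_b, k_row=1, k_col=1):
    """Normalized complex vector for one pair of zones (one selected feature)."""
    v = np.array([zone_coefficient(frame, zone_a, k_row, k_col),
                  zone_coefficient(frame, zone_b, k_row, k_col)])
    norm = np.linalg.norm(v)
    # Leave near-zero vectors unscaled to avoid amplifying rounding noise.
    return v / norm if norm > 1e-12 else v

def build_model(frame, zone_pairs):
    """Concatenate the normalized complex vectors of all selected features."""
    return np.concatenate([pair_feature(frame, a, b) for a, b in zone_pairs])

def to_classifier_input(model):
    """Split the complex model into real and imaginary parts for a classifier."""
    return np.concatenate([model.real, model.imag])

# Hypothetical input: one 64x64 grayscale frame and two pairs of sparse zones.
rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(64, 64))
zone_pairs = [((0, 0, 16, 16), (32, 32, 16, 16)),
              ((8, 40, 12, 20), (40, 8, 20, 12))]

model = build_model(frame, zone_pairs)
x = to_classifier_input(model)
```

In this sketch the bin indices `k_row` and `k_col` play the role of the per-zone variation of the frequency domain data; a fuller version could pass different values for each zone. The real-valued vector `x` is what would be handed to whichever classifier is chosen, which the abstract leaves open.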