The present invention provides a comprehensive method for automatically and unobtrusively analyzing the in-store behavior of people visiting a physical space using multi-modal fusion based on multiple types of sensors. The types of sensors employed may include cameras for capturing a plurality of images and mobile signal sensors for capturing a plurality of Wi-Fi signals. The present invention integrates the plurality of input sensor measurements to reliably and persistently track the people's physical attributes and to detect the people's interactions with retail elements. The physical and contextual attributes collected from the processed shopper tracks include the motion dynamics changes triggered by implicit and explicit interactions with a retail element, comprising the behavior information for the people's trip.
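By way of non-limiting illustration, the cross-sensor association underlying such multi-modal fusion may be sketched as follows, where a camera track and a Wi-Fi track are matched by their average spatial proximity over time. The data shapes, names (Track, associate_tracks, max_dist), and the greedy matching rule are hypothetical illustrations only, not the claimed implementation:

    import math
    from dataclasses import dataclass

    @dataclass
    class Track:
        track_id: str
        samples: list  # list of (t, x, y) tuples, sorted by t

    def position_at(track, t):
        """Linearly interpolate the track position at time t, or None if t
        falls outside the track's time span."""
        pts = track.samples
        if not pts or t < pts[0][0] or t > pts[-1][0]:
            return None
        for (t0, x0, y0), (t1, x1, y1) in zip(pts, pts[1:]):
            if t0 <= t <= t1:
                a = 0.0 if t1 == t0 else (t - t0) / (t1 - t0)
                return (x0 + a * (x1 - x0), y0 + a * (y1 - y0))
        return (pts[-1][1], pts[-1][2])

    def mean_distance(cam, wifi):
        """Average spatial distance between a camera track and a Wi-Fi
        track over the camera track's timestamps; infinity if the two
        tracks never overlap in time."""
        dists = []
        for t, x, y in cam.samples:
            p = position_at(wifi, t)
            if p is not None:
                dists.append(math.hypot(x - p[0], y - p[1]))
        return sum(dists) / len(dists) if dists else math.inf

    def associate_tracks(cam_tracks, wifi_tracks, max_dist=3.0):
        """Greedy association: each camera track claims the closest
        unclaimed Wi-Fi track within max_dist meters."""
        pairs, claimed = [], set()
        for cam in cam_tracks:
            best = min(
                (w for w in wifi_tracks if w.track_id not in claimed),
                key=lambda w: mean_distance(cam, w),
                default=None,
            )
            if best is not None and mean_distance(cam, best) <= max_dist:
                claimed.add(best.track_id)
                pairs.append((cam.track_id, best.track_id))
        return pairs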
The present invention integrates point-of-sale transaction data with the shopper behavior by finding and associating the transaction data that corresponds to a shopper trajectory and fusing them to generate a complete intermediate representation of the shopper trip data, called a TripVector. The shopper behavior analyses are carried out based on the extracted TripVector.
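By way of non-limiting illustration, a TripVector-like record and the trajectory-to-transaction association may be sketched as follows. The field names and the matching rule (nearest checkout time within a tolerance) are assumptions for illustration only; the invention itself specifies only that the corresponding transaction data is found, associated, and fused with the trajectory:

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Transaction:
        txn_id: str
        checkout_time: float   # seconds since store opening
        items: list            # purchased item identifiers

    @dataclass
    class TripVector:
        shopper_id: str
        trajectory: list       # (t, x, y) samples from the fused tracker
        interactions: list     # detected interactions with retail elements
        transaction: Optional[Transaction] = None

    def attach_transaction(trip, txns, tolerance=120.0):
        """Associate the transaction whose checkout time is closest to the
        end of the trajectory, if it falls within the tolerance (seconds)."""
        if not trip.trajectory or not txns:
            return trip
        trip_end = trip.trajectory[-1][0]
        best = min(txns, key=lambda tx: abs(tx.checkout_time - trip_end))
        if abs(best.checkout_time - trip_end) <= tolerance:
            trip.transaction = best
        return trip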
The analyzed behavior information for the shopper trips yields exemplary behavior analyses comprising map generation as a visualization of the behavior and quantitative shopper metric derivation at multiple scales (e.g., store-wide and category-level), including path-to-purchase shopper metrics (e.g., traffic distribution, shopping action distribution, buying action distribution, conversion funnel) and category dynamics (e.g., dominant path, category correlation, category sequence). The present invention includes a set of derived methods for different sensor configurations.
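By way of non-limiting illustration, one of the path-to-purchase metrics listed above, the conversion funnel, may be sketched as follows. The stage definitions (visited, interacted, purchased) and the dictionary-based trip representation are assumptions for illustration, not the claimed formulas:

    def conversion_funnel(trips, category):
        """Count shoppers at each funnel stage for one category:
        traffic (passed through), shopping (interacted), buying (purchased)."""
        traffic = shopping = buying = 0
        for trip in trips:
            if category in trip.get("categories_visited", []):
                traffic += 1
            if category in trip.get("categories_interacted", []):
                shopping += 1
            if category in trip.get("categories_purchased", []):
                buying += 1
        return {"traffic": traffic, "shopping": shopping, "buying": buying}

    # Example: yields {"traffic": 2, "shopping": 1, "buying": 1} for "dairy".
    trips = [
        {"categories_visited": ["dairy", "bakery"],
         "categories_interacted": ["dairy"],
         "categories_purchased": ["dairy"]},
        {"categories_visited": ["dairy"],
         "categories_interacted": [],
         "categories_purchased": []},
    ]
    print(conversion_funnel(trips, "dairy"))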