Article

A business application of RTLS technology in Intelligent Retail Environment: Defining the shopper's preferred path and its segmentation

Abstract and Figures

Over the last few years, shopper behaviour analysis in the retail environment has become an interesting topic both for managers who want to see the tangible impact of their trade marketing activities and researchers who are trying to identify new patterns or confirm known trends in this field. In such a context, technologies today play a central role, because of the possibility of implicitly observing how shoppers move inside the store, and collecting a wide data-set, through an unbiased approach, free from distortion. In this paper, we will describe the major outcomes from a study based on data collected through an innovative technology, Real Time Locating System (RTLS). We base our conclusions on a data-set, collected over three months of observations, composed of more than 18 million records transmitted by RTLS tags, monitoring the entire path of each shopper throughout the entire store area. The outcomes of our study are 1) the identification of the store's best performing areas based on traffic and dwell time metrics, 2) the development of a novel method to estimate the probability of in-store shopper paths and 3) a preliminary shopping trip segmentation.
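As a rough illustration of the traffic and dwell time metrics mentioned in the abstract, the sketch below aggregates tag records per store zone. The record layout `(tag_id, timestamp_seconds, zone)`, the zone names, and the rule of crediting the time between two consecutive records to the zone of the earlier record are assumptions made for this example; the study's actual data schema and metric definitions are not given here.

```python
from collections import defaultdict


def zone_metrics(records):
    """Aggregate hypothetical RTLS records into per-zone traffic and dwell time.

    records: iterable of (tag_id, timestamp_seconds, zone) tuples (assumed layout).
    Returns (traffic, dwell): distinct tags seen per zone, and total seconds per zone.
    """
    by_tag = defaultdict(list)
    for tag_id, ts, zone in records:
        by_tag[tag_id].append((ts, zone))

    visitors = defaultdict(set)
    dwell = defaultdict(float)
    for tag_id, points in by_tag.items():
        points.sort()  # chronological order for this tag
        for _ts, zone in points:
            visitors[zone].add(tag_id)
        # Assumption: time between consecutive records belongs to the earlier zone.
        for (t0, z0), (t1, _z1) in zip(points, points[1:]):
            dwell[z0] += t1 - t0

    traffic = {zone: len(tags) for zone, tags in visitors.items()}
    return traffic, dict(dwell)


records = [
    ("cart-1", 0, "entrance"),
    ("cart-1", 10, "produce"),
    ("cart-1", 40, "checkout"),
    ("cart-2", 5, "entrance"),
    ("cart-2", 20, "produce"),
    ("cart-2", 25, "checkout"),
]
traffic, dwell = zone_metrics(records)
```

In this toy data each zone is visited by both carts, and the last record of each trip contributes no dwell time because nothing follows it.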
... The work of Ferracuti et al. [7] concerns the retail environment and uses Real-Time Locating System (RTLS) tags to collect human trajectory data. The tags were used to infer visitors' preferred paths and their segmentation. ...
... The data were collected with a tracking system based on Ultra Wideband (UWB) technology, with tags embedded in shopping carts. The UWB is suitable for applications where positioning accuracy is a critical issue [7]. This technology uses some UWB antennas that are suitably placed in a fixed area and battery-powered tags that can freely move in the area [28]. ...
Article
Public space is usually conceived as the place where people live, perceive, and interact with other people. The environment affects people in several different ways as well. The impact of environmental problems on humans is significant, affecting all human activities, including health and socio-economic development. Thus, there is a need to rethink how space is used. Addressing the needs raised by the climate emergency, the pandemic and digitization, this paper contributes opportunities for developing generative approaches to space design and utilization. We propose GREEN PATH, an intelligent expert system for space planning. GREEN PATH uses human trajectories and deep learning methods to analyse and understand human behaviour in order to offer insights to layout designers. In particular, a Generative Adversarial Imitation Learning (GAIL) framework hybridised with classical reinforcement learning methods is proposed. One example of the classical reinforcement learning methods used is continuous penalties, which allow us to model the shape of the trajectories and to insert into the training a bias that is necessary for generation. The structure of the framework and the formalisation of the problem allow the results to be evaluated in terms of both generation and prediction. The use case is a retail domain chosen as a demonstrator for optimising the layout environment and improving the shopping experience. Experiments were carried out on shoppers' trajectories obtained from four different stores over two years.
... Both RFID and UWB are based on tracking tags. Compared to RFID, UWB has several advantages, such as that tracking with UWB leads to more accurate results (see Ferracuti et al. 2019 for an overview). ...
... Cameras require good lighting and an unobstructed line-of-sight to achieve high accuracy (Liu, Yanlei, and Kamijo 2015), which is difficult to find in a retail environment. In contrast, radio signals can propagate unimpeded through walls, clothing, and equipment without causing interference (Ferracuti et al. 2019). ...
... In the near future, it is certain that the Brazilian supermarket retail sector will undergo a series of competitive adjustments, pressured by irreversible competition (Bach et al., 2020; Saab and Gimenez, 2000; Varotto, 2018) that will lead companies to exploit new competitive advantages. Ferracuti et al. (2019) highlighted the importance of identifying consumers' behavior patterns and anticipating new consumer trends in retail. For this purpose, information stored in databases, supported by database marketing techniques (Lian et al., 2019), has proved to be a fundamental element in contributing to the marketing strategies of organizations in order to target each profile with the most suitable stimuli (Schiffman and Kanuk, 1991). ...
... The process operates as a continuous learning cycle. An advantage of this framework is that it allows retailers to test store design predictions such as the traffic flow behavior when customers enter a store or the popularity of store displays placed in different areas of the store (Ferracuti et al. 2019; Underhill). ...
... The execute layer in the ACT phase employs the analytical results and insights from the analytic layer to take actions by improving layout, measuring the success of the improved layout, evaluating the results obtained and continuous revision of the created layout. Two examples of using the STAL framework would be studying maps of customer density or time spent in stores (see Ferracuti et al. 2019) to generate optimal layouts. Layout variables managers can consider include store design variables (e.g., space design, point-of-purchase displays, product placement, placement of cashiers), employees (e.g., number, placement), and customers (e.g., crowding, visit duration, impulse purchases, use of furniture, waiting queue formation, receptivity to product displays). ...
... As a consequence, indoor positioning systems and indoor LBSs are experiencing a growing interest from both academics and industry: a recent market report predicted a compound annual growth rate of 22.4% between 2022 and 2027 [3]. This growing interest has been observed in a variety of use cases, such as healthcare, where they increase productivity by locating more efficiently people or assets [4], [5]; retail stores, where they are used to evaluate shopping behavior, marketing techniques, and inventory management [6], [7]; navigation of people and autonomous vehicles in search and rescue missions [8], [9]; cargo tracking and fleet management in logistics and transportation services [10], [11]; providing way-finding, multimedia guides, and content recommendation for augmented reality applications in museums [12], [13]; smart environments, where it can support ambient assisted living in addition to navigation guidance [14], [15]; and, more recently, contact tracing applications in a pandemic, aiding the health services by automatically identifying close contacts [16], [17]. ...
Article
Positioning systems have become increasingly popular in the last decade for location-based services such as navigation and asset tracking and management. As opposed to outdoor positioning, where the Global Navigation Satellite System became the standard technology, there is no consensus yet for indoor environments despite the availability of different technologies, such as radiofrequency, magnetic field, visible light communications, or acoustics. Among these options, acoustics has emerged as a promising alternative for obtaining high-accuracy, low-cost systems. Nevertheless, acoustic signals face very demanding propagation conditions, particularly in terms of multipath and the Doppler effect. Therefore, even though many acoustic positioning systems have been proposed in recent decades, it remains an active and challenging topic. This paper surveys the prototypes and commercial systems presented from their first appearance around the 1980s up to 2022. We classify these systems into groups according to the observable they use to calculate the user position, such as the Time-Of-Flight, the Received Signal Strength, or the acoustic spectrum. Furthermore, we summarize the main properties of these systems in terms of accuracy, coverage area and update rate, among others. Finally, we evaluate the limitations of these groups using the link budget approach, which gives an overview of a system's coverage from parameters such as source and noise level, detection threshold, attenuation, and processing gain.
... User constraints can be classified into cluster-level constraints, which specify requirements on clusters, and instance-level constraints, which specify requirements on instance pairs. An instance-level constraint (also called a pairwise constraint) is a constraint on pairs of instances (Ferracuti et al., 2019). There are two types of instance-level constraints that were first introduced by, bound and nonlinkable constraints. ...
Article
Identifying customers is a necessary part of evaluating a business, so that the business can continue to grow and keep pace with developments in its sector. A deep constraint clustering approach is used to cluster the customers of a business. In this study, customers who use rail mass transportation are clustered. The result is the formation of 6 clusters of train users. The results of this research are expected to serve as a consideration in improving the company's services.
... 2 | RELATED CONCEPTS AND THEORETICAL BASIS 2.1 | Synergy benefits of new retail and supply chain 1. The retail industry is where commodity producers (workers, farmers) sell their produced products to society through commodity transactions (Ferracuti et al., 2019; Santoro et al., 2019) to meet their necessary life needs and social needs. This type of industry responsible for the sale of goods is called the retail industry (Byun et al., 2020). ...
Article
With the rapid development of the Internet of Things, the Internet, and artificial intelligence, people's consumption patterns have undergone major changes. Consumers' consumption channels are no longer monolithic, and a new retail model has been born. How to improve the revenue of the new retail supply chain, and how to distribute that revenue scientifically and reasonably, is therefore particularly important. Against this background, the concept of new retail is defined, and the principle of supply chain synergy is analysed. A Stackelberg game is used to analyse and determine the optimal selling and wholesale prices of retailers and suppliers in the supply chain. The defects of the traditional supply chain model and the necessity of optimization are further analysed and demonstrated. Starting from the traditional retail supply chain, new retail channels are introduced and a second-order supply chain is established. The profitability before and after adding the new retail channels is analysed. The Stackelberg game and the Shapley value method are used for modelling. Finally, Matlab is used for numerical analysis. The results show that adding new retail channels to the traditional supply chain dominated by suppliers will increase the market share of products. The market share f is positively correlated with the profits of suppliers and retailers. Under the multi-channel retail model, product demand and supply chain profits are higher than under a single channel. The Shapley value method enables supply chain members to obtain benefits commensurate with their input, thereby achieving a win–win situation for retail supply chain members.
... The aspects listed are all important to the operation of a retail store, but for the purpose of this paper, visual merchandising will be examined in more detail. The term visual merchandising refers to a set of policies, rules, practices, and procedures that are designed to optimize the placement of one or more products in a store [2]. This scenario includes planogram systems, tools that allow optimizing space management based on the available assortment, simplifying the consumer's buying process and thus increasing the retailer's profit. ...
Chapter
This work proposes a pipeline that recognizes the products on a shelf, at the level of the single SKU (Stock Keeping Unit), starting from a photo of that shelf. It is composed of a first neural network, trained on the SKU110K dataset, that detects the individual products on the shelf, and a second network, designed and built within this work, that associates each image cropped by the first network with an embedding vector describing its distinctive features. Given the embedding vector of an input image, its similarity to every embedding vector in the comparison dataset can be measured by means of cosine similarity. The vector with the highest cosine similarity belongs to an image labeled with an EAN (European Article Number) code, and this EAN is assigned to the input image. Because no existing dataset offers the granular level of detail (EAN labels) that this task requires, a new, purpose-built dataset is created.
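The matching step described above (pick the labeled embedding with the highest cosine similarity) can be sketched in a few lines. The three-dimensional vectors and the `EAN-A`/`EAN-B`/`EAN-C` labels are made-up placeholders, and `best_match` is a hypothetical helper name; the embedding network itself is not reproduced here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_match(query, gallery):
    """Return the label whose embedding is most similar to the query.

    gallery: dict mapping a label (here a placeholder EAN) to its embedding.
    """
    return max(gallery, key=lambda label: cosine_similarity(query, gallery[label]))


# Placeholder gallery; real embeddings would come from the second network.
gallery = {
    "EAN-A": [1.0, 0.0, 0.0],
    "EAN-B": [0.0, 1.0, 0.0],
    "EAN-C": [0.6, 0.8, 0.0],
}
query = [0.5, 0.9, 0.1]
predicted = best_match(query, gallery)
```

A linear scan like this is only practical for small galleries; a larger comparison dataset would presumably need an approximate nearest-neighbour index, which is outside the scope of this sketch.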
Article
With the proliferation of online social networking services and mobile smart devices equipped with mobile communication and position sensor modules, massive amounts of multimedia data have been collected, stored and shared. This trend places higher demands on massive multimedia data retrieval. In this paper, we investigate a novel spatial query named region of visual interests query (RoVIQ), which aims to search for users associated with geographical information and visual words. Three baseline methods are presented to show how existing techniques can be exploited to address this problem. We then define this query and its related notions for the first time. To improve query performance, we propose a novel spatial indexing structure called quadtree based inverted visual index, which combines a quadtree, an inverted index and visual words. Based on it, we design an efficient search algorithm named region of visual interests search to support RoVIQ. Experimental evaluations on real geo-image datasets demonstrate that our solution outperforms the state-of-the-art method.
Article
Due to advances in mobile computing and multimedia techniques, vast amounts of multimedia data with geographical information are collected in a wide range of applications. In this paper, we propose a novel type of image search named interactive geo-tagged image search, which aims to find a set of images based on geographical proximity and similarity of visual content, as well as the preferences of users. Existing approaches for spatial keyword queries and geo-image queries cannot address this problem effectively, since they do not consider these three types of information together. To solve this challenge efficiently, we define the interactive top-k geo-tagged image query and then present a framework consisting of a candidate search stage, an interaction stage and a termination stage. To enhance search efficiency in a large-scale database, we propose a candidate search algorithm named GI-SUPER Search, based on a new notion called superior relationship and on GIR-Tree, a novel index structure. Furthermore, two candidate selection methods are proposed for learning the preferences of the user during the interaction. Finally, the termination procedure and estimation procedure are introduced briefly. Experimental evaluation on a real multimedia dataset demonstrates that our solution achieves high performance.
Article
Massive amounts of multimedia data that contain timestamps and geographical information are being generated at an unprecedented scale in many emerging applications, such as photo sharing web sites and social network applications. Due to their importance, a large body of work has focused on efficiently computing various spatial image queries. In this paper, we study the spatial temporal image query, which considers three important constraints during the search: time recency, spatial proximity and visual relevance. A novel index structure, namely Hierarchical Information Quadtree (HI-Quadtree), is proposed to efficiently insert and delete spatial temporal images with high arrival rates. Based on HI-Quadtree, an efficient algorithm is developed to support the spatial temporal image query. Extensive experiments with real spatial databases clearly demonstrate the efficiency of our methods. © 2018 Springer Science+Business Media, LLC, part of Springer Nature
Article
With advances in multimedia technologies and the proliferation of smart phones, digital cameras and storage devices, a rapidly growing amount of multimedia data is collected in many applications, such as multimedia retrieval and management systems, in which a data element is composed of text, image, video and audio. Consequently, the study of multimedia near duplicate detection has attracted significant attention from research organizations and commercial communities. The traditional solution, minwise hashing (MinWise), faces two challenges: expensive preprocessing time and low comparison speed. Thus, this work first introduces a hashing method called one permutation hashing (OPH) to avoid the costly preprocessing time. Based on OPH, a more efficient strategy, group based one permutation hashing (GOPH), is developed to deal with the high comparison time. Based on the fact that the similarity of most multimedia data is not very high, this work designs a new hashing method, namely hierarchical one permutation hashing (HOPH), to further improve the performance. Comprehensive experiments on real multimedia datasets clearly show that, with similar accuracy, HOPH is five to seven times faster than MinWise. © 2018 Springer Science+Business Media, LLC, part of Springer Nature
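To make the idea behind one permutation hashing concrete, here is a simplified sketch of the basic scheme only: hash every element once, split the hash range into k bins, keep the smallest value per bin, and estimate set similarity from the bins that agree. It does not implement the GOPH or HOPH variants named in the abstract. The SHA-1 based `hash64` helper, the default `k=64`, and the estimator (matching bins divided by bins that are non-empty in at least one sketch) are choices made for this example.

```python
import hashlib


def hash64(item):
    """Deterministic 64-bit hash of an item (helper assumed for this sketch)."""
    digest = hashlib.sha1(str(item).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def oph_sketch(items, k=64):
    """One hash pass: keep the minimum in-bin offset for each of k bins."""
    bin_width = 2**64 // k
    sketch = [None] * k  # None marks an empty bin
    for item in items:
        h = hash64(item)
        b = min(h // bin_width, k - 1)  # clamp in case k does not divide 2**64
        offset = h - b * bin_width
        if sketch[b] is None or offset < sketch[b]:
            sketch[b] = offset
    return sketch


def oph_similarity(sketch_a, sketch_b):
    """Estimate resemblance: matching bins over bins non-empty in at least one sketch."""
    both_empty = sum(1 for x, y in zip(sketch_a, sketch_b) if x is None and y is None)
    matches = sum(1 for x, y in zip(sketch_a, sketch_b) if x is not None and x == y)
    usable = len(sketch_a) - both_empty
    return matches / usable if usable else 0.0


a = set(range(200))
b = set(range(100, 300))      # overlaps with a
c = set(range(1000, 1200))    # disjoint from a
same = oph_similarity(oph_sketch(a), oph_sketch(a))
partial = oph_similarity(oph_sketch(a), oph_sketch(b))
disjoint = oph_similarity(oph_sketch(a), oph_sketch(c))
```

Each element is hashed once here, whereas classic minwise hashing applies k independent permutations per set, which is where the preprocessing saving comes from.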
Article
For image retrieval methods based on bag of visual words, much attention has been paid to enhancing the discriminative power of the local features. Although retrieved images are usually similar to a query in minutiae, they may be significantly different from a semantic perspective, a difference that convolutional neural networks (CNN) can effectively distinguish. Such images should not be considered as relevant pairs. To tackle this problem, we propose to construct a dynamic match kernel by adaptively calculating the matching thresholds between query and candidate images based on the pairwise distance among deep CNN features. In contrast to the typical static match kernel, which is independent of the global appearance of retrieved images, the dynamic one leverages semantic similarity as a constraint for determining the matches. Accordingly, we propose a semantic-constrained retrieval framework incorporating the dynamic match kernel, which focuses on matched patches between relevant images and filters out those of irrelevant pairs. Furthermore, we demonstrate that the proposed kernel complements recent methods such as Hamming embedding, multiple assignment, local descriptor aggregation and graph-based re-ranking, while it outperforms the static one under various settings on off-the-shelf evaluation metrics. We also propose to evaluate the matched patches both quantitatively and qualitatively. Extensive experiments on five benchmark datasets and large-scale distractors validate the merits of the proposed method against the state-of-the-art methods for image retrieval.
Article
An end-to-end architecture for multi-script document retrieval using handwritten signatures is proposed in this paper. The user supplies a query signature sample, and the system returns only the documents that contain the query signature. In the first stage, a component-wise classification technique separates the potential signature components from all other components. A bag-of-visual-words model powered by SIFT descriptors in a patch-based framework is proposed to compute the features, and a support vector machine (SVM)-based classifier is used to separate signatures from the documents. In the second stage, features from the foreground (i.e., signature strokes) and the background spatial information (i.e., background loops, reservoirs etc.) are combined to characterize the signature object to be matched with the query signature. Finally, three distance measures are used to match a query signature with the signatures present in target documents for retrieval. The ‘Tobacco’ (The Legacy Tobacco Document Library (LTDL). University of California, San Francisco, 2007. http://legacy.library.ucsf.edu/) document database and an Indian script database containing 560 documents of Devanagari (Hindi) and Bangla scripts were used for the performance evaluation. The proposed system was also tested on noisy documents, and promising results were obtained. A comparative study shows that the proposed method outperforms the state-of-the-art approaches.
Article
This paper aims to solve the problem of large-scale video retrieval by a query image. First, we define the problem of the top-k image to video query. Then, we combine the merits of convolutional neural networks (CNN for short) and the Bag of Visual Word (BoVW for short) module to design a model for extracting and representing video frame information. To meet the requirements of large-scale video retrieval, we propose a visual weighted inverted index (VWII for short) and a related algorithm to improve the efficiency and accuracy of the retrieval process. Comprehensive experiments show that our proposed technique achieves substantial improvements (up to an order of magnitude speed-up) over the state-of-the-art techniques with similar accuracy.
Article
Video-based person re-identification (re-id) is a central application in surveillance systems and is of significant concern for security. Matching persons across disjoint camera views in their video fragments is inherently challenging due to the large visual variations and uncontrolled frame rates. Two steps are crucial to person re-id, namely discriminative feature learning and metric learning. However, existing approaches consider the two steps independently, and they do not make full use of the temporal and spatial information in videos. In this paper, we propose a Siamese attention architecture that jointly learns spatio-temporal video representations and their similarity metrics. The network extracts local convolutional features from regions of each frame, and enhances their discriminative capability by focusing on distinct regions when measuring the similarity with another pedestrian video. The attention mechanism is embedded into spatial gated recurrent units to selectively propagate relevant features and memorize their spatial dependencies through the network. The model essentially learns which parts (where) from which frames (when) are relevant and distinctive for matching persons, and attaches higher importance to them. The proposed Siamese model is end-to-end trainable to jointly learn comparable hidden representations for paired pedestrian videos and their similarity value. Extensive experiments on three benchmark datasets show the effectiveness of each component of the proposed deep network while outperforming state-of-the-art methods.
Conference Paper
This paper studies the set similarity join problem with overlap constraints which, given two collections of sets and a constant c, finds all the set pairs in the datasets that share at least c common elements. This is a fundamental operation in many fields, such as information retrieval, data mining, and machine learning. The time complexity of all existing methods is O(n^2), where n is the total size of all the sets. In this paper, we present a size-aware algorithm with the time complexity of O(n^(2-1/c) k^(1/(2c))) = o(n^2) + O(k), where k is the number of results. The size-aware algorithm divides all the sets into small and large ones based on their sizes and processes them separately. We can use existing methods to process the large sets and focus on the small sets in this paper. We develop several optimization heuristics for the small sets to improve the practical performance significantly. As the size boundary between the small sets and the large sets is crucial to the efficiency, we propose an effective size boundary selection algorithm to judiciously choose an appropriate size boundary, which works very well in practice. Experimental results on real-world datasets show that our methods achieve high performance and outperform the state-of-the-art approaches by up to an order of magnitude.
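For orientation, the problem statement above (find all set pairs sharing at least c elements) can be written as a short baseline. This is a plain inverted-index counting approach with quadratic worst-case cost, not the size-aware algorithm of the paper, and for brevity it joins one collection with itself rather than two collections. The function name `overlap_join` and the sample data are illustrative.

```python
from collections import defaultdict
from itertools import combinations


def overlap_join(sets, c):
    """Return index pairs (i, j), i < j, whose sets share at least c elements.

    Baseline sketch: build an inverted index from element to set ids, then
    count co-occurrences per pair. Worst-case quadratic; no size-aware split.
    """
    index = defaultdict(list)
    for sid, s in enumerate(sets):
        for element in s:
            index[element].append(sid)  # ids are appended in increasing order

    shared = defaultdict(int)
    for ids in index.values():
        for i, j in combinations(ids, 2):
            shared[(i, j)] += 1

    return sorted(pair for pair, count in shared.items() if count >= c)


sets = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}, {7, 8}]
pairs = overlap_join(sets, c=2)
```

With c=2, sets 0 and 1 share {2, 3} and sets 1 and 2 share {3, 4}, while sets 0 and 2 share only {3} and are excluded.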