Figure - uploaded by Madeleine Daepp
Sources of data for food outlet locations in the city of Vancouver, Canada

Source publication
Article
Objective The present study assessed systematic bias and the effects of data set error on the validity of food environment measures in two municipal and two commercial secondary data sets. Design Sensitivity, positive predictive value (PPV) and concordance were calculated by comparing two municipal and two commercial secondary data sets with groun...

Context in source publication

Context 1
... outlet data were obtained from five sources: (i) groundtruthed primary data; (ii) (municipal) Business Licences (36); (iii) (municipal) Vancouver Coastal Health inspections lists (37); (iv) (commercial) Pitney Bowes Software's Canada Business Points (38); and (v) (commercial) DMTI Spatial, Inc.'s Enhanced Points of Interest (39). An overview of these data sets is provided in Table 2. ...

Similar publications

Article
Contextual determinants of health in Canada include community demographics, tobacco, food, and alcohol. We developed an on-line and highly accessible interactive map containing information from over 2000 communities across Canada.

Citations

... All variations captured across the municipalities were coded this way, with emerging codes and definitions compiled in a codebook by consensus between the two team members. Diagrams were developed to help conceptualize retailer categories and map interrelationships, informed by literature from Daepp and Black (31) and Prowse et al. (32). Each retailer sub-category was then assigned a label of healthy/less healthy, guided by CDC criteria (9). ...
... However, census tracts have been consistently used for Alberta's Nutrition Report Card and are stable units for annual comparisons. Finally, we relied on secondary datasets that were not ground-truthed (a resource-intensive process), which may inaccurately represent spatial accessibility by including retailers that have closed or neglecting some retailers altogether (31). ...
... Our retailer sub-categories could be used in at least three additional ways by research, practice, and/or policy teams: 1) applying our coding beyond Alberta, since many of the retailer categories in our codebook exist elsewhere in North America; 2) extending the mRFEI by inductively developing unique sub-categories based on local data and/or excluded categories outlined in Table 1 (with transparent reporting of codebook details); and 3) developing new geospatial indicators based on indicated and other sub-categories. For example, Alberta's Nutrition Guidelines' (46) healthfulness labels for foods and beverages (choose most often, choose sometimes, choose least often) could be applied to retailer sub-categories to assess implementation of jurisdiction-specific nutrition policies, as has been done by team collaborators (11,31,47). Using such guidelines could be one step towards moving past binary ... (https://doi.org/10.1017/S1368980023000733)
Article
Objective: Limitations of traditional geospatial measures, like the modified Retail Food Environment Index (mRFEI), are well-documented. In response, we aimed to: 1) extend existing food environment measures by inductively developing sub-categories to increase the granularity of healthy versus less healthy food retailers; 2) establish replicable coding processes and procedures; and 3) demonstrate how a food retailer codebook and database can be used in healthy public policy advocacy. Design: We expanded the mRFEI measure such that "healthy" food retailers included grocery stores, supermarkets, hypermarkets, wholesalers, bulk food stores, produce outlets, butchers, delis, fish and seafood shops, juice/smoothie bars, and fresh and healthy quick-service retailers; and "less healthy" food retailers included fast-food restaurants, convenience stores, coffee shops, dollar stores, pharmacies, bubble tea restaurants, candy stores, frozen dessert restaurants, bakeries, and food trucks. Based on 2021 government food premise licenses, we used geographic information systems software to evaluate spatial accessibility of healthy and less healthy food retailers across census tracts and in proximity to schools, calculating differences between the traditional versus expanded mRFEI. Setting: Calgary and Edmonton, Canada. Participants: N/A. Results: Of the 10,828 food retailers geocoded, 26% were included using traditional mRFEI measures, while 53% were included using our expanded categorization. Changes in mean mRFEI across census tracts were minimal, but the healthfulness of food environments surrounding schools significantly decreased. Conclusions: Overall, we show how our mRFEI adaptation, and transparent reporting on its use, can promote more nuanced and comprehensive food environment assessments to better support local research, policy, and practice innovations.
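The index underlying this abstract can be illustrated with a minimal sketch. It assumes the usual CDC formulation of the mRFEI, namely the percentage of healthy retailers among all healthy and less healthy retailers in an area; the function name and the example counts are hypothetical and are not taken from the study.

```python
from typing import Optional


def mrfei(healthy: int, less_healthy: int) -> Optional[float]:
    """Return the mRFEI score (0-100) for one area, e.g. a census tract.

    Assumed formulation: 100 * healthy / (healthy + less_healthy).
    Returns None when the area has no classified retailers at all,
    because the ratio is undefined in that case.
    """
    total = healthy + less_healthy
    if total == 0:
        return None
    return 100.0 * healthy / total


# Hypothetical tract: 2 healthy retailers and 8 less healthy retailers.
print(mrfei(2, 8))
```

Expanding either category list changes which retailers enter the numerator and the denominator, which is why a traditional and an expanded categorization can give different scores for the same tract.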
... Many studies have been conducted on RFE, and the researchers come from a variety of fields, mainly public health, health geography, and urban planning. They concentrated on the RFE neighborhood disparities [8] [9], the associations between RFE and diet-related outcomes [4] [5] [6] [10], food environment assessment methods [11] [12] and secondary data validity [9] [13] [14] [15] [16] [17]. These practical studies come primarily from developed countries such as Canada [6] [11] [13], the United States [5] [17] [18], the United Kingdom [19] [20], and Spain [21], whereas developing countries contributed relatively limited experience and knowledge of RFE. ...
... They concentrated on the RFE neighborhood disparities [8] [9], the associations between RFE and diet-related outcomes [4] [5] [6] [10], food environment assessment methods [11] [12] and secondary data validity [9] [13] [14] [15] [16] [17]. These practical studies come primarily from developed countries such as Canada [6] [11] [13], the United States [5] [17] [18], the United Kingdom [19] [20], and Spain [21], whereas developing countries contributed relatively limited experience and knowledge of RFE. There are significant disparities in dietary behaviors and food environments between countries with diverse cultural backgrounds, particularly in food culture. ...
... Studies of neighbourhood food environments typically rely on commercial or registry-based data systems. Previous work has focused on data quality and geographical biases in commercial data sources (Lebel et al. 2017; Daepp and Black 2017; Clary and Kestens 2013). Yet it is unclear whether community science food outlet data are a reliable alternative to costly commercial datasets, which can be used by academics, practitioners, and policy makers to understand food environments in real time. ...
Article
We conducted a case study to assess the validity of community science (Yelp, OpenStreetMaps) and commercial (DMTI) food outlet datasets. We compared counts of food outlets from 13 street segments in Vancouver and Montreal to Google Street View. We found that OpenStreetMaps correctly identified the most outlets in both cities and DMTI consistently overcounted outlets. In Vancouver, we assessed validity by outlet type; again, OpenStreetMap performed the best overall but largely missed grocery stores, and Yelp did not include convenience stores. Results provide insights into using different commercial and open-source datasets to measure food environments.
... Consistent with previous studies that utilized health licensing and permits databases to enumerate restaurants and food retailers within a specified location [9,19], we identified publicly available datasets from local and state public health agencies to enumerate the baseline sample of restaurants and food retailers in NYC. For restaurants, we used the NYC Department of Health and Mental Hygiene (DOHMH) Restaurant Inspections database [20]. ...
... Staff then searched for other available evidence with a calendar date in the form of reliable customer reviews, website information, online photos, or reliable news articles. Dates of evidence were recorded, and food outlets were classified as open pre-COVID-19 if they were open prior to 2020. If food outlets were found to be closed in 2019, they were marked as closed pre-COVID-19. ...
Article
Background COVID-19 mitigation strategies have had an untold effect on food retail stores and restaurants. Early evidence from New York City (NYC) indicated that these strategies, among decreased travel from China and increased fears of viral transmission and xenophobia, were leading to mass closures of businesses in Manhattan’s Chinatown. The constantly evolving COVID-19 crisis has caused research design and methodology to fundamentally shift, requiring adaptable strategies to address emerging and existing public health problems such as food security that may result from closures of food outlets. Objective We describe innovative approaches used to evaluate changes to the food retail environment amidst the constraints of the pandemic in an urban center heavily burdened by COVID-19. Included are challenges faced, lessons learned and future opportunities. Methods First, we identified six diverse neighborhoods in NYC: two lower-resourced, two higher-resourced, and two Chinese ethnic enclaves. We then developed a census of food outlets in these six neighborhoods using state and local licensing databases. To ascertain the status (open vs. closed) of outlets pre-pandemic, we employed a manual web-scraping technique. We used a similar method to determine the status of outlets during the pandemic. Two independent online sources were required to confirm the status of outlets. If two sources could not confirm the status, we conducted phone call checks and/or in-person visits. Results The final baseline database included 2585 food outlets across six neighborhoods. Ascertaining the status of food outlets was more difficult in lower-resourced neighborhoods and Chinese ethnic enclaves compared to higher-resourced areas. Higher-resourced neighborhoods required fewer phone call and in-person checks for both restaurants and food retailers than other neighborhoods.
Conclusions Our multi-step data collection approach maximized safety and efficiency while minimizing cost and resources. Challenges in remote data collection varied by neighborhood and may reflect the different resources or social capital of the communities; understanding neighborhood-specific constraints prior to data collection may streamline the process.
... These covariates are education and median family income used in the sale description models and the proportion of immigrants calculated from the 2011 Canadian National Household Survey (35). The covariates also include the availability of recreational facilities encouraging physical exercise (the number of facilities per resident) obtained from the Canada Business Point data, which are annually updated business enumeration data and validated previously for the accuracy of business locations (36,37). The availability of recreational facilities was calculated as the number of facilities divided by 1000 residents for each neighbourhood, where recreational facilities were defined based on the Standard Industry Classification codes as previously described (38). ...
Article
Objective Geographic measurement of diets is generally not available at areas smaller than a national or provincial (state) scale, as existing nutrition surveys cannot achieve sample sizes needed for an acceptable statistical precision for small geographic units such as city subdivisions. Design Using geocoded Nielsen grocery transaction data collected from supermarket, supercentre and pharmacy chains combined with a gravity model that transforms store-level sales into area-level purchasing, we developed small-area public health indicators of food purchasing for neighbourhood districts. We generated the area-level indicators measuring per-resident purchasing quantity for soda, diet soda, flavoured (sugar-added) yogurt, and plain yogurt. We then provided an illustrative public health application of these indicators as covariates for an ecological spatial regression model to estimate spatially correlated small-area risk of type 2 diabetes mellitus (T2D) obtained from the public health administrative data. Setting Greater Montreal, Canada in 2012. Participants Neighbourhood districts (n=193). Results The indicator of flavoured yogurt had a positive association with neighbourhood-level risk of T2D (1.08, 95% Credible Interval [CI]: 1.02-1.14), while that of plain yogurt had a negative association (0.93, 95% CI: 0.89-0.96). The indicator of soda had an inconclusive association, and that of diet soda was excluded due to collinearity with soda. The addition of the indicators also improved model fit of the T2D spatial regression (Watanabe-Akaike information criterion = 1,765 with the indicators, 1,772 without). Conclusion Store-level grocery sales data can be used to reveal micro-scale geographic disparities and trends of food selections that would be masked by traditional survey-based estimation.
... We also excluded food retail outlets inside institutions (e.g. schools and universities) because they are not freely available to the general public and so do not constitute part of the community food environment, but rather the organizational food environment [3,11]. ...
... For the validity analysis of GE data, we used the ground-truth data as the gold standard for comparison. Following previous studies, the algorithm matched each food retail outlet according to name and geographic location [11]. A food retailer was considered a true positive (TP) if it was listed in both GE and the ground-truth data, a false positive (FP) if it was listed in GE but not in the ground-truth data and a false negative (FN) if it was listed in the ground-truth data but not in the GE. ...
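The true positive, false positive and false negative definitions in this excerpt can be sketched with set operations. This is only an illustration under stated assumptions: it presumes outlets have already been matched and reduced to comparable keys (here hypothetical name strings), and the helper names are not from the cited study.

```python
from typing import Optional, Set, Tuple


def validity_counts(secondary: Set[str], ground_truth: Set[str]) -> Tuple[int, int, int]:
    """Count TP, FP and FN for a secondary source against ground truth."""
    tp = len(secondary & ground_truth)   # listed in both sources
    fp = len(secondary - ground_truth)   # listed in the secondary source only
    fn = len(ground_truth - secondary)   # found in the field only
    return tp, fp, fn


def sensitivity(tp: int, fn: int) -> Optional[float]:
    """Share of ground-truthed outlets that the secondary source lists."""
    return tp / (tp + fn) if (tp + fn) else None


def ppv(tp: int, fp: int) -> Optional[float]:
    """Share of secondary-source listings that exist on the ground."""
    return tp / (tp + fp) if (tp + fp) else None


# Hypothetical outlet keys for one census tract.
listed = {"bakery a", "market b", "cafe c", "kiosk d"}
observed = {"bakery a", "market b", "cafe c", "grocer e", "butcher f"}

tp, fp, fn = validity_counts(listed, observed)
print(tp, fp, fn, sensitivity(tp, fn), ppv(tp, fp))
```

Specificity and negative predictive value also need true negatives, which these three counts do not provide, so they are left out of the sketch.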
... Contrary to our initial hypothesis, the present study did not find a clear relationship between social vulnerability and validity of GE. This finding is similar to the results of other studies using secondary data [11,13,14]. In the 2017 meta-analysis of the validity of commercial business data, seven of the nine studies that examined neighbourhood socioeconomic status showed that there were no significant differences in validity across neighbourhoods [19]. ...
Article
To overcome the challenge of obtaining accurate data on community food retail, we developed an innovative tool to automatically capture food retail data from Google Earth (GE). The proposed method is relevant to non-commercial use or scholarly purposes. We aimed to test the validity of web sources data for the assessment of community food retail environment by comparison to ground-truth observations (gold standard). A secondary aim was to test whether validity differs by type of food outlet and socioeconomic status (SES). The study area included a sample of 300 census tracts stratified by SES in two of the largest cities in Brazil, Rio de Janeiro and Belo Horizonte. The GE web service was used to develop a tool for automatic acquisition of food retail data through the generation of a regular grid of points. To test its validity, this data was compared with the ground-truth data. Compared to the 856 outlets identified in 285 census tracts by the ground-truth method, the GE interface identified 731 outlets. In both cities, the GE interface scored moderate to excellent compared to the ground-truth data across all of the validity measures: sensitivity, specificity, positive predictive value, negative predictive value and accuracy (ranging from 66.3 to 100%). The validity did not differ by SES strata. Supermarkets, convenience stores and restaurants yielded better results than other store types. To our knowledge, this research is the first to investigate using GE as a tool to capture community food retail data. Our results suggest that the GE interface could be used to measure the community food environment. Validity was satisfactory for different SES areas and types of outlets.
... These studies have applied conventional epidemiological diagnostic measures such as sensitivity and positive-predictive value (PPV) (12) to assess the accuracy of a CAB dataset (10). Previous validation studies in the North American context have demonstrated a wide range of agreement between CAB and other datasets (10,13,14). While some studies have reported high levels of agreement between commercial and governmental datasets in urban centres (10), others have indicated that government datasets are less error-prone and may be better for specific food environment measures (14). ...
... Previous validation studies in the North American context have demonstrated a wide range of agreement between CAB and other datasets (10,13,14). While some studies have reported high levels of agreement between commercial and governmental datasets in urban centres (10), others have indicated that government datasets are less error-prone and may be better for specific food environment measures (14). As the literature has grown, data accuracy in rural contexts has increasingly been studied (10,11). ...
... These cross-tabulations were repeated for the data after stratification by store-type, ownership and rurality. Stratification is a commonly used method to assess differences in exposures employed in food environment studies (14). Rural-urban classifications as well as store type and size are well-described potential stratifiers in the rural and regional context (38), and ownership (chain/independent) is an important emerging attribute given food industry consolidation (39). ...
Article
Objective Commercially available business (CAB) datasets for food environments have been investigated for error in large urban contexts and some rural areas, but there is a relative dearth of literature that reports error across regions of variable rurality. The objective of the current study was to assess the validity of a CAB dataset using a government dataset at the provincial scale. Design A ground-truthed dataset provided by the government of Newfoundland and Labrador (NL) was used to assess a popular commercial dataset. Concordance, sensitivity, positive-predictive value (PPV) and geocoding errors were calculated. Measures were stratified by store types and rurality to investigate any association between these variables and database accuracy. Setting NL, Canada. Participants The current analysis used store-level (ecological) data. Results Of 1125 stores, there were 380 stores that existed in both datasets and were considered true-positive stores. The mean positional error between a ground-truthed and test point was 17·72 km. When compared with the provincial dataset of businesses, grocery stores had the greatest agreement, sensitivity = 0·64, PPV = 0·60 and concordance = 0·45. Gas stations had the least agreement, sensitivity = 0·26, PPV = 0·32 and concordance = 0·17. Only 4 % of commercial data points in rural areas matched every criterion examined. Conclusions The commercial dataset exhibits a low level of agreement with the ground-truthed provincial data. Particularly retailers in rural areas or belonging to the gas station category suffered from misclassification and/or geocoding errors. Taken together, the commercial dataset is differentially representative of the ground-truthed reality based on store-type and rurality/urbanity.
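The three agreement measures reported in this abstract are linked algebraically, which a short sketch can show. It assumes concordance is defined as TP / (TP + FP + FN), a common definition in this literature that the abstract itself does not state. Under that assumption, writing FN = TP(1/sensitivity - 1) and FP = TP(1/PPV - 1) gives concordance = 1 / (1/sensitivity + 1/PPV - 1). The function name is hypothetical.

```python
def concordance_from(sensitivity: float, ppv: float) -> float:
    """Derive concordance from sensitivity and PPV.

    Assumes concordance = TP / (TP + FP + FN), with
    sensitivity = TP / (TP + FN) and PPV = TP / (TP + FP).
    Both inputs must be greater than zero.
    """
    return 1.0 / (1.0 / sensitivity + 1.0 / ppv - 1.0)


# Sensitivity and PPV as reported in the abstract above.
reported = [("grocery stores", 0.64, 0.60), ("gas stations", 0.26, 0.32)]
for store_type, sens, ppv in reported:
    print(store_type, round(concordance_from(sens, ppv), 2))
```

With the reported inputs the sketch gives 0.45 for grocery stores and 0.17 for gas stations, in line with the concordance values quoted in the abstract (up to rounding of the published inputs).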
... To match outlets in the administrative dataset with outlets found during ground-truthing, we employed two strategies: (1) Location matching (henceforth referred to as liberal matching), and (2) location and name matching (henceforth referred to as strict matching) [45][46][47][48]. ...
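The difference between the two matching strategies can be sketched as follows. This is a minimal illustration, not the study's procedure: the 25 m tolerance, the projected x/y coordinates in metres, the name normalisation rule and all identifiers are assumptions introduced for the example.

```python
import math
from typing import NamedTuple


class Outlet(NamedTuple):
    name: str
    x: float  # easting in metres (hypothetical projected coordinates)
    y: float  # northing in metres


def normalise(name: str) -> str:
    """Lower-case a name and drop spaces and punctuation before comparing."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def is_match(a: Outlet, b: Outlet, tolerance_m: float = 25.0, strict: bool = False) -> bool:
    """Liberal matching checks location only; strict matching also checks the name."""
    close = math.hypot(a.x - b.x, a.y - b.y) <= tolerance_m
    if not strict:
        return close
    return close and normalise(a.name) == normalise(b.name)


# Hypothetical records: one administrative listing and two field observations.
admin = Outlet("Panaderia Lopez", 100.0, 200.0)
field_same = Outlet("PANADERIA-LOPEZ", 110.0, 200.0)
field_other = Outlet("Frutas Garcia", 105.0, 203.0)

print(is_match(admin, field_same, strict=True))    # same place, same name
print(is_match(admin, field_other))                # same place, liberal rule
print(is_match(admin, field_other, strict=True))   # same place, different name
```

A premises whose business has changed hands counts as a match under the liberal rule but not under the strict one, which is one reason validity statistics can drop when names must also agree.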
... While this idea has been suggested by previous studies [47,48], our study extends previous research because we examined the validity of secondary data sources in a European context and included multiple types of food outlets (e.g., specialized food retailers) [49,50]. This is important, because previous studies have found that independent (non-chain) food outlets are more likely to be missed in commercial datasets [31][32][33]51]. ...
... To evaluate potential differential measurement error, previous studies also examined validity statistics by area characteristics, which could lead to confounding in the association between area-level characteristics (e.g., socioeconomic status) and food environment measures [18,48]. In this sense, we found fewer differences between area-level socioeconomic status, population density and the proportion of outlets that were correctly matched in the secondary data by using location (liberal) matching. ...
Article
Previous studies have suggested that European settings face unique food environment issues; however, retail food environments (RFE) outside Anglo-Saxon contexts remain understudied. We assessed the completeness and accuracy of an administrative dataset against ground truthing, using the example of Madrid (Spain). Further, we tested whether its completeness differed by its area-level socioeconomic status (SES) and population density. First, we collected data on the RFE through the ground truthing of 42 census tracts. Second, we retrieved data on the RFE from an administrative dataset covering the entire city (n = 2412 census tracts), and matched outlets using location matching and location/name matching. Third, we validated the administrative dataset against the gold standard of ground truthing. Using location matching, the administrative dataset had a high sensitivity (0.95; [95% CI = 0.89, 0.98]) and positive predictive values (PPV) (0.79; [95% CI = 0.70, 0.85]), while these values were substantially lower using location/name matching (0.55 and 0.45, respectively). Accuracy was slightly higher using location/name matching (k = 0.71 vs 0.62). We found some evidence for systematic differences in PPV by area-level SES using location matching, and in both sensitivity and PPV by population density using location/name matching. Administrative datasets may offer a reliable and cost-effective source to measure retail food access; however, their accuracy needs to be evaluated before using them for research purposes.
... Although POS data were not available for the out-of-sample stores, their locations and chain names were available from Canadian Business Points (Pitney Bowes Canada, Mississauga, Canada), a commercial business registry that provides an annually updated (including 2012) list of operating businesses including the name, location, and store type as defined by the North American Industry Classification code (19), as well as business size as the number of employees (20). A field validation showed a strong correlation of business points listed in Canadian Business Points with manually verified existing supermarkets, convenience stores, and fast-food restaurants (21). ...
Article
Measurement of neighborhood dietary patterns at high spatial resolution allows public health agencies to identify and monitor communities with an elevated risk of nutrition-related chronic diseases. Currently, data on diet are obtained primarily through nutrition surveys, which produce measurements at low spatial resolutions. The availability of store-level grocery transaction data provides an opportunity to refine the measurement of neighborhood dietary patterns. We used these data to develop an indicator of area-level latent demand for soda in the Census Metropolitan Area of Montreal in 2012 by applying a hierarchical Bayesian spatial model to data on soda sales from 1,097 chain retail food outlets. The utility of the indicator of latent soda demand was evaluated by assessing its association with the neighborhood relative risk of prevalent type 2 diabetes mellitus. The indicator improved the fit of the disease-mapping model (deviance information criterion: 2,140 with the indicator and 2,148 without) and enables a novel approach to nutrition surveillance.
Article
Numerous research methodologies have been used to examine food environments. Existing reviews synthesizing food environment measures have examined a limited number of domains or settings and none have specifically targeted Canada. This rapid review aimed to 1) map research methodologies and measures that have been used to assess food environments; 2) examine what food environment dimensions and equity related-factors have been assessed; and 3) identify research gaps and priorities to guide future research. A systematic search of primary articles evaluating the Canadian food environment in a real-world setting was conducted. Publications in English or French published in peer-reviewed journals between January 1 2010 and June 17 2021 and indexed in Web of Science, CAB Abstracts and Ovid MEDLINE were considered. The search strategy adapted an internationally-adopted food environment monitoring framework covering 7 domains (Food Marketing; Labelling; Prices; Provision; Composition; Retail; and Trade and Investment). The final sample included 220 articles. Overall, Trade and Investment (1%, n = 2), Labelling (7%, n = 15) and, to a lesser extent, Prices (14%, n = 30) were the least studied domains in Canada. Among Provision articles, healthcare (2%, n = 1) settings were underrepresented compared to school (67%, n = 28) and recreation and sport (24%, n = 10) settings, as was the food service industry (14%, n = 6) compared to grocery stores (86%, n = 36) in the Composition domain. The study identified a vast selection of measures employed in Canada overall and within single domains. Equity-related factors were only examined in half of articles (n = 108), mostly related to Retail (n = 81). A number of gaps remain that prevent a holistic and systems-level analysis of food environments in Canada. 
As Canada continues to implement policies to improve the quality of food environments in order to improve dietary patterns, targeted research to address identified gaps and harmonize methods across studies will help evaluate policy impact over time.