Edgar Santos-Fernández

Queensland University of Technology | QUT · School of Mathematical Sciences

PhD Statistics, Eng (BS Industrial Engineering)

58 publications · 33,111 reads · 367 citations
Introduction
I am a Senior Research Fellow in Data Science at the School of Mathematical Sciences, Queensland University of Technology. My research focuses on applied statistics and Bayesian modeling, with a particular interest in leveraging data science techniques to solve real-world problems.

Publications (58)
Article
Background: Water is the lifeblood of river networks, and its quality plays a crucial role in sustaining both aquatic ecosystems and human societies. Real-time monitoring of water quality is increasingly reliant on in-situ sensor technology. Anomaly detection is crucial for identifying erroneous patterns in sensor data, but can be a challenging task...
Preprint
Full-text available
Data collected from arrays of sensors are essential for informed decision-making in various systems. However, the presence of anomalies can compromise the accuracy and reliability of insights drawn from the collected data or information obtained via statistical analysis. This study aims to develop a robust Bayesian optimal experimental design (BOED...
Article
Full-text available
Optimal design facilitates intelligent data collection. In this paper, we introduce a fully Bayesian design approach for spatial processes with complex covariance structures, like those typically exhibited in natural ecosystems. Coordinate exchange algorithms are commonly used to find optimal design points. However, collecting data at specific poin...
Article
Full-text available
Various methods have been developed to combine inference across multiple sets of results for unsupervised clustering, within the ensemble clustering literature. The approach of reporting results from one 'best' model out of several candidate clustering models generally ignores the uncertainty that arises from model selection, and results in inferen...
Article
Background: Water is the lifeblood of river networks, and its quality plays a crucial role in sustaining both aquatic ecosystems and human societies. Real-time monitoring of water quality is increasingly reliant on in-situ sensor technology. Anomaly detection is crucial for identifying erroneous patterns in sensor data, but can be a challenging tas...
Preprint
Full-text available
The use of in-situ digital sensors for water quality monitoring is becoming increasingly common worldwide. While these sensors provide near real-time data for science, the data are prone to technical anomalies that can undermine the trustworthiness of the data and the accuracy of statistical inferences, particularly in spatial and temporal analyses...
Article
Full-text available
Two Sustainable Development Goals are focused directly on combating the impacts of climate change on coral reef communities: Goal 13, Climate Action (Take urgent action to combat climate change and its impacts) and Goal 14, Life Below Water (Conserve and sustainably use the oceans, seas and marine resources for sustainable development). Citizen sci...
Preprint
Full-text available
Spatio-temporal models are widely used in many research areas from ecology to epidemiology. However, a limited number of computational tools are available to model river network datasets in space and time. In this paper, we introduce the R package SSNbayes for fitting Bayesian spatio-temporal models and making predictions on branching stream networ...
Article
Full-text available
We develop a novel global perspective of the complexity of the relationships between three COVID-19 datasets, the standardised per-capita growth rate of COVID-19 cases and deaths, and the Oxford Coronavirus Government Response Tracker COVID-19 Stringency Index (CSI) which is a measure describing a country's stringency of lockdown policies. We use a...
Article
Crowdsourcing methods facilitate the production of scientific information by non‐experts. This form of citizen science (CS) is becoming a key source of complementary data in many fields to inform data‐driven decisions and study challenging problems. However, concerns about the validity of these data often constrain their utility. In this paper, we...
Preprint
Full-text available
Time series often reflect variation associated with other related variables. Controlling for the effect of these variables is useful when modeling or analysing the time series. We introduce a novel approach to normalize time series data conditional on a set of covariates. We do this by modeling the conditional mean and the conditional variance of t...
Preprint
Full-text available
Crowdsourcing methods facilitate the production of scientific information by non-experts. This form of citizen science (CS) is becoming a key source of complementary data in many fields to inform data-driven decisions and study challenging problems. However, concerns about the validity of these data often constrain their utility. In this paper, we...
Preprint
Full-text available
Water is the lifeblood of river networks, and its quality plays a crucial role in sustaining both aquatic ecosystems and human societies. Real-time monitoring of water quality is increasingly reliant on in-situ sensor technology. Anomaly detection is crucial for identifying erroneous patterns in sensor data, but can be a challenging task due to the...
Article
Full-text available
Building on a strong foundation of philosophy, theory, methods and computation over the past three decades, Bayesian approaches are now an integral part of the toolkit for most statisticians and data scientists. Whether they are dedicated Bayesians or opportunistic users, applied professionals can now reap many of the benefits afforded by the Bayes...
Preprint
Two Sustainable Development Goals are focused directly on combating the impacts of climate change on coral reef communities. These are: Goal 13 “Take urgent action to combat climate change and its impacts” and Goal 14 “Conserve and sustainably use the oceans, seas and marine resources for sustainable development”. Citizen science (CS) features prom...
Preprint
Full-text available
Building on a strong foundation of philosophy, theory, methods and computation over the past three decades, Bayesian approaches are now an integral part of the toolkit for most statisticians and data scientists. Whether they are dedicated Bayesians or opportunistic users, applied professionals can now reap many of the benefits afforded by the Bayes...
Preprint
Full-text available
Various methods have been developed to combine inference across multiple sets of results for unsupervised clustering, within the ensemble and consensus clustering literature. The approach of reporting results from one 'best' model out of several candidate clustering models generally ignores the uncertainty that arises from model selection, and resu...
Article
Full-text available
Introduction: To better understand the relationships between neurophysiology, cognitive function and psychopathology risk in adolescence there is value in identifying data-driven subgroups based on measurements of brain activity and function, and then comparing cognition and mental health between such subgroups. Methods: We developed a flexible and...
Preprint
Full-text available
Optimal design facilitates intelligent data collection. In this paper, we introduce a fully Bayesian design approach for spatial processes with complex covariance structures, like those typically exhibited in natural ecosystems. Coordinate Exchange algorithms are commonly used to find optimal design points. However, collecting data at specific poin...
Article
Full-text available
Following the introduction of high-resolution player tracking technology, a new range of statistical analysis has emerged in sports, specifically in basketball. However, such high-dimensional data are often challenging for statistical inference and decision making. In this article we employ a state-of-the-art Bayesian mixture model that allows the...
Preprint
Full-text available
This paper aims to develop a global perspective of the complexity of the relationship between the standardised per-capita growth rate of Covid-19 cases, deaths, and the OxCGRT Covid-19 Stringency Index, a measure describing a country's stringency of lockdown policies. To achieve our goal, we use a heterogeneous intrinsic dimension estimator impleme...
Article
Full-text available
Spatio-temporal models are widely used in many research areas including ecology. The recent proliferation of the use of in-situ sensors in streams and rivers supports space-time water quality modelling and monitoring in near real-time. A new family of spatio-temporal models is introduced. These models incorporate spatial dependence using stream dis...
Article
Full-text available
Study Objectives: To investigate the proportion of children in Aotearoa New Zealand (NZ) who do or do not meet sleep duration and sleep quality guidelines at 24 and 45 months of age and associated sociodemographic factors. Methods: Participants were children (n=6,490) from the Growing Up in New Zealand longitudinal study of child development with sl...
Preprint
Full-text available
Introduction: To better understand the relationships between brain activity, cognitive function and mental health risk in adolescence there is value in identifying data-driven subgroups based on measurements of brain activity and function, and then comparing cognition and mental health symptoms between such subgroups. Methods: Here we implement a mu...
Article
Full-text available
Virtual reality (VR) technology is an emerging tool that is supporting the connection between conservation research and public engagement with environmental issues. The use of VR in ecology consists of interviewing diverse groups of people while they are immersed within a virtual ecosystem to produce better information than more traditional surveys...
Article
Full-text available
Citizen science projects have become increasingly popular in many fields, including ecology. However, the quality of this information is frequently debated within the scientific community. Modern citizen science implementations therefore require measures of the users' proficiency. We introduce a new methodological framework of item response that qu...
Preprint
Full-text available
Crowdsourcing methods allow the production of scientific information by non-experts and are becoming a key tool to address complex challenges in ecological research. In some cases the participants of crowdsourcing programs are familiar with the ecological species or categories involved in the task, whereas in many others substantial training and qua...
Preprint
Full-text available
Spatio-temporal models are widely used in many research areas including ecology. The recent proliferation of the use of in-situ sensors in streams and rivers supports space-time water quality modelling and monitoring in near real-time. In this paper, we introduce a new family of dynamic spatio-temporal models, in which spatial dependence is establi...
Article
Full-text available
Many research domains use data elicited from ‘citizen scientists’ when a direct measure of a process is expensive or infeasible. However, participants may report incorrect estimates or classifications due to their lack of skill. We demonstrate how Bayesian hierarchical models can be used to learn about latent variables of interest, while accounting...
Preprint
Full-text available
Many research domains use data elicited from “citizen scientists” when a direct measure of a process is expensive or infeasible. However, participants may report incorrect estimates or classifications due to their lack of skill. We demonstrate how Bayesian hierarchical models can be used to learn about latent variables of interest, while accounting...
Article
Introduction: Previously, combined data analyses of four pilot fatigue monitoring studies including 237 pilots flying long-haul and ultra-long range (ULR) flights found no association between pilots’ actigraphic sleep in flight and psychomotor vigilance task (PVT) performance at top-of-descent (TOD; beginning of the landing phase of flight). The pre...
Preprint
Full-text available
So-called “citizen science” data elicited from crowds has become increasingly popular in many fields including ecology. However, the quality of this information is being frequently debated by many within the scientific community. Therefore, modern citizen science implementations require measures of the users' proficiency that account for the diff...
Article
Background: Multiple aspects of nurses’ rosters interact to affect the quality of patient care they can provide and their own health, safety and wellbeing. Objectives: 1) Develop and test a matrix incorporating multiple aspects of rosters and recovery sleep that are individually associated with three fatigue-related outcomes - fatigue-related clinic...
Preprint
Full-text available
A new range of statistical analysis has emerged in sports after the introduction of high-resolution player tracking technology, specifically in basketball. However, such high-dimensional data are often challenging for statistical inference and decision making. In this article, we employ Hidalgo, a state-of-the-art Bayesian mixture model that all...
Article
Full-text available
Numerous organisations collect data in the Great Barrier Reef (GBR), but they are rarely analysed together due to different program objectives, methods, and data quality. We developed a weighted spatio-temporal Bayesian model and used it to integrate image-based hard-coral data collected by professional and citizen scientists, who captured and/or c...
Article
Background: Fatigue resulting from shift work and extended hours can compromise patient care and the safety and health of nurses, as well as increasing nursing turnover and health care costs. Objectives: This research aimed to identify aspects of nurses’ work patterns associated with increased risk of reporting fatigue-related outcomes. Design: A...
Article
Full-text available
Bayesian methods are becoming increasingly popular in sports analytics. Identified advantages of the Bayesian approach include the ability to model complex problems, obtain probabilistic estimates and predictions that account for uncertainty, combine information sources and update learning as new data become available. The volume and variety of dat...
Chapter
Full-text available
Bayesian techniques are being quickly adopted in sports settings. The volume and variety of data produced in sports activities over the past years and the availability of software packages for Bayesian computation have contributed positively to this growth. This article provides a brief review of the latest advances in Bayesian statistics in sports...
Article
Introduction: Airlines are required to monitor the effectiveness of their pilot fatigue risk management. The present survey sought the views of all pilots at Delta Air Lines on fatigue-related issues raised by their colleagues participating in regular airline safety audits. Methods: All 13,217 pilots from 9 aircraft fleets were invited to partic...
Preprint
Data in the Great Barrier Reef (GBR) are collected by numerous organisations and rarely analysed together. We developed a weighted spatio-temporal Bayesian model that integrates datasets, while accounting for differences in method and quality, which we fit to image-based, hard-coral data collected by professional and citizen scientists. Citizens pro...
Preprint
Full-text available
Data in the Great Barrier Reef (GBR) are collected by numerous organisations and rarely analysed together. We developed a weighted spatiotemporal Bayesian model that integrates datasets, while accounting for differences in method and quality, which we fit to image-based, hard-coral data collected by professional and citizen scientists. Citizens prov...
Poster
Full-text available
Actigraphy is a cost-effective and convenient tool for activity-based monitoring. It allows studying sleep/wake patterns and identifying disorders in sleep research. ActisoftR was designed for parsing actigraphy outputs and to summarise scored data across user-defined intervals. It consists of several functions for importing, generating reports and...
Article
Full-text available
Sampling inspection plans are used in the food industry to determine whether a batch of food is contaminated or not. Testing for pathogens is mandatory in several foodstuffs because some bacteria pose a significant risk to human health, even when consumed in minute quantities. Test performance measures such as sensitivity and specificity ar...
Thesis
Full-text available
Acceptance sampling plays a crucial role in food quality assurance. However, safety inspection represents a substantial economic burden due to the testing costs and the number of quality characteristics involved. This thesis presents six pieces of work on the design of attribute and variables sampling inspection plans for food safety and quality. S...
Article
Full-text available
Sampling inspection plans are principally used to determine whether a batch of food is contaminated or not. In this theoretical research, we study the effect of increasing the analytical unit amount on the performance of microbiological sampling plans, and on the resulting quality after inspection. We discuss several scenarios of homogeneous and in...
Article
Full-text available
The design of attribute sampling inspection plans based on compressed or narrow limits for food safety applications is covered. Artificially compressed limits allow a significant reduction in the number of analytical tests to be carried out while maintaining the risks at predefined levels. The design of optimal sampling plans is discussed for two g...
Article
Full-text available
Testing composite samples is a useful strategy to achieve sampling economy. Several studies have shown the effectiveness of this technique under the assumption of perfect mixing of primary samples. This paper investigates the effect of imperfect composite sample preparation on the performance of two and three-class variables sampling inspection pla...
Article
Full-text available
Variables sampling plans for microbial safety are usually based on the log transformation of the observed counts. We propose a new variables plan for lognormal data using the angular transformation. In a comparison with the classic approach, this new method shows more stringency and allows the use of smaller sample sizes to obtain the same level of...
Book
Full-text available
The intensive use of automatic data acquisition systems and the use of cloud computing for process monitoring have led to an increased occurrence of industrial processes that utilize statistical process control and capability analysis. These analyses are performed almost exclusively with multivariate methodologies. The aim of this Brief is to presen...
Chapter
As a general rule, normality and independence of the data are required in Statistical Process Control, and the multivariate extensions are no exception. Different tools are presented: graphical methods, marginal and multivariate normality tests, solutions to departures from normality, and a randomness test.
Chapter
In this chapter the most recognized multivariate process capability indices are presented. The first section approaches the computation of these indices in R, and the next ones are dedicated to the indices based on ratios of the volume tolerance region to a process region such as Taam et al. (J Appl Stat 20:339–351, 1993), Shahriari et al. (Proceed...
Chapter
This chapter is devoted to the main aspects concerning the multivariate control charts. It covers the multivariate normal distribution, the data structure, and the mult.chart function that allows the computation in R and the most used multivariate control charts such as the χ², T², the Multivariate Exponentially Weighted Moving Average, the Multiva...
Article
Full-text available
Manufacturing processes are often based on more than one quality characteristic. When these variables are correlated the process capability analysis should be performed using multivariate statistical methodologies. Although there is a growing interest in methods for evaluating the capability of multivariate processes, little attention has been give...