Green IoT based Technology for Sustainable Smart Cities

Authors:
SPECIAL SESSION ON (Session Number: 25)
Green Engineering and Technology: A prospective of Sustainable
Development.
Dr. Parthasarathi Pattnayak1
School of Computer Applications
KIIT Deemed to be University
Odisha, India
Email: parthafca@kiit.ac.in
Om Prakash Jena2*
Department of Computer Science,
Ravenshaw University,
Odisha, India,
Email: jena.omprakash@gmail.com
Abstract- Environmental sustainability is a widely discussed subject around the
world. This paper discusses the important role of the Internet of Things (IoT) as
an integral part of the ICT infrastructure for sustainable smart cities. The IoT
helps make smart cities greener by detecting pollution through environmental
sensors. Around the world, governments and various public and private
organizations are making individual as well as collective efforts to reduce
energy consumption and carbon emissions, and are recommending Green IoT (G-IoT)
for smart cities. Literature on smart city architectures has existed for some
time; this paper instead discusses the concept and utility of G-IoT for creating
a green environment, with an emphasis on energy saving in smart cities.
Moreover, this paper proposes the design of a G-IoT configuration that focuses
on reducing energy consumption, and in particular on limiting energy usage, to
achieve the objective of sustainable green smart cities. We show that the
proposed G-IoT configuration, which depends entirely on a cloud-based system,
ultimately reduces hardware consumption.
Keywords: Green IoT, Sustainable Smart City, Green Cloud Computing Sensors, Big Data
Applications, G-IoT Configuration.
1 Introduction
The IoT and its related big data applications are clearly on a penetrative path
across the systems and domains of smart sustainable cities [9]. This is
manifested in the proliferation and spread of various platforms of the
underlying core enabling technologies for collecting, processing, and analyzing
massive amounts of urban data pertaining to the environment across several
cities that are badging or regenerating themselves as smart sustainable. These
data originate from various sources, such as land-use patterns, spatial
organizations, environmental dynamics, transport and traffic systems, mobility
and travel behavior, natural ecosystems, energy resources, building automation,
infrastructures and facilities, and so on. In light of this, the development of
smart sustainable cities based on the IoT and related big data analytics is
increasingly becoming a clear prospect. Indeed, Green IoT based smart cities
focus on green design, green manufacturing, green operations, green maintenance,
and even green recycling, with little impact on the environment. Such green
smart cities include green smart aviation and aerospace systems, green smart
homes, green smart buildings, green smart e-health, green smart logistics, green
smart retail, green supply chain management, green smart transportation, green
smart recycling, green smart environmental monitoring, and so forth, all
operated with minimal energy use. In green IoT (hereafter G-IoT) smart cities,
objects such as mobile phones, computers, vehicles, and electronic appliances
can communicate with one another through distinct addresses in energy-saving
mode. These sensory devices can communicate intelligently over the Internet
Protocol and can offer green support in managing various tasks for users. G-IoT
can support different technologies [9], e.g., identification, communication, and
data and signal processing technologies, with reduced energy consumption.
Developing G-IoT based smart cities therefore requires the adoption of
energy-efficient procedures. Smart sustainable cities typically rely on the
fulfillment of various ICT visions of pervasive computing [11], most notably the
IoT [12], where everyday objects communicate with one another and collaborate
across heterogeneous and distributed computing environments to provide
information and services to urban entities. Heralding a major technological
shift characterized by the ever-growing embeddedness of ICT into urban systems
and domains, the IoT as a socially disruptive technology is projected to bring
about a drastic transformation of the techno-urban ecosystem in all its
complexity and variety. This may in turn change how ICT can be applied and used
in all urban spheres, with far-reaching environmental implications. It has
indeed been suggested that as ICT becomes pervasive, i.e., permeates
infrastructures, facilities, resources, building designs, ecosystem services,
administrative services, and citizens' objects, we can speak of cities getting
smarter in addressing environmental issues [7]. Altogether, the expansion of the
IoT as a computing paradigm, together with the related big data analytics trend,
is increasingly stimulating smart sustainable city initiatives and programs
within ecologically and technologically advanced nations [4]. Sensor technology
enables the development of the IoT. There exists a large range of IoT
architectures that essentially aim to provide the appropriate infrastructure for
operating the IoT ecosystem in large-scale applications such as smart
sustainable cities. Typically, they include many different kinds of sensors, in
addition to data processing systems, wireless communication networks, and
actuators through which the systems act in the physical environment [6]. The
sensors are primarily used to collect the large masses of urban data that serve
as inputs for big data uses. Sensor technology is thus a key component of data
processing as a set of computational and analytical functionalities associated
with the IoT ecosystem in the context of smart sustainable cities.
The rest of the paper is organized as follows. Section two discusses big data
and IoT applications. Section three focuses on G-IoT and the cloud-based
architecture. The analytical framework is discussed in section four. Section
five concludes the paper with the future prospects of the work.
2 Big data and IoT applications: Smart sustainable cities versus smart cities
Research on the IoT and related big data applications has been active in the
realm of smart cities, dealing mostly with economic growth and the quality of
life. However, the use of the IoT and related big data to advance environmental
sustainability in the context of smart sustainable cities, as a holistic urban
development approach, has scarcely been explored to date. Accordingly, a new
research wave has started to focus on how to enhance smart city approaches as
well as sustainable city models by combining the two urban development
strategies, in an attempt to achieve the required level of environmental
sustainability by improving urban operations, functions, designs, and services
using advanced ICT [14]. This integrated urban development approach emphasizes
the use of big data analytics as a set of advanced techniques, processes,
platforms, systems, and applications, in addition to other advanced forms of ICT
such as context-aware computing. In particular, the evolving data-driven
approach is seen to hold great potential for addressing the challenge of
environmental sustainability under what is labelled 'smart sustainable cities'
of the future [1]. The way forward for future cities to advance environmental
sustainability is through advanced ICT that ensures the use of big data
analytics [3].
3 G-IoT and Cloud-Based Architecture
The proposed G-IoT framework for smart cities addresses communication,
standardization, and quality aspects. The main feature of the proposed
architecture is that it relies on a cloud platform, which reduces the energy
consumed by many systems and helps keep the environment clean. The proposed
G-IoT has five layers: the Sensor layer and Smart City infrastructure, the
Network layer, the Big data analytic layer, the Application layer, and the
Presentation layer. The framework defines the basic communication paradigms for
the connecting entities. It provides a reference communication stack along with
insight into the main interactions within the model. It also describes the
approach to communication schemes that can be applied to different kinds of
G-IoT networks. This is important because different networks of sensors in
different kinds of networks must be able to communicate with one another.
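As a rough illustration of the five-layer stack, the following minimal Python
sketch passes one sensor reading upward through the layers in the order listed
in this section. The Message class, the handler functions, and the sensor
identifier are hypothetical names introduced only for this example; each handler
merely records that its layer was visited and does not model real processing.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Layer names follow the five layers listed in the text, bottom to top.
LAYERS = ["sensor", "network", "big_data_analytics", "application", "presentation"]


@dataclass
class Message:
    payload: dict
    trace: List[str] = field(default_factory=list)  # layers visited so far


def make_layer(name: str) -> Callable[[Message], Message]:
    """Return a placeholder handler that only records that the layer ran."""
    def handle(msg: Message) -> Message:
        msg.trace.append(name)
        return msg
    return handle


def send_up(msg: Message) -> Message:
    """Pass a message through every layer in stack order."""
    for handler in [make_layer(name) for name in LAYERS]:
        msg = handler(msg)
    return msg


# Hypothetical air-quality reading used purely for illustration.
reading = send_up(Message({"sensor_id": "pm25-001", "value": 41.0}))
print(" -> ".join(reading.trace))
```

In a fuller model each handler would be replaced by the behaviour of the
corresponding layer described in Sections 3.1 to 3.5.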
3.1 Sensor Layer and Smart City Infrastructure: In smart cities, different
kinds of sensors are installed and operate in different systems with minimal
power consumption, which is supported by this layer. Wireless Sensor Networks
(WSN), crowdsourcing, and RFID form the sensing framework in this layer. Tagged
objects can be identified through RFID, an automatic identification technique.
Passive RFID tags are not battery operated; they draw power from the reader's
transmission signal in order to communicate their ID to the RFID reader. This
kind of system can be valuable for supply chain management in smart cities. WSN
plays an important role in urban sensing applications. It is a feasible solution
for applications related to transportation and access control, which collect,
process, and analyze the relevant information gathered from a variety of
environments. Wireless sensors are becoming smaller, cheaper, smarter, and more
widespread (e.g., embedded cameras). As social networking is booming, a new kind
of sensing paradigm, namely smartphone technology, has evolved that enables the
citizens of smart cities to contribute to smart city management. It plays a
significant role in government-citizen interaction. This layer must therefore be
able to support the huge volume of IoT data generated by wireless sensors and
smart devices. IoT sensors are aggregated over various kinds of protocols and
heterogeneous networks using different technologies. IoT networks should be
scalable in order to efficiently serve a wide range of services and applications
over large-scale networks.
3.2 Network Layer: To achieve interoperability across networks, the higher
communication layers, the network, and WSNs should preferably use common
protocols in the lower communication layers. The IEEE 802.15.4 standard defines
the link and physical layers for low-cost, low-power, short-range communication
in smart cities. Other communication technologies, such as WirelessHART, ZigBee,
WIA-PA, and ISA100.11a, can be used depending on the distance over which they
must communicate [5]. The mapping of IPv6 onto IEEE 802.15.4, e.g., RFC 6282, is
defined by the IPv6 over Low-Power Wireless Personal Area Networks (6LoWPAN)
working group. The proposed framework also covers additional frequency bands,
for example TV white space and regional bands, which operate at ultra-low energy
for applications such as train control. Bluetooth is another short-range
wireless protocol; Bluetooth 4.0 adopts Bluetooth Low Energy, a lightweight
variant for low-power applications. The main requirements for these
communication technologies are low power consumption and a small computational
footprint for wireless sensor networks, so the IP protocol suite is the main
candidate for these layers. Even earlier specific standards that defined their
own protocols can be migrated to IP. IPv6 over WSN and IoT is therefore a
feasible solution for smart city applications.
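To see why an adaptation layer such as 6LoWPAN matters on IEEE 802.15.4 links,
the short calculation below compares the application payload left in a single
frame with and without header compression. It is a back-of-the-envelope sketch,
not part of the proposed framework: the overhead figures are the worst-case
values quoted in RFC 4944, the compressed header sizes are the best cases
described in RFC 6282, and the calculation ignores fragmentation headers and any
additional dispatch octets.

```python
# Frame-budget arithmetic for IPv6 over IEEE 802.15.4 (all sizes in octets).
FRAME_MAX = 127          # maximum IEEE 802.15.4 physical-layer frame size
MAC_OVERHEAD = 25        # worst-case MAC header and footer (RFC 4944)
SECURITY_OVERHEAD = 21   # worst-case link-layer security, AES-CCM-128 (RFC 4944)
IPV6_HEADER = 40         # uncompressed IPv6 header
UDP_HEADER = 8           # uncompressed UDP header
IPHC_BEST = 2            # best-case compressed IPv6 header (RFC 6282)
UDP_NHC_BEST = 4         # best-case compressed UDP header (RFC 6282)


def payload_budget(ip_header: int, udp_header: int) -> int:
    """Octets left for application data in one frame."""
    return FRAME_MAX - MAC_OVERHEAD - SECURITY_OVERHEAD - ip_header - udp_header


uncompressed = payload_budget(IPV6_HEADER, UDP_HEADER)
compressed = payload_budget(IPHC_BEST, UDP_NHC_BEST)
print(f"payload without compression: {uncompressed} octets")
print(f"payload with best-case compression: {compressed} octets")
```

Under these assumptions the usable payload per frame rises from 33 to 75
octets, which means fewer frames, and hence less radio time, for the same
sensor data.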
3.3 Analytic Big Data Layer: Data management and information flow in this
layer are of two types, periodic and aperiodic [13]. In periodic data
management, IoT sensor data requires filtering, because the data is collected
periodically and some of it may not be needed, so it should be filtered out. In
aperiodic data management, the data is event-triggered IoT sensor data that may
require immediate delivery and response, for example medical emergency sensor
data. In the proposed architecture, big data is driven by enterprise IoT and
analytical tools. G-IoT communication technologies, networks, and service
approaches should be able to support a dynamic environment through Internet
architecture evolution, protocols, wireless system access architectures, and
enhanced security and privacy. This layer hosts the G-IoT cloud platform and
cloud process management, which provide energy efficiency and optimization for
the application layer. The cloud can even be split into fog to save more energy.
The layer also provides management services such as data analysis, security
control, process modeling, and device control to the G-IoT cloud platform and
cloud process management. It is likewise responsible for the operational support
system, security, business rule management, and business process management. It
has to offer a service analytics platform covering, for example, statistical
analysis, data mining, text mining, predictive analytics, and so on.
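The distinction between periodic and aperiodic data management can be sketched
as follows. This is a minimal illustration under stated assumptions, not the
authors' implementation: the Reading and DataManager classes, the min_change
threshold, and the sensor names are all hypothetical. Periodic readings that
differ from the last kept value by less than the threshold are filtered out,
while event-triggered readings bypass the filter and are dispatched at once.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Reading:
    sensor_id: str
    value: float
    event_triggered: bool = False  # True for aperiodic (event-driven) data


class DataManager:
    """Hypothetical manager: filters periodic data, fast-tracks aperiodic data."""

    def __init__(self, min_change: float = 0.5) -> None:
        self.min_change = min_change          # assumed filtering threshold
        self.last_kept: Dict[str, float] = {}
        self.batch: List[Reading] = []        # periodic data kept for analysis
        self.urgent: List[Reading] = []       # aperiodic data sent immediately

    def ingest(self, reading: Reading) -> str:
        if reading.event_triggered:
            self.urgent.append(reading)
            return "dispatched"
        previous = self.last_kept.get(reading.sensor_id)
        if previous is not None and abs(reading.value - previous) < self.min_change:
            return "filtered"                 # redundant periodic sample
        self.last_kept[reading.sensor_id] = reading.value
        self.batch.append(reading)
        return "queued"


mgr = DataManager()
results = [
    mgr.ingest(Reading("temp-1", 21.0)),
    mgr.ingest(Reading("temp-1", 21.2)),
    mgr.ingest(Reading("temp-1", 22.0)),
    mgr.ingest(Reading("hr-7", 180.0, event_triggered=True)),
]
print(results)
```

Dropping redundant periodic samples before they reach the cloud is one simple
way in which filtering can reduce transmission and storage effort.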
3.4 Application Layer: In G-IoT, this layer serves the highest-level purpose
in the stack and is responsible for delivering various applications to various
users through different communication techniques. Massive amounts of data and
information are analyzed through fuzzy recognition, cloud computing, and other
technologies. As shown in Fig.1, smart city applications can serve the public
and private sectors, users, and administration. This layer covers all
environment-related applications, such as comprehensive energy monitoring, water
resources monitoring and management, environmental protection monitoring, smart
air pollution monitoring, supply consumption monitoring, water quality
diagnostics and monitoring, and monitoring of key pollution sources and
automobile exhaust. These new services build on real-time physical-world data to
increase the efficiency of urban management, address environmental degradation,
and improve infrastructure integrity.
3.5 Presentation Layer: This layer receives information from the application
layer. Data can be communicated in different formats via different sources. The
presentation layer is therefore responsible for integrating all formats into a
standard format for efficient and effective communication. The presentation
layer follows data programming structure schemes developed for different
languages and provides the real-time syntax required for communication between
two objects such as layers, systems, or networks. The data format should be
acceptable to the next layers; otherwise, the presentation layer may not perform
correctly. Different city systems, such as the water supply system, power supply
system, pollution control system, transport department, and so on, can share
their information through web portals, the web, and mobile applications that are
built on this layer. Individuals and government departments can obtain specific
information as per their requirements through this layer, which can be used in
the services of the city.
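The format integration performed by the presentation layer can be illustrated
with a small sketch. It assumes, purely for illustration, that a water system
reports JSON and a power system reports CSV; the field names, the sample values,
and the common record layout (source, metric, value) are hypothetical and are
not specified in the paper.

```python
import csv
import io
import json
from typing import Dict, List


def from_json(text: str) -> List[Dict]:
    """Convert a JSON feed with assumed keys src/name/val to the common layout."""
    return [
        {"source": r["src"], "metric": r["name"], "value": float(r["val"])}
        for r in json.loads(text)
    ]


def from_csv(text: str) -> List[Dict]:
    """Convert a CSV feed with assumed columns source,metric,value."""
    rows = csv.DictReader(io.StringIO(text))
    return [
        {"source": row["source"], "metric": row["metric"], "value": float(row["value"])}
        for row in rows
    ]


# Hypothetical feeds from two city systems.
water_feed = '[{"src": "water", "name": "ph", "val": 7.2}]'
power_feed = "source,metric,value\npower,load_kw,512.5\n"

unified = from_json(water_feed) + from_csv(power_feed)
print(json.dumps(unified, indent=2))
```

Once every feed is mapped to one layout, web portals and mobile applications
built on this layer can consume the data without knowing its original format.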
4 Analytical Framework
4.1 Urban Domains and Systems: These should operate and be managed using the
ICT of pervasive computing, namely the IoT and its underlying big data
analytics, as a set of advanced technologies together with their novel
applications. They should ideally be combined with the typologies and design
concepts of sustainable urban forms [8]. The typologies include compactness,
density, diversity, and mixed land use; the associated design concepts include
sustainable transport, greening, and passive solar design. These typologies and
design concepts constitute key strategies for achieving the required level of
sustainability in the context of sustainable urban forms. These urban components
are to be supported by high standards of environmental and urban management. The
idea is that smart sustainable cities should be, as forms of planning principles
and design concepts of sustainability, monitored, understood, analyzed, and
planned to improve their contribution to the goal of environmentally sustainable
development based on highly intelligent and innovative solutions. Urban systems
and domains constitute the main source of urban data, which are generated by
various urban entities in connection with the physical assets associated with
the IoT, including city authorities, urban departments, urban administrators,
individual citizens, and private companies. They provide heterogeneous and
massive amounts of data as inputs for big data applications enabled by the IoT.
Urban data, in their variety, scale, and velocity, are invariably tagged with
spatial and temporal labels, mostly streamed from various sensory sources and
stored in databases, generated routinely and automatically, and integrated and
coalesced in data warehouses for use at the city-wide scale. Accordingly, this
component involves diverse sectoral and cross-sectoral sources of urban data of
varied kinds and sizes that are to be collected, stored, and retrieved for later
processing, analysis, visualization, deployment, and sharing throughout the
informational landscape in order to support urban operations, functions,
designs, and services with regard to environmental sustainability.
4.2 Urban Data Categories, Big Data Sources, and Storage Facilities: This
component is devoted to data collection, storage, and management. It includes
data repositories, data warehouses, and repositories of public data. For
example, warehousing as a big data analytics technique used in the urban domain
involves the integration of data from several databases, which in turn are
maintained by different urban units along with historical and summary
information. Database management systems are used to maintain urban data of
large scale and various categories. Likewise, cloud-based storage can be a fully
virtualized, computer-generated version of a storage facility, and all devices
are completely transparent to the urban entities that, as users of the cloud,
can connect to the cloud storage through the network. The added value of
combining cloud storage with intelligent compression techniques lies in, in
addition to significantly reducing storage costs, the possibility of efficiently
storing all kinds of big data belonging to the domains of smart sustainable
cities.
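The storage saving from compression can be illustrated with a short sketch. It
uses Python's general-purpose zlib module as a stand-in for the intelligent
compression techniques mentioned above, which the paper does not specify, and
the synthetic sensor readings are invented for this example only.

```python
import json
import zlib

# Synthetic, highly repetitive sensor readings (illustrative data only).
readings = [
    {"sensor": "pm25-001", "minute": m, "value": 40 + (m % 3)}
    for m in range(1000)
]

raw = json.dumps(readings).encode("utf-8")
packed = zlib.compress(raw, 9)                   # 9 = strongest zlib level
restored = json.loads(zlib.decompress(packed))   # lossless round trip

print(f"raw: {len(raw)} bytes, compressed: {len(packed)} bytes")
print(f"compressed size is {len(packed) / len(raw):.1%} of the original")
```

Real urban data will compress less predictably than this regular sample, so the
ratio printed here should not be read as a result for the proposed system.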
4.3 Cloud Computing or Fog/Edge Computing: This component is devoted to the
process of knowledge discovery/data mining. The sub-processes of knowledge
discovery encompass selection, preprocessing, transformation, mining,
interpretation, and evaluation [8]. As to data mining, the sub-processes
involved include data understanding, data preparation, modeling, evaluation, and
deployment [10]. The discovered or extracted knowledge entails intelligence
functions and results from data processing and management carried out by Hadoop
MapReduce on the basis of cloud computing. Such functions are intended for
decision-making, decision support, and decision automation. Intelligence
functions are used for real-time and strategic decisions, depending on the
application domain (e.g., traffic systems versus energy systems), in terms of
control, automation, optimization, and management.
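The MapReduce processing model referred to above can be sketched in a few lines
of plain Python. This is a single-process illustration of the map, shuffle, and
reduce steps, not Hadoop itself; the record fields (district, pm25) and the
sample values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(record: dict) -> Iterable[Tuple[str, float]]:
    """Emit (key, value) pairs: one PM2.5 reading keyed by district."""
    yield record["district"], record["pm25"]


def shuffle(pairs: Iterable[Tuple[str, float]]) -> Dict[str, List[float]]:
    """Group all values that share a key, as the framework would between phases."""
    groups: Dict[str, List[float]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[float]) -> Tuple[str, float]:
    """Collapse the values for one key into a single average."""
    return key, sum(values) / len(values)


# Hypothetical air-quality records.
records = [
    {"district": "north", "pm25": 30.0},
    {"district": "south", "pm25": 50.0},
    {"district": "north", "pm25": 40.0},
]

pairs = [pair for rec in records for pair in map_phase(rec)]
averages = dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())
print(averages)
```

On a cluster, the map and reduce calls would run in parallel on different
nodes, and per-district averages of this kind could feed the decision support
functions described in this section.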
4.4 Big Data Applications: This component involves the diverse data-centric
applications enabled by the IoT in relation to environmental sustainability
across diverse urban domains. One application usually involves several solutions
pertaining to different sub-domains of each domain, depending on the kind of
environmental sustainability problem to be solved [2]. Put differently,
data-driven applications involve system behavior and service delivery. At the
core of this component is the outcome of the implementation of optimization
strategies and action-taking processes. It thus executes actions and provides
services in accordance with the decision taken, which is based on the useful
knowledge extracted from the IoT data. Fig.2 shows the use of big data analytics
employing the core enabling technologies on the cloud-based IoT in the context
of smart sustainable cities.
Fig.1 The IoT to advance environmental sustainability in the context of smart sustainable cities
5 Conclusion
The proposed cloud services and visual communication tools, using high-speed
broadband communication networks in smart cities, can improve business in both
the corporate and government sectors. Meanwhile, sensor networks that use a
variety of wireless technologies in green smart cities offer access to
information on the flow of goods and on the status of equipment and the
environment. They also facilitate the use of remote control. This makes it
possible to realize smart cities that are safe, secure, and environmentally
conscious. The sensor layer delivers all G-IoT data through WSN to users of
diverse applications through the cloud platform and process. In the future,
collaboration between networks can be stimulated as sensors and actuators,
communication technologies, and control systems become more capable and
intelligent. Enabled by the IoT as a form of pervasive computing, big data
applications are becoming ever more important to smart sustainable cities with
respect to their operational functioning and planning, so as to improve their
contribution to the goals of environmentally sustainable development.
References:
[1] Bates, D.W., Saria, S., Ohno-Machado, L., Shah, A., Escobar, G.: Big
data in health care: using analytics to identify and manage high-risk and
high-cost patients. Health Aff., vol. 33, pp. 1123-1131, (2014).
[2] Ding, P., Li, F.: Causal inference: A missing data perspective.
Statistical Science, vol. 33(2), pp. 214-237, (2018).
[3] Escobar, G.J., Puopolo, K.M., Wi, S., Turk, B.J., Kuzniewicz, M.W.,
Walsh, E.M., et al.: Stratification of risk of early-onset sepsis in newborns
≥ 34 weeks' gestation. Pediatrics, 133, pp. 30-36, (2014).
[4] Goldstein, B.A., Navar, A.M., Pencina, M.J., Ioannidis, J.P.:
Opportunities and challenges in developing risk prediction models with
electronic health records data: a systematic review. J. Am. Med. Inform. Assoc.,
vol. 27 (1), pp. 198-208, (2016).
[5] Greenland, S., Robins, J.M., Pearl, J.: Confounding and collapsibility in
causal inference. Statistical Science, pp. 29-46, (1999).
[6] Jung, K., Covington, S., Sen, C.K., Januszyk, M., Kirsner, R.S., Gurtner,
G.C., et al.: Rapid identification of slow healing wounds. Wound Repair
Regen., vol. 24, pp. 181-188, (2016).
[7] Kayyali, B., Knott, D., Kuiken, S.: The Big Data Revolution in US Health
Care: Accelerating value and innovation. McKinsey & Company.
[8] Leek, J.T., Peng, R.D.: Statistics. What is the question? Science, 347,
pp. 1314-1315, (2015).
[9] Mason, E., Jain, S., Kendall, M., Mostashari, F., Blumenthal, D.: The
regional extension center program: helping physicians meaningfully use health
information technology. Ann. Intern. Med., 153, pp. 666-670, (2010).
[10] Marlin, B.M., Zemel, R.S., Roweis, S.T., Slaney, M.: Recommender systems,
missing data and statistical model estimation. In: International Joint
Conference on Artificial Intelligence (IJCAI), vol. 22, pp. 2686, (2011).
[11] Mossialos, E., Wenzl, M., Osborn, R., Sarnak, D.: 2015 International
Profiles of Health Care Systems. The Commonwealth Fund, (2016).
[12] Parikh, R.B., Kakad, M., Bates, D.W.: Integrating predictive analytics
into high-value care: the dawn of precision delivery. JAMA, 315, pp. 651-652,
(2016).
[13] Pearl, J.: Theoretical Impediments to Machine Learning With Seven
Sparks from the Causal Revolution. arXiv preprint arXiv:1801.04016, (2018).
[14] Pencina, M.J., Peterson, E.D.: Moving from clinical trials to precision
medicine: the role for predictive modeling. JAMA, 315, pp. 1713-1714, (2016).
Chapter
The Internet of Things (IoT), which provides selected sub-categories of data with free access to a wealth of digital services, is able to seamlessly incorporate a wide range of heterogeneous end systems. Smart City is a vision aimed at integrating people residing in the cities with services that are essential and affect everyday life. Smart parking within all streets with multiple information and communications solutions is one such example of essential service. The Internet of Things (IoT) is a new and unique step forward to manage efficiently and effectively the parking system through its ability of smart compute intelligent parking systems. The most important reason for using IoT for parking is to collect the data on vehicle occupancy and use free parking spaces effectively. This paper proposes a prototype IoT-based real-time smart street parking system, with easy accessibility to appropriate data and provide solution to the people to locate free parking spot easily and effectively.
Article
Full-text available
Inferring causal effects of treatments is a central goal in many disciplines. The potential outcomes framework is a main statistical approach to causal inference, in which a causal effect is defined as a comparison of the potential outcomes of the same units under different treatment conditions. Because for each unit at most one of the potential outcomes is observed and the rest are missing, causal inference is inherently a missing data problem. Indeed, there is a close analogy in the terminology and the inferential framework between causal inference and missing data. Despite the intrinsic connection between the two subjects, statistical analyses of causal inference and missing data also have marked differences in aims, settings and methods. This article provides a systematic review of causal inference from the missing data perspective. Focusing on ignorable treatment assignment mechanisms, we discuss a wide range of causal inference methods that have analogues in missing data analysis, such as imputation, inverse probability weighting and doubly-robust methods. Under each of the three modes of inference--Frequentist, Bayesian, and Fisherian randomization--we present the general structure of inference for both finite-sample and super-population estimands, and illustrate via specific examples. We identify open questions to motivate more research to bridge the two fields.
Article
Full-text available
The US health care system is rapidly adopting electronic health records, which will dramatically increase the quantity of clinical data that are available electronically. Simultaneously, rapid progress has been made in clinical analytics-techniques for analyzing large quantities of data and gleaning new insights from that analysis-which is part of what is known as big data. As a result, there are unprecedented opportunities to use big data to reduce the costs of health care in the United States. We present six use cases-that is, key examples-where some of the clearest opportunities exist to reduce costs through the use of big data: high-cost patients, readmissions, triage, decompensation (when a patient's condition worsens), adverse events, and treatment optimization for diseases affecting multiple organ systems. We discuss the types of insights that are likely to emerge from clinical analytics, the types of data needed to obtain such insights, and the infrastructure-analytics, algorithms, registries, assessment scores, monitoring devices, and so forth-that organizations will need to perform the necessary analyses and to implement changes that will improve care while reducing costs. Our findings have policy implications for regulatory oversight, ways to address privacy concerns, and the support of research on analytics.
Article
Full-text available
Objective: To define a quantitative stratification algorithm for the risk of early-onset sepsis (EOS) in newborns ≥ 34 weeks' gestation. Methods: We conducted a retrospective nested case-control study that used split validation. Data collected on each infant included sepsis risk at birth based on objective maternal factors, demographics, specific clinical milestones, and vital signs during the first 24 hours after birth. Using a combination of recursive partitioning and logistic regression, we developed a risk classification scheme for EOS on the derivation dataset. This scheme was then applied to the validation dataset. Results: Using a base population of 608,014 live births ≥ 34 weeks' gestation at 14 hospitals between 1993 and 2007, we identified all 350 EOS cases <72 hours of age and frequency matched them by hospital and year of birth to 1063 controls. Using maternal and neonatal data, we defined a risk stratification scheme that divided the neonatal population into 3 groups: treat empirically (4.1% of all live births, 60.8% of all EOS cases, sepsis incidence of 8.4/1000 live births), observe and evaluate (11.1% of births, 23.4% of cases, 1.2/1000), and continued observation (84.8% of births, 15.7% of cases, incidence 0.11/1000). Conclusions: It is possible to combine objective maternal data with evolving objective neonatal clinical findings to define more efficient strategies for the evaluation and treatment of EOS in term and late preterm infants. Judicious application of our scheme could result in decreased antibiotic treatment in 80,000 to 240,000 US newborns each year.
Article
Objective: Electronic health records (EHRs) are an increasingly common data source for clinical risk prediction, presenting both unique analytic opportunities and challenges. We sought to evaluate the current state of EHR based risk prediction modeling through a systematic review of clinical prediction studies using EHR data. Methods: We searched PubMed for articles that reported on the use of an EHR to develop a risk prediction model from 2009 to 2014. Articles were extracted by two reviewers, and we abstracted information on study design, use of EHR data, model building, and performance from each publication and supplementary documentation. Results: We identified 107 articles from 15 different countries. Studies were generally very large (median sample size = 26 100) and utilized a diverse array of predictors. Most used validation techniques (n = 94 of 107) and reported model coefficients for reproducibility (n = 83). However, studies did not fully leverage the breadth of EHR data, as they uncommonly used longitudinal information (n = 37) and employed relatively few predictor variables (median = 27 variables). Less than half of the studies were multicenter (n = 50) and only 26 performed validation across sites. Many studies did not fully address biases of EHR data such as missing data or loss to follow-up. Average c-statistics for different outcomes were: mortality (0.84), clinical prediction (0.83), hospitalization (0.71), and service utilization (0.71). Conclusions: EHR data present both opportunities and challenges for clinical risk prediction. There is room for improvement in designing such studies.
Article
Evidence-based care has been defined as “the conscientious, explicit, and judicious use of current best evidence in making decisions about the care of individual patients.”1 However, until now, most treatments have been designed with a “one-size-fits-all” approach: useful for some patients but not helpful or even harmful for others.2 Analyses of clinical trials generally focus on summarizing overall average treatment effects without more deliberate investigation of which patients actually benefit. For example, if the number needed to treat using a new therapy is 50, then 50 individuals need to receive this treatment for 1 individual to benefit. But what characterizes that benefiting individual? Therapies can also both help and harm, successfully improving some outcomes while also placing patients at increased risk for other adverse events.
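The number-needed-to-treat example above can be made concrete with the standard definition in terms of absolute risk reduction. The symbols $p_{\text{control}}$ and $p_{\text{treated}}$ (event rates without and with the therapy) are notation introduced here for illustration; they do not appear in the original text.

```latex
\[
\mathrm{ARR} = p_{\text{control}} - p_{\text{treated}},
\qquad
\mathrm{NNT} = \frac{1}{\mathrm{ARR}}
\]
\[
\mathrm{NNT} = 50
\;\Longrightarrow\;
\mathrm{ARR} = \frac{1}{50} = 0.02
\]
```

An NNT of 50 therefore corresponds to an average benefit of 2 percentage points, which says nothing about which of the 50 treated patients is the one who benefits.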
Article
This Viewpoint discusses the use of electronic health record “big data” to integrate predictive analytics into clinical practice and future directions for using predictive analytics to achieve high-value health care. United States health care costs are twice as high as spending in most industrialized countries. One key opportunity for health systems to improve value is by limiting overuse of costly resources, in part by focusing these resources toward high-risk patient groups.1 Some health systems have been using retrospective claims data or other approaches, like the Framingham risk model, to identify high-risk individuals. However, most systems today are doing little in the way of risk stratification, and physicians often find it difficult to apply these characterizations of risk to the care of an individual patient.
Article
Chronic nonhealing wounds have a prevalence of 2% in the United States and cost an estimated $50 billion annually. Accurate stratification of wounds for risk of slow healing may help guide treatment and referral decisions. We have applied modern machine learning methods and feature engineering to develop a predictive model for delayed wound healing that uses information collected during routine care in outpatient wound care centers. Patient and wound data were collected at 68 outpatient wound care centers operated by Healogics Inc. in 26 states between 2009 and 2013. The dataset included basic demographic information on 59,953 patients, as well as both quantitative and categorical information on 180,696 wounds. Wounds were split into training and test sets by randomly assigning patients to training and test sets. Wounds were considered delayed with respect to healing time if they took more than 15 weeks to heal after presentation at a wound care center. Eleven percent of wounds in this dataset met this criterion. Prognostic models were developed on training data available in the first week of care to predict delayed healing wounds. A held-out subset of the training set was used for model selection, and the final model was evaluated on the test set for discriminative power and calibration. The model achieved an area under the curve of 0.842 (95% confidence interval 0.834-0.847) for the delayed healing outcome and a Brier reliability score of 0.00018. Early, accurate prediction of delayed healing wounds can improve patient care by allowing clinicians to increase the aggressiveness of intervention in patients most at risk.
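The patient-level split described in this abstract (assigning patients, not individual wounds, to the training and test sets) might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the record schema (`patient_id`, `wound_id`), the function name `split_by_patient`, the 20% test fraction, and the toy data are all hypothetical, since the abstract does not give them.

```python
import random


def split_by_patient(wounds, test_fraction=0.2, seed=0):
    """Split wound records so that each patient's wounds stay together.

    `wounds` is a list of dicts with a 'patient_id' key (assumed schema).
    The test fraction is an assumption; the abstract does not state it.
    """
    patient_ids = sorted({w["patient_id"] for w in wounds})
    rng = random.Random(seed)
    rng.shuffle(patient_ids)

    # Reserve a share of patients (not wounds) for the test set.
    n_test = max(1, round(len(patient_ids) * test_fraction))
    test_patients = set(patient_ids[:n_test])

    train = [w for w in wounds if w["patient_id"] not in test_patients]
    test = [w for w in wounds if w["patient_id"] in test_patients]
    return train, test


# Toy data: 10 hypothetical patients with 3 wounds each.
wounds = [{"wound_id": i, "patient_id": i // 3} for i in range(30)]
train, test = split_by_patient(wounds)
```

Splitting on patients keeps every wound from one person on the same side of the split, so the test set does not contain wounds from patients the model has already seen during training.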
Article
Mistaking the type of question being considered is the most common error in data analysis. Copyright © 2015, American Association for the Advancement of Science.
Conference Paper
The goal of rating-based recommender systems is to make personalized predictions and recommendations for individual users by leveraging the preferences of a community of users with respect to a collection of items like songs or movies. Recommender systems are often based on intricate statistical models that are estimated from data sets containing a very high proportion of missing ratings. This work describes evidence of a basic incompatibility between the properties of recommender system data sets and the assumptions required for valid estimation and evaluation of statistical models in the presence of missing data. We discuss the implications of this problem and describe extended modelling and evaluation frameworks that attempt to circumvent it. We present prediction and ranking results showing that models developed and tested under these extended frameworks can significantly outperform standard models.