Weather-Based Solar Energy Prediction
Marcin Detyniecki
Université Pierre et
Marie Curie
LIP6 UPMC CNRS
4 place Jussieu, 75005
Paris, France
Marcin.Detyniecki@lip6.fr
Christophe Marsala
Université Pierre et
Marie Curie
LIP6 UPMC
4 place Jussieu, 75005
Paris, France
Christophe.Marsala@lip6.fr
Ashwati Krishnan
Dept. of Elect. & Comp.
Engineering (ECE)
Carnegie Mellon
University
Pittsburgh PA USA
ashwatik@andrew.cmu.edu
Mel Siegel
The Robotics Institute
Carnegie Mellon
University
Pittsburgh PA USA
mws@cmu.edu
Abstract: Photovoltaic solar panels are effective energy sources
during periods of bright sunlight. Excess energy can be stored for
later use at night or on cloudy days. The decision to use the
stored energy now or later depends largely on being able to
predict the weather on different timescales. Short term
prediction of stored energy is challenging due to the non-trivial
I-V characteristic of the solar cell. The erratic nature of the
weather makes long term predictive energy management
difficult. In this paper, we address these issues based on data
collected from a solar panel, as well as its relationship to
observations made of the weather. We observe that prediction,
based on fuzzy decision trees, reduces the energy error by 22%
compared to a constant prediction equal to the average on the
studied period. Thus, exploiting the fuzzy classification provided
by a fuzzy decision tree is a good improvement compared to the
baseline.
Keywords: solar energy; photovoltaic; power utilization
planning; weather; energy prediction; fuzzy decision trees.
I. INTRODUCTION
The availability of solar energy is not guaranteed at any
particular place or time: it depends, of course, on time-of-day,
but also on the weather conditions that prevail and that
prevailed recently. Since meteorological agencies provide
detailed weather forecasts round-the-clock, we should be able
to use their predictions to our advantage in planning activities
that require solar energy. Three interesting questions are
apparent: (1) given the standard weather forecasts available
today, can we reliably predict the energy we will be able to
capture tomorrow? (2) given our own measured actual
insolation and other local weather conditions right now, to
what extent can we make that prediction? and (3) what is the
optimal approach to fusion of these two prediction sources?
The answers to these questions are not only of academic
interest but also of crucial practical importance [6].
Contemporary solar panels are series-arrays of silicon
photovoltaic cells that are essentially large-area silicon pn-
junctions. Incident optical photons promote electrons from the
valence to the conduction band. The band-gap voltage across
the junction capacitance thus has the potential to drive DC
current through an external load. The cell's open-circuit voltage
is essentially the band-gap. Its short-circuit current depends -
not necessarily simply - on the incident optical power. Their
ratio is the internal impedance to which an external load must
be matched to achieve maximum energy transfer to the load.
Optimally extracting short-term power and long-term
energy is thus a complicated business that requires active real-
time control intelligently based on knowledge of present
requirements and an ability to predict and plan for future
requirements [7], [8].
This paper's main goal is to show how a fuzzy prediction method,
in particular Fuzzy Decision Trees (FDTs), can improve energy
prediction accuracy. For this early attempt at estimating the
energy gain under real conditions, we have chosen FDTs over other
approaches such as neural networks or other regression
techniques, because FDTs produce human-understandable rules
that will allow us, in the future, to improve the system. In fact,
not only are the relevant variables automatically identified, but
so are their interactions. Moreover, FDTs can handle symbolic
attributes (here weather classes such as cloudy, sunny,
thunderstorm) and numerical ones (such as temperature)
simultaneously.
To work under real conditions we used a standard solar
panel for home use, described in Section II. We placed the
panel in real conditions and collected I-V data with a dedicated
electronic apparatus, together with weather conditions and
forecasts from the national service over the Internet, as presented
in Section III. Section IV briefly presents Fuzzy Decision Trees
and how training and testing were performed.
Sections V and VI are dedicated, respectively, to data and
results analysis.
II. SOLAR PANELS
Solar cells are connected in series to build solar modules or
panels. Panels generally consist of 28 to 36 cells in series to
produce 12VDC under defined illumination conditions. An
ideal solar panel current-voltage (I-V) curve is shown in Figure
1.1. For any real panel there is a continuous family of these
curves wherein open-circuit voltage increases with illumination
level and current-droop increases with decreasing illumination.
Thus optimum transfer of solar power to an external load
requires matching the load impedance to the illumination level.
Figure 1.2 shows a family of I-V curves for our solar panel
collected during a 5-hour period when insolation was changing.
Notice especially the variations in curve scale and shape, and,
based on the teaching of Figure 1.1, the consequent variation of
available power and optimum load to extract it.
Figure 1.1: Ideal solar power panel. Isc is the short circuit current (when
load resistance RL=0) and VOC is the open circuit voltage (when RL = ∞).
The black curve is the I-V characteristic, the gray curve is the power
available to an external load (the IV product), and the dashed-blue load
line finds the operating point (V,I) - marked by the red cross - at which the
load will extract maximum possible power from the panel.
A. Getting the maximum power out of a panel (MPPT)
Consider a system where the load is connected directly
across the solar panel. Its maximum power point (MPP) is the
point on the I-V curve where the product of voltage and current,
i.e., the area of the rectangle under the operating point, is
maximum, as shown in Figure 1.1. For optimal simplicity and
efficiency one should choose a solar panel that perfectly
matches the intended load. But this is not possible: the I-V
curve - hence the MPP - changes with illumination. It also
changes with panel temperature, which also depends in part on
illumination. Active measuring and switching power
converters, called maximum power point trackers (MPPT), can
switch the load so as to keep the operating point at the MPP.
Several solutions, in particular based on fuzzy control, have
been proposed [11] and are still under investigation [9]. A
complete comparison can be found in [10].
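As a small worked illustration of the MPP, the sketch below selects, from one sweep of load measurements, the sample that delivers the most power. It assumes that each sample is a (voltage, resistance) pair and that current is derived by Ohm's law; the function name and the numeric sweep are hypothetical and not taken from the measured data.

```python
def max_power_point(samples):
    """Return (power_W, voltage_V, current_A, resistance_ohm) of the best sample.

    samples: iterable of (voltage_V, resistance_ohm) pairs measured across the load.
    """
    best = None
    for voltage, resistance in samples:
        current = voltage / resistance   # Ohm's law: I = V / R
        power = voltage * current        # P = V * I
        if best is None or power > best[0]:
            best = (power, voltage, current, resistance)
    return best


# Hypothetical sweep (values invented for illustration only).
sweep = [(4.0, 5.0), (10.0, 20.0), (14.0, 40.0), (16.0, 80.0), (17.0, 155.0)]
mpp = max_power_point(sweep)
```

With these invented values the best sample is the 20 ohm load; a different illumination level would move the optimum to a different load.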
On the one hand this is simple, on the other hand it is
daunting. If all we want to do is, say, toast bread, then it is easy
enough to switch the resistance of the heating element; the
toasting time changes with illumination level, but within
reasonable limits we still make toast. But for the vast majority
of practical loads - "appliances" - it is impossible to flexibly
and efficiently trade off voltage rating and current demand. We
thus anticipate a critical near-future demand for active power
converters that will accommodate a plausible range of
fluctuating DC input voltages and deliver stable standard DC or
AC output voltages without incurring unacceptable losses [12].
Note that the control algorithm required for MPPT is non-
trivial. The MPP is not known a priori, and it moves with
variations in illumination and temperature. In practice perturb-
and-observe (P&O) algorithms are employed [10], despite the
objection-in-principle that when the system is actually
optimized any perturbation is guaranteed to reduce efficiency.
Clearly the scale of the integral term in the control algorithm is
crucial, and should itself be dynamic, as the system needs on
the one hand to respond rapidly to fast changes in illumination
level, e.g., passing clouds, and on the other hand it must not
spend too much efficiency hunting when conditions are
changing only slowly, e.g., on cloudless days.
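The hunting behaviour mentioned above can be seen in a minimal perturb-and-observe sketch. This is a generic fixed-step variant written for illustration, not the algorithm of any particular commercial tracker; the toy power curve, the step size, and all names are assumptions.

```python
def perturb_and_observe(power_at, v_start, step, iterations):
    """Hill-climb the operating voltage toward the maximum power point.

    power_at: callable mapping an operating voltage to the delivered power.
    """
    voltage = v_start
    power = power_at(voltage)
    direction = 1.0
    for _ in range(iterations):
        candidate = voltage + direction * step
        candidate_power = power_at(candidate)
        if candidate_power < power:
            direction = -direction   # power dropped: reverse the next perturbation
        voltage, power = candidate, candidate_power
    return voltage


def toy_curve(v):
    """Hypothetical power curve: a parabola peaking at 12 V."""
    return 50.0 - (v - 12.0) ** 2


v_final = perturb_and_observe(toy_curve, v_start=8.0, step=0.5, iterations=40)
```

Once the peak is reached, the operating point keeps oscillating within one step of it, which is the efficiency cost of a fixed perturbation size; a smaller step hunts less but responds more slowly to passing clouds.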
Figure 1.2: I-V of our solar panel at four times on a typical day. Load
resistors between 5 and 155 ohms in 5 ohm steps are switched in random
order across the panel, voltage across the load is measured and current is
calculated from voltage and resistance. Open-circuit voltage is also
measured and recorded as the zero-current value. A diagrammatic
representation of the setup is shown and explained in Figure 3.1.
III. SOLAR ENERGY UNDER REAL WEATHER CONDITIONS
The National Renewable Energy Laboratory recommends
that solar panels be characterized under standard test conditions
(STC): temperature 25 °C and illumination 1000 W/m2 (1.0 sun)
with an air mass 1.5 (AM1.5) filtered solar spectrum. The idea
is to match the illumination and spectrum of sunlight incident
on a clear day on a sun-facing 37°-tilted surface with the sun at
an angle of 41.81° above the horizon. This condition - with the
panel aimed directly at the sun - geometrically approximates
solar noon near the spring and autumn equinoxes in the
continental United States. However insolation at the earth's
surface is rarely as large as the prescribed 1000 W/m2. And, as
already noted, to realistically study electrical energy generation
under realistic weather conditions, realistic fluctuations in
lighting and temperature must be observed. Note also that a
panel that is optimal in the NREL environment is almost
certainly suboptimal in any natural environment. So to study
solar energy production with practical goals under natural
weather conditions it is advisable to combine the solar panel
with an MPPT. But there are many such commercial devices,
each one running some undisclosed proprietary algorithm, none
of them arguably best or even in any sense standard. Thus we
elect to organize our measurements in a way that allows us to
simulate an ideal MPPT algorithm; that is, we collect all possible
data first and compute the real optimal point after the fact.
Figure 3.1: Hardware diagram. From left to right: solar panel, serial load resistor array in series with disconnect-relay (to measure panel's open-circuit
voltage), transistor-buffered resistor-shorting relays, voltage divider, Arduino (performing analog voltage input measurement and controlling
load-resistor-shorting relays).
A. Instrumentation
We studied the response of our solar panel - approximately
32 cm x 60 cm, so approximately 0.19 m2 - using a simple
single-board data acquisition system in communication with a
dedicated laptop computer that is in turn in communication
with the internet. Our panel is an off-the-shelf unit mounted at
a tilt-angle of approximately 40° outside an approximately
south-facing window with a reasonably clear view of the sun's
path most of the day, most of the year. The panel's pointing and
tilting are probably never perfectly optimal, but are a good
compromise that receives better-than-average solar radiation
throughout the year. Data acquisition and control are provided
by an Arduino Duemilanove (2009) [3], a low-cost easy-to-
program open-design board that provides convenient access to
the ATMega168 microcontroller's digital I/O, 10-bit analog
input, PWM output, and serial communication pins. A program
written in a C-like language using a simple API on a PC is
more-or-less invisibly compiled and downloaded via a USB
channel on which data are subsequently also returned. Digital
output pins are transistor-buffered and diode-protected to safely
switch the coils of relays that short-out a series-array of {5, 10,
20, 40, 80} ohm power resistors to provide 5 to 155 ohm load
in 5 ohm steps - plus open-circuit - across the solar panel. A
measurement sequence is initiated and recorded every 10
minutes. Independently but also every 10 minutes, a USB
webcam captures a sky picture. The data files and sky pictures
are stored "in the cloud" using Dropbox [2]. As a practical
matter, the ATMega168's ADC's rudimentary analog input
circuitry and 10-bit resolution do not provide precise or
accurate measurements. But they do appear to be stable, which
is all that is really required for the present experiments,
wherein we are interested primarily in reaching qualitative
conclusions. Of course, since the measurements do seem to be
stable, after-the-fact calibration can be undertaken if
subsequently it seems valuable.
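The binary-weighted resistor array can be checked with a short sketch: five relays, each shorting one of the series resistors, reach every load from 5 to 155 ohms in 5 ohm steps. The resistor values come from the text above; the function names and the relay-pattern encoding are assumptions made for illustration.

```python
from itertools import product

RESISTORS_OHM = (5, 10, 20, 40, 80)   # series power resistors, from the text


def load_resistance(shorted):
    """Total series resistance; shorted[i] is True when resistor i is bypassed."""
    return sum(r for r, s in zip(RESISTORS_OHM, shorted) if not s)


def all_loads():
    """Map each reachable non-zero load value to the relay pattern producing it."""
    loads = {}
    for pattern in product((False, True), repeat=len(RESISTORS_OHM)):
        value = load_resistance(pattern)
        if value > 0:                 # skip the pattern that bypasses every resistor
            loads[value] = pattern
    return loads


def load_current(voltage_v, resistance_ohm):
    """Current inferred from the measured load voltage by Ohm's law."""
    return voltage_v / resistance_ohm


loads = all_loads()
```

Because the values double from one resistor to the next, each of the 31 non-zero loads corresponds to exactly one relay pattern.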
B. Weather Forecast
The solar panel and its Arduino-plus-Windows-laptop
based monitoring system are located at an off-campus location,
which is secure and has a good south-looking view with a large
open-sky solid-angle. On campus another Windows PC that has
reliable access to the Internet periodically downloads present
and predicted weather information from the National Oceanic
and Atmospheric Administration (NOAA) through the Yahoo!
Weather RSS Feed [1], in the form of XML files. Since
weather conditions tend to vary slowly, we recorded the
weather conditions every hour, every day. In order to be able to
match forecast with current condition, we used the 48 standard
categories provided by the weather service. To minimize the
prediction error, we chose here to use the forecast just before
sunrise. Other more complicated methods could take into
account the evolution or tendency of the forecast.
C. Data aggregation
The question of how to aggregate the data may seem simple
at first sight, but it is in fact extremely complex. We choose to
work on a one-full-day basis, because it provides a natural,
regular cycle. Future work could address energy prediction
with a shorter or a longer time horizon. Hence to compute the
energy produced by the panel over one day we need to start
from the power measurement obtained every 10 minutes. Our
first step consists of choosing from each of these series the
maximum power. In this way we simulate an ideal MPPT.
Then under the assumption that everything remains equal for
the following ten minutes we integrate over the whole day to
obtain the total energy produced. The assumption introduces an
error for quickly changing conditions (as for instance a sunny
day with some clouds). In fact, the measurement could have
been done when the cloud is just over the panel. We believe
that the introduced error averages out because of the frequency
and the uniform nature of the sampling. In fact, if there are a lot
of clouds, more often than not the measurement will be done
under reduced illumination approximately proportional to the
average coverage. Since weather conditions fluctuate during
the day, to obtain a global “for the day” weather classification,
we choose to aggregate by majority vote all the classifications
of the National Weather Service reported during the daylight
hours of that day. In other words, we choose to label the day
based on the most frequent NWS classification; and we focus
our attention only on the hours when there should be light
(between sunrise and sunset). So, if it rains for only one hour
during the day and it was, for the rest, a sunny day, it is labeled
as a sunny day (notice that this is not the case for weather
services). Although the solar panel data, the sky pictures, and
the downloaded weather data are not perfectly synchronized,
for the purpose and nature of the experiments described their
imprecise - and occasionally inconsistent - alignment is
inconsequential.
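The two aggregation steps described in this section can be sketched as follows, under the stated assumption that power stays constant for the ten minutes after each sweep. The function names and the example data are hypothetical; ties in the majority vote are resolved here in favour of the label seen first, a choice the text does not specify.

```python
from collections import Counter

SAMPLE_PERIOD_H = 10 / 60   # one measurement sweep every 10 minutes


def daily_energy_wh(sweeps):
    """Energy over a day, in Wh.

    sweeps: list of sweeps, each a list of powers (W) measured across the loads.
    The maximum of each sweep stands in for an ideal MPPT.
    """
    return sum(max(powers) for powers in sweeps) * SAMPLE_PERIOD_H


def majority_label(daylight_labels):
    """Most frequent weather label among the daylight-hour observations."""
    return Counter(daylight_labels).most_common(1)[0][0]


# Hypothetical data: one hour of sweeps and ten daylight observations.
energy = daily_energy_wh([[1.0, 6.0, 3.0]] * 6)
label = majority_label(["Fair (day)"] * 9 + ["Showers"])
```

In the second example a single rainy hour does not change the label of an otherwise fair day, which is the behaviour described above.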
IV. ENERGY AND WEATHER PREDICTION
Based on the data described above the challenge is to
predict, before the sun rises, the energy that will be
produced during that day. All methods can be grouped in two
large families: the direct ones, where the energy value is
computed by a “black-box” algorithm (usually regression-like,
as for instance neural networks [7] [8]), and the indirect
ones, where first a weather class is predicted and, based on it,
an average energy is predicted.
In this paper, we choose to explore the performance of the
latter. This approach allows using the power of the national
weather forecast services, without any further modifications, to
predict the energy. The general formulation has the advantage
of opening the range of possible algorithms that can be used. In
particular, we choose here to use Fuzzy Decision Trees, which
are not only able to deal with symbolic and numerical classes
simultaneously, but also provide an explanation to the
prediction.
A. Fuzzy Decision Trees
Fuzzy decision trees (FDTs) are an extension of classical
decision trees. They have been introduced in machine learning
to handle training sets that contain numerical and/or fuzzy
values [13] [14] [15]. Moreover, such trees introduce a soft
classification of examples that leads to a smoother decision.
Thus, degrees of decision and degrees of membership to
classes are provided as a result of a classification by means of
the FDTs.
The construction of a FDT from a training set T = {e1,...,en}
is based on the well-known ID3 [16] or CART [17] algorithms.
A fuzzy decision tree is built from its root to its leaves
by sequentially partitioning T into subsets. Each partition is
obtained from a comparison on the values of a selected
attribute. This comparison makes up a node of the tree.
Let each example ei from T be described by means of a set of
values for attributes A = {A1, ... , Am}, where each attribute Aj
can take a fuzzy, numerical, or symbolic value vjl in the set
{vj1,..., vjm}. An example's description is an m-tuple of
attribute-value pairs (Aj, vjl). Each description is associated with
a class ck from C = {c1,..., cK} to make up the training example ei.
A fuzzy value vjl is associated with a membership function µvjl,
defined on T, that associates with each ei of T the degree to which
it has the value vjl. Similarly, each ck is supposed to be associated
with a membership function µck.
At each step of the construction of the FDT, an attribute is
selected by means of a measure of discrimination, for instance,
the well-known Shannon entropy from Information theory [16],
[17], that orders the attributes according to their increasing
correlation to the C in the local training subset. The
discrimination power of each attribute is valued with regard to
the classes [18]. The attribute with the highest discriminating
power is selected to construct a node. Well-known fuzzy
measures of discrimination are the fuzzy entropy (that is an
extension of the Shannon entropy to fuzzy events) [15], and the
measure of ambiguity [13]. A new measure, the gradual
discrimination measure, has been introduced in [19]. This
measure is interesting in our case because it values the
discrimination power of the values of an attribute with regard
to the values of the class and takes into account a monotonic
relation between these values if one exists (see [19] for a full
explanation of that measure).
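To make the attribute-selection step concrete, the sketch below computes the classical (crisp) Shannon-entropy information gain used by ID3. It is only an illustration of the selection principle: it is neither the fuzzy entropy nor the gradual discrimination measure of [19], and the attribute names and examples are invented.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(examples, attribute, target):
    """Entropy reduction on `target` obtained by splitting on `attribute`."""
    base = entropy([e[target] for e in examples])
    remainder = 0.0
    for value in {e[attribute] for e in examples}:
        subset = [e[target] for e in examples if e[attribute] == value]
        remainder += len(subset) / len(examples) * entropy(subset)
    return base - remainder


# Hypothetical training examples (invented for illustration).
examples = [
    {"forecast": "fair", "temp": "warm", "energy": "high"},
    {"forecast": "fair", "temp": "cool", "energy": "high"},
    {"forecast": "showers", "temp": "warm", "energy": "low"},
    {"forecast": "showers", "temp": "cool", "energy": "low"},
]
gain_forecast = information_gain(examples, "forecast", "energy")
gain_temp = information_gain(examples, "temp", "energy")
```

In this toy set the forecast separates the classes perfectly while the temperature carries no information, so the forecast attribute would be selected for the node.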
The aim of a FDT is to classify any forthcoming example,
not necessarily present in T. To classify an example e, paths in
the FDT are followed from the root to the leaves of the tree,
according to the values of the attributes of the description of e.
At each node of a path, a membership degree for e is valued
depending on the value of e for the attribute present in the
node and the fuzzy values that label vertices going out of that
node. On a path, all the membership degrees valued from the
root to the leaf are aggregated by means of a conjunctive operator
(typically, a t-norm). The membership degrees for e obtained
for all the leaves of the FDT are aggregated by means of a
disjunctive operator (typically, a t-conorm). This yields
a membership degree for e to belong to each class c according
to the FDT. Various pairs of t-norms and t-conorms can be
used to aggregate the membership degrees. The most classical
ones are the Zadeh operators (minimum, maximum) and the
Lukasiewicz operators. More details can be found in [15]. A
FDT can also be used as a crisp decision tree: the alpha-cuts of
level 0.5 of each fuzzy membership function are used to
replace the fuzzy sets. Such crisp use of a FDT enables the tree
to produce a single, non-fuzzy class as the result of the
classification of an example.
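The classification step can be sketched as follows, assuming the tree has already been flattened into its paths and that the membership degrees of the example at each node are known. The path representation and the numeric degrees are hypothetical; only the operators (Zadeh minimum and maximum, Lukasiewicz t-norm and t-conorm) are the standard ones named above.

```python
from functools import reduce


def t_min(a, b):
    """Zadeh t-norm."""
    return min(a, b)


def s_max(a, b):
    """Zadeh t-conorm."""
    return max(a, b)


def t_lukasiewicz(a, b):
    """Lukasiewicz t-norm."""
    return max(0.0, a + b - 1.0)


def s_lukasiewicz(a, b):
    """Lukasiewicz t-conorm."""
    return min(1.0, a + b)


def classify(paths, t_norm, t_conorm):
    """Return {class: membership degree} for one example.

    paths: list of (membership_degrees_along_path, class_at_leaf).
    """
    result = {}
    for degrees, label in paths:
        path_degree = reduce(t_norm, degrees)            # conjunction along a path
        result[label] = t_conorm(result.get(label, 0.0), path_degree)  # disjunction over leaves
    return result


# Hypothetical degrees for three paths of a small tree.
paths = [([0.75, 1.0], "high"), ([0.25, 0.5], "low"), ([0.25, 0.5], "low")]
zadeh = classify(paths, t_min, s_max)
lukasiewicz = classify(paths, t_lukasiewicz, s_lukasiewicz)
```

The two operator pairs can give different degrees for the same example, which is why both are compared in the experiments.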
B. Baseline prediction
In order to measure the improvement obtained by our
method, we need to define a distance measure and a baseline.
To assess the extent to which we can predict the energy
production of a solar panel, we calculate the average of the
absolute values of the differences between the predicted energy
and the observed energy for the proposed models.
To enrich the analysis we propose three baselines:
(1) Constant average prediction: we assume that the
average energy for a region and for a period of time
can be perfectly predicted, but is constant for the whole
period. To achieve this we compute, after the fact, the
average energy observed during the whole period.
Notice that this is an ideal point that cannot be
achieved under real prediction conditions. A constant chosen
in advance, without access to the observed data, would in
general yield a larger energy distance.
(2) Energy tomorrow equals the energy produced today:
this is a standard method used for time series and in
particular in weather forecasting.
(3) Pure weather forecast based prediction: we propose to
use the weather forecast as the predicted energy class.
This approach corresponds to the natural way we
would address the problem: “If today is going to be
sunny and on a sunny day we produce on average
energy E then today we should observe energy E.”
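The distance measure and the three baselines can be written down in a few lines. This is a minimal sketch with hypothetical function names and invented energies; the persistence baseline here simply skips the first day, for which no previous observation exists, which is an assumption the text does not spell out.

```python
def mean_absolute_error(predicted, observed):
    """Average absolute difference between predicted and observed energy."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


def constant_average_baseline(observed):
    """Predict every day with the average computed after the fact."""
    mean = sum(observed) / len(observed)
    return [mean] * len(observed)


def persistence_baseline(observed):
    """Predict each day with the previous day's energy.

    Returns (predictions, matching_targets); the first day is skipped.
    """
    return observed[:-1], observed[1:]


def class_average_baseline(forecast_labels, class_mean_energy):
    """Predict each day with the average energy of its forecast class."""
    return [class_mean_energy[label] for label in forecast_labels]


# Hypothetical daily energies in Wh (invented for illustration).
observed = [400.0, 200.0, 300.0, 100.0]
mae_constant = mean_absolute_error(constant_average_baseline(observed), observed)
predictions, targets = persistence_baseline(observed)
mae_persistence = mean_absolute_error(predictions, targets)
mae_forecast = mean_absolute_error(
    class_average_baseline(["fair", "cloudy", "fair", "cloudy"],
                           {"fair": 350.0, "cloudy": 150.0}),
    observed,
)
```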
TABLE I. ENERGY BASED ON OBSERVED CURRENT CONDITIONS (APRIL - JULY 2010)
Majority Weather        Nr of Days   Watt-hr   Std Deviation
Fair (day)              38           449.6     119.4
Partly cloudy (day)     12           396.9     136.3
Mostly cloudy (day)     8            261.6     114.8
Cloudy                  14           149.1     96.2
Showers                 5            70.2      122.6
Globally                77           342.6
V. DATA ANALYSIS
Between end of April and beginning of July 2010, we
collected data for 77 successive days. The average energy
produced per day was 342 watt-hour with standard deviation
178 W-h. Table I shows that roughly half the days are “fair”
and half are “cloudy” or “rainy”. As expected, “fair” days tend
to produce more energy than “partly cloudy”, which are better
than “mostly cloudy”, “cloudy”, and “shower” days in that
order. This conformity of semantic and energetic descriptions
gives us confidence that our model, and in particular the
majority aggregation process, are suitable. The variability of
the daily energy production is rather large, but more or less
constant for each category.
The accuracy of the weather prediction for the studied
period, using the standard set of categories, was 60% (of
correct prediction at sunrise for the day). This surprisingly
small proportion can be explained by two phenomena: aversion
to risk in the prediction and mismatch of categories. In Table
II, which shows the number of forecast weather conditions, we
can observe a shift towards an increased number of rainy days
(predictions). We observed 5 “shower” days, but 37 “showers”
or “thunderstorms” predictions. This discrepancy may come
from the aversion to risk of the weather forecaster. In fact, if it
should rain for only an hour in a day the weather forecast will be
“rainy day”. But our majority observation, suitable for the
energy prediction, would be sunny day, with consequent
category mismatch. Moreover, by comparing labels in Tables I
and II, we notice that the number and labeling of categories
differ in the two sets, thus more-or-less guaranteeing
mismatches. Some labels appearing in the forecast do not appear
in the current weather observations, and vice versa. For instance
there are no “cloudy” predictions and no “sunny” observations.
This reveals an
even more profound and structural problem: class boundaries
are fuzzy. In fact, if we predict “mostly cloudy” and we
observe “partly cloudy” it will be considered a mismatch. New
weather classes could be created by grouping labels, as for
instance “cloudy” with “partly cloudy” in an “overcast” class;
but preliminary work showed that the prediction accuracy does
not improve, because the descriptions then become too vague
or arbitrary.
TABLE II. ENERGY BASED ON FORECASTED CONDITIONS (APRIL - JULY 2010)
Forecast at Sunrise       Nr of Days   Watt-hr   Std Deviation
Sunny                     12           533.3     54.5
Fair (day)                10           489.2     82.2
Partly cloudy (day)       16           381.5     151.5
Mostly cloudy (day)       2            239       103
Showers                   8            98.8      109.6
Isolated thunderstorms    7            380.3     79.7
Scattered thunderstorms   19           242.8     135.4
Thunderstorms             3            146.7     134
Globally                  77           342.6     94.4
Improved weather forecast based prediction: To increase
the prediction quality in light of the mismatches described above,
forecast labels with no counterpart among the observations (e.g.,
no sunny day observation) were manually
matched to the closest class: sunny to fair, any thunderstorms
type to showers, etc.
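The manual realignment amounts to a small lookup table. The sketch below contains only the two mappings stated above (sunny to fair, any thunderstorm type to showers); the remaining manual matches are not listed in the text, so every other label is passed through unchanged here, which is an assumption.

```python
# Only the mappings named in the text; label spellings follow Tables I and II.
FORECAST_TO_OBSERVED = {
    "Sunny": "Fair (day)",
    "Isolated thunderstorms": "Showers",
    "Scattered thunderstorms": "Showers",
    "Thunderstorms": "Showers",
}


def align_label(forecast_label):
    """Map a forecast label to the closest observed class, if a mapping is known."""
    return FORECAST_TO_OBSERVED.get(forecast_label, forecast_label)
```

For example, `align_label("Scattered thunderstorms")` returns `"Showers"`, while `align_label("Partly cloudy (day)")` is returned unchanged.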
VI. RESULTS
Table III shows the energy prediction difference. By
assuming that a solar panel produces more-or-less the same
energy every day (constant prediction, baseline) we observe an
average discrepancy of 152 W-hr compared with what is really
observed. If we use the naïve model that assumes that
tomorrow's energy is equal to what was observed today, we
observe that the predicted-observed difference increases. This
suggests that the energy tends to change rather quickly and that
the constant assumption is a good baseline that is not easy to beat.
If we focus our attention on the improved (with manual
match of fuzzy classes) method based only on the weather
forecast, we observe a reduction of 12% with respect to the
constant average estimation.
We used the fuzzy decision trees to predict the energy. In
this approach, we use the Salammbô software [15] to build a
FDT from the whole dataset. From a training set, the
Salammbô software provides us with a FDT with fuzzy set
values that label vertices going from a node associated with a
numerical attribute.
Numerical attributes are automatically discretized (as a
fuzzy partition) by means of the software, at each step of
selection of an attribute to build a node of the tree. Attributes to
build nodes of the FDT are selected by means of a
discrimination measure [18]. In this experiment, we use the
gradual discrimination measure introduced in [19]. The
predicted energy class has been discretized in 4 intervals, from
0 (0 to 180 W-hr) to 3 (greater than 500 W-hr). The classification
of an example by means of the FDT provides a set of membership
degrees to each of the intervals that define the class. In order to
obtain the predicted energy of the example, median values of each
interval weighted by the corresponding membership degrees
are aggregated to provide a predicted energy.
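A possible reading of this aggregation step is a membership-weighted average of one representative value per interval, sketched below. The interval bounds (180, 370, 500 W-hr) come from the text; the midpoints used for classes 0 to 2, the value 550 W-hr chosen for the open-ended class 3, and the normalization by the sum of the degrees are all assumptions made for illustration.

```python
# Representative energy (Wh) per class: midpoints of 0-180, 180-370, 370-500,
# and an assumed 550 Wh for the open-ended class "greater than 500".
CLASS_REPRESENTATIVE_WH = {0: 90.0, 1: 275.0, 2: 435.0, 3: 550.0}


def predicted_energy_wh(memberships):
    """Membership-weighted average of the class representative values.

    memberships: {class_index: membership degree} as returned by the FDT.
    """
    total = sum(memberships.values())
    if total == 0:
        raise ValueError("no class has a non-zero membership degree")
    weighted = sum(CLASS_REPRESENTATIVE_WH[c] * m for c, m in memberships.items())
    return weighted / total


energy_mixed = predicted_energy_wh({1: 0.5, 2: 0.5})
energy_crisp = predicted_energy_wh({2: 1.0})
```

An example that belongs equally to classes 1 and 2 is thus predicted halfway between their representative values, whereas a crisp output can only return one of the four values.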
The FDT constructed from the whole training set (77
examples) is composed of 38 paths, with a maximum of 7
nodes on a path, and an average number of 5.1 nodes on a
path. Some instances of paths are:
If the majority weather at sunset is mostly cloudy, and if
the temperature max is lower¹ than 20 then the predicted
energy ranges from 370 to 500 (class 2).
If the majority weather at sunset is cloudy or showers, and
if the temperature min is greater² than 9 and the weather
at sunrise is fair then the predicted energy ranges from
180 to 370 (class 1).
We recall that a path in a FDT is equivalent to a fuzzy rule:
the premise of the rule is composed of the attribute values that
pertain to the path, and the conclusion of the rule is the value
of the class present in the leaf of the path.
We investigate the validity of this approach by means of a
leave-one-out experiment with the whole collected data set.
Results are presented in Table III.
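The leave-one-out protocol can be sketched generically: each day is held out in turn, a model is built from the remaining days, and the absolute error on the held-out day is recorded. The `train` and `predict` callables below are placeholders; a trivial mean predictor stands in for the FDT learner, which is not reproduced here.

```python
def leave_one_out_mae(examples, train, predict):
    """Mean absolute error over all held-out examples.

    examples: list of (features, target_energy) pairs.
    train: callable building a model from a list of examples.
    predict: callable (model, features) -> predicted energy.
    """
    errors = []
    for i, (features, target) in enumerate(examples):
        model = train(examples[:i] + examples[i + 1:])   # all days except day i
        errors.append(abs(predict(model, features) - target))
    return sum(errors) / len(errors)


def train_mean(training):
    """Stand-in learner: remembers the mean target of the training days."""
    return sum(target for _, target in training) / len(training)


def predict_mean(model, features):
    """Stand-in predictor: ignores the features and returns the stored mean."""
    return model


# Hypothetical three-day data set (invented for illustration).
days = [((), 100.0), ((), 200.0), ((), 300.0)]
loo_error = leave_one_out_mae(days, train_mean, predict_mean)
```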
With a crisp use and a crisp output of the FDT, the FDT
produces a single class as output. In that case, we can
observe (column “Crisp”) that the prediction is worse than the
baseline one.
The accuracy of energy prediction can be further improved
by taking into account the fuzzy classification provided by the
FDT. The use of FDT with either min-max t-norms or
Lukasiewicz t-norms to aggregate the membership to the
vertices on paths from the root to the leaves (see [15]) provides
an important improvement of the prediction. The min-max
weighting scheme provides excellent results, reaching a 30%
improvement compared to the baseline, with an average energy
difference of 106 W-hr. Good results are also obtained by
means of the Lukasiewicz weighting scheme, which provides a
26% improvement compared to the baseline, with an average
energy difference of 112 W-hr.
¹ “Lower than 20” is a fuzzy set deduced automatically during the construction
of the FDT. It is a piecewise linear membership function with a support
equal to (-∞, 21] and a kernel equal to (-∞, 19].
² “Greater than 9” is a fuzzy set deduced automatically during the construction
of the FDT. It is a piecewise linear membership function with a support
equal to [7, +∞) and a kernel equal to [11, +∞).
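The two fuzzy sets described in the footnotes can be written as piecewise linear functions: full membership on the kernel, zero membership outside the support, and a linear ramp in between. The function names are hypothetical; the breakpoints are those given in the footnotes.

```python
def lower_than_20(x):
    """Membership in "lower than 20": kernel (-inf, 19], support (-inf, 21]."""
    if x <= 19.0:
        return 1.0
    if x >= 21.0:
        return 0.0
    return (21.0 - x) / 2.0    # linear ramp from 1 at 19 down to 0 at 21


def greater_than_9(x):
    """Membership in "greater than 9": kernel [11, +inf), support [7, +inf)."""
    if x <= 7.0:
        return 0.0
    if x >= 11.0:
        return 1.0
    return (x - 7.0) / 4.0     # linear ramp from 0 at 7 up to 1 at 11
```

Both ramps cross the degree 0.5 exactly at the value that names the set (20 and 9), so the alpha-cut of level 0.5 recovers the crisp thresholds used when the tree is applied in crisp mode.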
TABLE III. AVERAGE ENERGY DIFFERENCE BETWEEN THE DIFFERENT
PREDICTION MODELS, COMPARED TO BASELINE (BEST CONSTANT PREDICTION)
Prediction model                          Average energy difference (watt-hr)   Compared to baseline
Best constant (baseline)                  152                                   --
Today equals tomorrow                     170                                   worse
Improved weather forecast                 134                                   -12%
Fuzzy decision tree, crisp                223                                   worse
Fuzzy decision tree, min-max              106                                   -30%
Fuzzy decision tree, Lukasiewicz norms    112                                   -26%
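The percentages in the last column of Table III follow from the average energy differences by a relative change with respect to the baseline. The short check below recomputes them; the dictionary keys are descriptive labels chosen here, and the numbers are those of the table.

```python
BASELINE_WH = 152   # best constant prediction, from Table III

AVERAGE_DIFFERENCE_WH = {
    "Today equals tomorrow": 170,
    "Improved weather forecast": 134,
    "FDT crisp": 223,
    "FDT min-max": 106,
    "FDT Lukasiewicz": 112,
}


def relative_change_percent(difference_wh, baseline_wh=BASELINE_WH):
    """Relative change of an average energy difference versus the baseline."""
    return 100.0 * (difference_wh - baseline_wh) / baseline_wh


changes = {name: relative_change_percent(value)
           for name, value in AVERAGE_DIFFERENCE_WH.items()}
```

Negative values are improvements over the baseline; positive values correspond to the entries marked “worse” in the table.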
VII. CONCLUSIONS AND FUTURE WORK
The use of the weather forecast service allows improving
the energy production prediction. It improves not only
on any fixed prediction (based on average of other
studies), but also on a naïve sequential approach.
Since the weather forecast is wrong forty percent of the
time, based on the predictor's own categories, it is necessary
either to manually add coherence by realigning the fuzzy
categories or to use a machine learning algorithm (as here the
fuzzy decision trees) to automatically discover the underlying
rules. These rules can be used in a second step to set up efficient
controllers, as for instance fuzzy Takagi-Sugeno ones. But it is
important to point out that without such a study, any controller
would perform poorly, due to the complex relationship existing
between weather class, weather forecast and energy production.
Future work should focus on comparing the
performance of this approach with other regression algorithms,
such as neural networks, although the problem of the
symbolic weather classes remains a challenge. The interest will
be not only to compare the performance with a dedicated
black box, but also to address the challenge of
incorporating knowledge in these types of systems, improving
the overall performance. Another possibility is to test
prediction techniques that include temporal evolution, as for
example Markov models. Improved prediction models could
take advantage of available data sources not incorporated into
this first attempt at analysis, e.g., the recorded images of the
sky and the locally measured temperature, which would allow
correcting national versus local measurement bias.
Other future work might focus on more complex but more
practical setups, for instance sun-tracking panels or
integration with storage batteries. We believe that sun tracking
will not dramatically change the conclusions of this work, though
of course it will improve absolute collection efficiency. Storage
batteries are obviously advantageous in that they give the
system designer control over several time scales that are
otherwise only in nature's hands, but with these additional
handles comes additional complexity and uncertainty.
REFERENCES
[1] Yahoo! Weather RSS Feed [Online]. Available:
http://developer.yahoo.com/weather/ (accessed: 2010, Jan)
[2] Dropbox Documentation [Online]. Available:
https://www.dropbox.com/about (accessed: 2010, Jan)
[3] Arduino Duemilanove Datasheet [Online]. Available:
http://www.arduino.cc/en/Main/ArduinoBoardDuemilanove (accessed:
2010, Jan)
[4] Peder Bacher, Henrik Madsen, Henrik Aalborg Nielsen, “Online
short-term solar power forecasting”, Informatics and Mathematical
Modelling, Richard Pedersens Plads, Technical University of Denmark,
Denmark, 22 May 2009.
[5] Lin Phyo Naing, Srinivasan, D., “Estimation of solar power generating
capacity”, IEEE 11th International Conference on Probabilistic Methods
Applied to Power Systems (PMAPS), 14-17 June 2010, Singapore
[6] Hong-Tzer Yang, Jian-Tang Liao, Xiang-He Su, “A fuzzy-rule based
power restoration approach for a distribution system with renewable
energies”, FUZZ-IEEE 2011: 2448-2453
[7] Davide Caputo, Francesco Grimaccia, Marco Mussetta, Riccardo Enrico
Zich, “Photovoltaic plants predictive model by means of ANN trained
by a hybrid evolutionary algorithm”, IJCNN 2010: 1-6
[8] Francesco Grimaccia, Marco Mussetta, Riccardo Enrico Zich,
“Neuro-fuzzy predictive model for PV energy production based on
weather forecast”, FUZZ-IEEE 2011: 2454-2457
[9] Irwan Purnama, Yu-Kang Lo, Huang-Jen Chiu, “A fuzzy control
maximum power point tracking photovoltaic system”, FUZZ-IEEE
2011: 2432-2439
[10] Esram, T., Chapman, P.L., "Comparison of Photovoltaic Array
Maximum Power Point Tracking Techniques", IEEE Transactions on
Energy Conversion, Vol. 22 (2) pp. 439-449, 2007
[11] Chung-Yuen Won, Duk-Heon Kim, Sei-Chan Kim, Won-Sam Kim and
Hack-Sung Kim, "A new maximum power point tracker of photovoltaic
arrays using fuzzy controller", 25th Annual IEEE Power Electronics
Specialists Conf. (PESC'94), pp. 396-403, Taipei, Taiwan, Jun 1994.
[12] Kyohei Kurohane, Tomonobu Senjyu, Atsushi Yona, Naomitsu Urasaki,
Tomonori Goya, Toshihisa Funabashi: A Hybrid Smart AC/DC Power
System. IEEE Trans. Smart Grid 1(2): 199-204 (2010)
[13] Yuan, Y. & Shaw, M. Induction of Fuzzy Decision Trees. Fuzzy Sets and
Systems, 1995, 69, 125-139.
[14] Janikow, C. Z. Fuzzy Decision Trees: Issues and Methods. IEEE
Transactions on Systems, Man and Cybernetics, 1998, 28, 1-14.
[15] Marsala, C. & Bouchon-Meunier, B. An Adaptable System to Construct
Fuzzy Decision Trees. Proc. of the NAFIPS'99, 1999, 223-227.
[16] Quinlan, J. R. Induction of Decision Trees. Machine Learning, 1986, 1,
81-106.
[17] Breiman, L.; Friedman, J.; Olshen, R. & Stone, C. Classification And
Regression Trees. Chapman and Hall, 1984.
[18] Marsala, C. & Bouchon-Meunier, B. Ranking Attributes to Build Fuzzy
Decision Trees: a Comparative Study of Measures. IEEE World
Congress on Computational Intelligence, 2006, 1777-1783.
[19] Marsala, C. Gradual Fuzzy Decision Trees to Help Medical Diagnosis.
IEEE World Congress on Computational Intelligence, 2012, Brisbane,
Australia, June 2012 (to appear)