Weather-based solar energy prediction

June 2012
IEEE International Conference on Fuzzy Systems

DOI:10.1109/FUZZ-IEEE.2012.6251145

Conference: Fuzzy Systems (FUZZ-IEEE), 2012 IEEE International Conference on

Authors:

Marcin Detyniecki

AXA GROUP

Ashwati Krishnan

Carnegie Mellon University

Mel Siegel

Carnegie Mellon University

Photovoltaic solar panels are effective energy sources during periods of bright sunlight. Excess energy can be stored for later use at night or on cloudy days. The decision to use the stored energy now or later depends largely on being able to predict the weather on different timescales. Short term prediction of stored energy is challenging due to the non-trivial I-V characteristic of the solar cell. The erratic nature of the weather makes long term predictive energy management difficult. In this paper, we address these issues based on data collected from a solar panel, as well as its relationship to observations made of the weather. We observe that prediction, based on fuzzy decision trees, reduces the energy error by 22% compared to a constant prediction equal to the average on the studied period. Thus, exploiting the fuzzy classification provided by a fuzzy decision tree is a good improvement compared to the baseline.

Weather-Based Solar Energy Prediction

Marcin Detyniecki

Université Pierre et

Marie Curie

LIP6 – UPMC – CNRS

4 place Jussieu, 75005

Paris, France

Marcin.Detyniecki@lip6.fr

Christophe Marsala

Université Pierre et

Marie Curie

LIP6 – UPMC

4 place Jussieu, 75005

Paris, France

Christophe.Marsala@lip6.fr

Ashwati Krishnan

Dept. of Elect. & Comp.

Engineering (ECE)

Carnegie Mellon

University

Pittsburgh PA USA

ashwatik@andrew.cmu.edu

Mel Siegel

The Robotics Institute

Carnegie Mellon

University

Pittsburgh PA USA

mws@cmu.edu

Abstract— Photovoltaic solar panels are effective energy sources

during periods of bright sunlight. Excess energy can be stored for

later use at night or on cloudy days. The decision to use the

stored energy now or later depends largely on being able to

predict the weather on different timescales. Short term

prediction of stored energy is challenging due to the non-trivial I-

V characteristic of the solar cell. The erratic nature of the

weather makes long term predictive energy management

difficult. In this paper, we address these issues based on data

collected from a solar panel, as well as its relationship to

observations made of the weather. We observe that prediction,

based on fuzzy decision trees, reduces the energy error by 22%

compared to a constant prediction equal to the average on the

studied period. Thus, exploiting the fuzzy classification provided

by a fuzzy decision tree is a good improvement compared to the

baseline.

Keywords: solar energy; photovoltaic; power utilization

planning; weather; energy prediction; fuzzy decision trees.

I. INTRODUCTION

The availability of solar energy is not guaranteed at any

particular place or time: it depends, of course, on time-of-day,

but also on the weather conditions that prevail and that

prevailed recently. Since meteorological agencies provide

detailed weather forecasts round-the-clock, we should be able

to use their predictions to our advantage in planning activities

that require solar energy. Three interesting questions are

apparent: (1) given the standard weather forecasts available

today, can we reliably predict the energy we will be able to

capture tomorrow? (2) given our own measured actual

insolation and other local weather conditions right now, to

what extent can we make that prediction? and (3) what is the

optimal approach to fusion of these two prediction sources?

The answers to these questions are not only of academic

interest but also of crucial practical importance [6].

Contemporary solar panels are series-arrays of silicon

photovoltaic cells that are essentially large-area silicon pn-

junctions. Incident optical photons promote electrons from the

valence to the conduction band. The band-gap voltage across

the junction capacitance thus has the potential to drive DC

current through an external load. The cell's open-circuit voltage

is essentially the band-gap. Its short-circuit current depends -

not necessarily simply - on the incident optical power. Their

ratio is the internal impedance to which an external load must

be matched to achieve maximum energy transfer to the load.

Optimally extracting short-term power and long-term

energy is thus a complicated business that requires active real-

time control intelligently based on knowledge of present

requirements and an ability to predict and plan for future

requirements [7], [8].

The main goal of this paper is to show how a fuzzy prediction method, and in particular Fuzzy Decision Trees (FDTs), can improve energy prediction accuracy. For this early attempt at estimating the energy gain under real conditions we have chosen FDTs, in contrast to other approaches such as neural networks or other regression techniques, because FDTs produce human-understandable rules that will allow us, in the future, to improve the system. In fact, not only are the relevant variables automatically identified, but so are their interactions. Moreover, FDTs have the advantage of handling symbolic attributes (here weather classes such as cloudy, sunny, thunderstorm) and numerical ones (such as temperature) simultaneously.

In order to achieve real conditions we used a standard solar

panel for home use, described in Section II. We placed the

panel in real conditions and collected I-V data with a dedicated

electronic apparatus and weather conditions and forecast from

the national service over the Internet, as presented in Section

III. In the following section we briefly present the Fuzzy

Decision Trees and how training and testing was performed.

Sections V and VI are dedicated, respectively, to data and

results analysis.

II. SOLAR PANELS

Solar cells are connected in series to build solar modules or

panels. Panels generally consist of 28 to 36 cells in series to

produce 12VDC under defined illumination conditions. An

ideal solar panel current-voltage (I-V) curve is shown in Figure

1.1. For any real panel there is a continuous family of these

curves wherein open-circuit voltage increases with illumination

level and current-droop increases with decreasing illumination.

Thus optimum transfer of solar power to an external load

requires matching the load impedance to the illumination level.

Figure 1.2 shows a family of I-V curves for our solar panel

collected during a 5-hour period when insolation was changing.

Notice especially the variations in curve scale and shape, and,

based on the teaching of Figure 1.1, the consequent variation of

available power and optimum load to extract it.

Figure 1.1: Ideal solar power panel. Isc is the short circuit current (when

load resistance RL=0) and VOC is the open circuit voltage (when RL = ∞).

The black curve is the I-V characteristic, the gray curve is the power

available to an external load (the IV product), and the dashed-blue load

line finds the operating point (V,I) - marked by the red cross - at which the

load will extract maximum possible power from the panel.

A. Getting the maximum power out of a panel (MPPT)

Consider a system where the load is connected directly

across the solar panel. Its maximum power point (MPP) is the

point on the I-V curve where the IV product (the area of the

rectangle under the operating point) is

maximum, as shown in Figure 1.1. For optimal simplicity and

efficiency one should choose a solar panel that perfectly

matches the intended load. But this is not possible: the I-V

curve - hence the MPP - changes with illumination. It also

changes with panel temperature, which also depends in part on

illumination. Active measuring and switching power

converters, called maximum power point trackers (MPPT), can

switch the load so as to keep the operating point at the MPP.

Several solutions, in particular based on fuzzy control, have

been proposed [11] and are still under investigation [9]. A

complete comparison can be found in [10].

On the one hand this is simple, on the other hand it is

daunting. If all we want to do is, say, toast bread, then it is easy

enough to switch the resistance of the heating element; the

toasting time changes with illumination level, but within

reasonable limits we still make toast. But for the vast majority

of practical loads - "appliances" - it is impossible to flexibly

and efficiently trade off voltage rating and current demand. We

thus anticipate a critical near-future demand for active power

converters that will accommodate a plausible range of

fluctuating DC input voltages and deliver stable standard DC or

AC output voltages without incurring unacceptable losses [12].

Note that the control algorithm required for MPPT is non-

trivial. The MPP is not known a priori, and it moves with

variations in illumination and temperature. In practice perturb-

and-observe (P&O) algorithms are employed [10], despite the

objection-in-principle that when the system is actually

optimized any perturbation is guaranteed to reduce efficiency.

Clearly the scale of the integral term in the control algorithm is

crucial, and should itself be dynamic, as the system needs on

the one hand to respond rapidly to fast changes in illumination

level, e.g., passing clouds, and on the other hand it must not

spend too much efficiency hunting when conditions are

changing only slowly, e.g., on cloudless days.
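The perturb-and-observe idea described above can be illustrated with a minimal sketch. The power curve `panel_power`, its parameters (`v_oc`, `i_sc`, `v_knee`) and the fixed step size are assumptions chosen for illustration; they are not measurements from the panel studied here, and real trackers typically adapt the step.

```python
import math

def panel_power(v, v_oc=20.0, i_sc=1.0, v_knee=1.5):
    """Toy power curve (assumed model, not the measured panel)."""
    if v <= 0.0 or v >= v_oc:
        return 0.0
    current = i_sc * (1.0 - math.exp((v - v_oc) / v_knee))
    return v * current

def perturb_and_observe(power, v_start, step=0.2, n_steps=200):
    """Step the operating voltage; reverse direction whenever power drops."""
    v, p = v_start, power(v_start)
    direction = 1.0
    for _ in range(n_steps):
        v_next = v + direction * step
        p_next = power(v_next)
        if p_next < p:
            direction = -direction  # we stepped past the maximum
        v, p = v_next, p_next
    return v, p

v_found, p_found = perturb_and_observe(panel_power, v_start=5.0)
```

Once the tracker reaches the maximum it keeps stepping back and forth around it, which is the efficiency cost of perturbation mentioned in the text.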

Figure 1.2: I-V of our solar panel at four times on a typical day. Load

resistors between 5 and 155 ohms in 5 ohm steps are switched in random

order across the panel, voltage across the load is measured and current is

calculated from voltage and resistance. Open-circuit voltage is also

measured and recorded as the zero-current value. A diagrammatic

representation of the setup is shown and explained in Figure 3.1.

III. SOLAR ENERGY UNDER REAL WEATHER CONDITIONS

The National Renewable Energy Laboratory recommends

that solar panels be characterized under standard test conditions

(STC): temperature 25 °C and illumination 1000 W/m2 (1.0 sun)

with an air mass 1.5 (AM1.5) filtered solar spectrum. The idea

is to match the illumination and spectrum of sunlight incident

on a clear day on a sun-facing 37°-tilted surface with the sun at

an angle of 41.81° above the horizon. This condition - with the

panel aimed directly at the sun - geometrically approximates

solar noon near the spring and autumn equinoxes in the

continental United States. However insolation at the earth's

surface is rarely as large as the prescribed 1000 W/m2. And, as

already noted, to realistically study electrical energy generation

under realistic weather conditions, realistic fluctuations in

lighting and temperature must be observed. Note also that a

panel that is optimal in the NREL environment is almost

certainly suboptimal in any natural environment. So to study

solar energy production with practical goals under natural

weather conditions it is advisable to combine the solar panel

with an MPPT. But there are many such commercial devices,

each one running some undisclosed proprietary algorithm, none

of them arguably best or even in any sense standard. Thus we

elect to organize our measurements in a way that allows us to

simulate an ideal MPPT algorithm – that is collect all possible

data first and compute after the fact the real optimal point.
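As a quick check of the geometry quoted above, the sketch below relates the solar elevation angle to the air mass using the simple plane-parallel approximation AM = 1 / sin(elevation). The function name `air_mass` is ours, and the approximation ignores atmospheric curvature and refraction.

```python
import math

def air_mass(elevation_deg):
    """Plane-parallel approximation: AM = 1 / sin(solar elevation)."""
    return 1.0 / math.sin(math.radians(elevation_deg))

am_at_stc_angle = air_mass(41.81)  # approximately 1.5, i.e. AM1.5
```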

Figure 3.1: Hardware diagram. From left to right: solar panel, serial load resistor array in series with disconnect-relay (to measure panel's open-circuit voltage), transistor-buffered resistor-shorting relays, voltage divider, Arduino (performing analog voltage input measurement and controlling load-resistor-shorting relays).

A. Instrumentation

We studied the response of our solar panel - approximately

32 cm x 60 cm, so approximately 0.19 m2 - using a simple

single-board data acquisition system in communication with a

dedicated laptop computer that is in turn in communication

with the internet. Our panel is an off-the-shelf unit mounted at

a tilt-angle of approximately 40° outside an approximately

south-facing window with a reasonably clear view of the sun's

path most of the day, most of the year. The panel's pointing and

tilting are probably never perfectly optimal, but are a good

compromise that receives better-than-average solar radiation

throughout the year. Data acquisition and control are provided

by an Arduino Duemilanove (2009) [3], a low-cost easy-to-

program open-design board that provides convenient access to

the ATMega168 microcontroller's digital I/O, 10-bit analog

input, PWM output, and serial communication pins. A program

written in a C-like language using a simple API on a PC is

more-or-less invisibly compiled and downloaded via a USB

channel on which data are subsequently also returned. Digital

output pins are transistor-buffered and diode-protected to safely

switch the coils of relays that short-out a series-array of {5, 10,

20, 40, 80} ohm power resistors to provide 5 to 155 ohm load

in 5 ohm steps - plus open-circuit - across the solar panel. A

measurement sequence is initiated and recorded every 10

minutes. Independently but also every 10 minutes, a USB

webcam captures a sky picture. The data files and sky pictures

are stored "in the cloud" using Dropbox [2]. As a practical

matter, the ATMega168's ADC's rudimentary analog input

circuitry and 10-bit resolution do not provide precise or

accurate measurements. But they do appear to be stable, which

is all that is really required for the present experiments,

wherein we are interested primarily in reaching qualitative

conclusions. Of course, since the measurements do seem to be

stable, after-the-fact calibration can be undertaken if

subsequently it seems valuable.
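The binary-weighted resistor array described above can be sketched as follows. The bit convention (bit k set means resistor k is left in the circuit, i.e. its relay is not shorting it) and all function names are assumptions for illustration; the actual pin mapping of the apparatus is not given in the text.

```python
import random

RESISTORS_OHM = (5, 10, 20, 40, 80)  # series array from the text

def load_resistance(code):
    """Total series resistance for a 5-bit relay code (assumed bit convention)."""
    return sum(r for k, r in enumerate(RESISTORS_OHM) if (code >> k) & 1)

def sweep_order(rng):
    """Codes 1..31 in random order; code 0 (all resistors shorted) is skipped."""
    codes = list(range(1, 32))
    rng.shuffle(codes)
    return codes

def current_from_voltage(v_load, r_load):
    """Current is not measured directly; it follows from Ohm's law."""
    return v_load / r_load

loads = [load_resistance(c) for c in sweep_order(random.Random(0))]
```

The 31 non-zero codes cover 5 to 155 ohm in 5 ohm steps, matching the range stated in the text; the open-circuit point is taken separately with the disconnect relay.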

B. Weather Forecast

The solar panel and its Arduino-plus-Windows-laptop

based monitoring system are located at an off-campus location,

which is secure and has a good south-looking view with a large

open-sky solid-angle. On campus another Windows PC that has

reliable access to the Internet periodically downloads present

and predicted weather information from the National Oceanic

and Atmospheric Administration (NOAA) through the Yahoo!

Weather RSS Feed [1], in the form of XML files. Since

weather conditions tend to vary slowly, we recorded the

weather conditions every hour, every day. In order to be able to

match forecast with current condition, we used the 48 standard

categories provided by the weather service. To minimize the

prediction error, we choose here to use the forecast just before

sunrise. Other more complicated methods could take into

account the evolution or tendency of the forecast.

C. Data aggregation

The question of how to aggregate the data may seem simple

at first sight, but it is in fact extremely complex. We choose to

work on a one-full-day basis, because it provides a natural,

regular cycle. Further work could deal with energy prediction

with a shorter or a longer time horizon. Hence to compute the

energy produced by the panel over one day we need to start

from the power measurement obtained every 10 minutes. Our

first step consists of choosing from each of these series the

maximum power. In this way we simulate an ideal MPPT.

Then under the assumption that everything remains equal for

the following ten minutes we integrate over the whole day to

obtain the total energy produced. The assumption introduces an

error for quickly changing conditions (as for instance a sunny

day with some clouds). In fact, the measurement could have

been done when the cloud is just over the panel. We believe

that the introduced error averages out because of the frequency

and the uniform nature of the sampling. In fact, if there are a lot

of clouds, more often than not the measurement will be done

under reduced illumination approximately proportional to the

average coverage. Since weather conditions fluctuate during

the day, to obtain a global “for the day” weather classification,

we choose to aggregate by majority vote all the classifications

of the National Weather Service reported during the daylight

hours of that day. In other words, we choose to label the day

based on the most frequent NWS classification; and we focus

our attention only on the hours when there should be light

(between sunrise and sunset). So, if it rains for only one hour

during the day and it was, for the rest, a sunny day, it is labeled

as a sunny day (notice that this is not the case for weather

services). Although the solar panel data, the sky pictures, and

the downloaded weather data are not perfectly synchronized,

for the purpose and nature of the experiments described their

imprecise - and occasionally inconsistent - alignment is

inconsequential.
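The aggregation steps above can be sketched as follows, assuming each 10-minute sweep is stored as a list of (load voltage, load resistance) pairs and each daylight hour carries one weather label. The data layout and function names are hypothetical, and the tie-breaking rule of the majority vote is not specified in the text.

```python
from collections import Counter

SAMPLE_PERIOD_H = 10 / 60  # one measurement sweep every 10 minutes

def max_power_w(sweep):
    """Best operating point of one sweep; power in the load is V^2 / R."""
    return max(v * v / r for v, r in sweep)

def daily_energy_wh(sweeps):
    """Ideal-MPPT energy: each sweep's maximum power is held for 10 minutes."""
    return sum(max_power_w(s) for s in sweeps) * SAMPLE_PERIOD_H

def majority_label(daylight_labels):
    """Most frequent weather label reported between sunrise and sunset."""
    return Counter(daylight_labels).most_common(1)[0][0]

day_label = majority_label(["fair"] * 9 + ["showers"])  # one rainy hour only
```

With one rainy hour out of ten, the day is still labeled "fair", as in the example given in the text.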

IV. ENERGY AND WEATHER PREDICTION

Based on the data described above, the challenge is to predict, before the sun rises, the energy that will be produced during that day. All methods can be grouped in two large families: the direct ones, where the energy value is computed by a “black-box” algorithm (usually regression-like, as for instance neural networks [7], [8]), and the indirect ones, where a weather class is predicted first and an average energy is then predicted from it.

In this paper, we choose to explore the performance of the

latter. This approach allows using the power of the national

weather forecast services, without any further modifications, to

predict the energy. The general formulation has the advantage

of opening the range of possible algorithms that can be used. In

particular, we choose here to use Fuzzy Decision Trees, which

are not only able to deal with symbolic and numerical classes

simultaneously, but also provide an explanation for the

prediction.

A. Fuzzy Decision Trees

Fuzzy decision trees (FDTs) are an extension of classical

decision trees. They have been introduced in Machine learning

to handle training sets that contain numerical and/or fuzzy

values [13] [14] [15]. Moreover, such trees introduced a soft

classification of examples that leads to a smoother decision.

Thus, degrees of decision and degrees of membership to

classes are provided as a result of a classification by means of

the FDTs.

The construction of a FDT from a training set $T = \{e_1, \ldots, e_n\}$ is based on the well-known ID3 [16] or CART [17] algorithms. A fuzzy decision tree is built from its root to its leaves by successively partitioning $T$ into subsets. Each partition is obtained from a comparison on the values of a selected attribute; this comparison makes up a node of the tree.

Let each example $e_i$ from $T$ be described by means of a set of values for the attributes $A = \{A_1, \ldots, A_m\}$, where each attribute $A_j$ can take a fuzzy, numerical, or symbolic value $v_{jl}$ in the set $\{v_{j1}, \ldots, v_{jm}\}$. An example's description is an $m$-tuple of attribute-value pairs $(A_j, v_{jl})$. Each description is associated with a class $c_k$ from $C = \{c_1, \ldots, c_K\}$ to make up the training example $e_i$. A fuzzy value $v_{jl}$ is associated with a membership function $\mu_{v_{jl}}$ on $T$ that associates with each $e_i$ of $T$ its degree of having the value $v_{jl}$. Similarly, each $c_k$ is supposed to be associated with a membership function $\mu_{c_k}$.

At each step of the construction of the FDT, an attribute is

selected by means of a measure of discrimination, for instance,

the well-known Shannon entropy from Information theory [16],

[17], that orders the attributes according to their increasing

correlation to the C in the local training subset. The

discrimination power of each attribute is valued with regard to

the classes [18]. The attribute with the highest discriminating

power is selected to construct a node. Well-known fuzzy

measures of discrimination are the fuzzy entropy (that is an

extension of the Shannon entropy to fuzzy events) [15], and the

measure of ambiguity [13]. A new measure, the gradual

discrimination measure, has been introduced in [19]. This

measure is interesting in our case because it values the

discrimination power of the values of an attribute with regard

to the values of the class and takes into account a monotonic

relation between these values if one exists (see [19] for a full

explanation of that measure).
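To make the attribute-selection step concrete, here is a minimal sketch of the crisp case only: Shannon entropy and information gain as used in ID3, on symbolic attributes. It is not the fuzzy entropy or the gradual discrimination measure used in this work, and the attribute names and toy data are invented for illustration.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels):
    """Entropy reduction obtained by splitting on one symbolic attribute."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[value].append(label)
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def best_attribute(table, labels):
    """table maps attribute name -> list of values, one per example."""
    return max(table, key=lambda a: information_gain(table[a], labels))

toy_labels = ["high", "high", "low", "low"]
toy_table = {"weather": ["fair", "fair", "cloudy", "cloudy"],
             "weekday": ["mon", "tue", "mon", "tue"]}
chosen = best_attribute(toy_table, toy_labels)
```

In this toy set "weather" separates the classes perfectly while "weekday" carries no information, so "weather" is selected for the node.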

The aim of a FDT is to classify any forthcoming example, not necessarily present in T. To classify an example e, paths in the FDT are followed from the root to the leaves of the tree, according to the values of the attributes in the description of e. At each node of a path, a membership degree for e is computed from the value of e for the attribute present in the node and the fuzzy values that label the vertices going out of that node. On a path, all the membership degrees computed from the root to the leaf are aggregated by a conjunctive operator (typically, a t-norm). The membership degrees for e obtained for all the leaves of the FDT are then aggregated by a disjunctive operator (typically, a t-conorm). This yields a membership degree of e to each class c according to the FDT. Various pairs of t-norms and t-conorms can be used to aggregate the membership degrees. The most classical ones are Zadeh's operators (minimum, maximum) and the Lukasiewicz operators. More details can be found in [15]. A FDT can also be used as a crisp decision tree: the alpha-cuts of level 0.5 of each fuzzy membership function are used to replace the fuzzy sets. Such crisp use of a FDT enables the tree to produce a single, non-fuzzy class as the result of the classification of an example.
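The classification procedure above can be sketched as follows, assuming the tree has already been flattened into its root-to-leaf paths and that the membership degrees of the example at each node are known. The data layout, function names and degrees are illustrative assumptions; this is not the Salammbô implementation.

```python
from functools import reduce

def t_min(a, b):            # Zadeh t-norm
    return min(a, b)

def s_max(a, b):            # Zadeh t-conorm
    return max(a, b)

def t_lukasiewicz(a, b):    # Lukasiewicz t-norm
    return max(0.0, a + b - 1.0)

def s_lukasiewicz(a, b):    # Lukasiewicz t-conorm
    return min(1.0, a + b)

def alpha_cut(degree, level=0.5):
    """Crisp use of the tree: a degree becomes 1 if it reaches the level."""
    return 1.0 if degree >= level else 0.0

def classify(paths, t_norm, t_conorm):
    """paths: list of (node degrees along the path, class label of the leaf)."""
    result = {}
    for degrees, label in paths:
        path_degree = reduce(t_norm, degrees)      # conjunction along a path
        result[label] = t_conorm(result.get(label, 0.0), path_degree)
    return result

example_paths = [([0.8, 0.6], 2), ([0.2, 0.9], 1), ([0.8, 0.4], 1)]
zadeh = classify(example_paths, t_min, s_max)
lukasiewicz = classify(example_paths, t_lukasiewicz, s_lukasiewicz)
```

Starting each class at 0.0 is safe because 0 is the neutral element of both t-conorms shown.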

B. Baseline prediction

In order to measure the improvement obtained by our

method, we need to define a distance measure and a baseline.

To assess the extent to which we can predict the energy

production of a solar panel, we calculate the average of the

absolute values of the differences between the predicted energy

and the observed energy for the proposed models.

To enrich the analysis we propose three baselines:

• Constant average prediction: we assume that the average energy for a region and for a period of time can be perfectly predicted, but is constant over the whole period. To achieve this we compute, after the fact, the average energy observed during the whole period. Notice that this is an ideal value that cannot be obtained under real prediction conditions. Any other constant prediction is expected to increase the proposed energy distance (strictly, it is the median rather than the mean that minimizes an average absolute difference, so this holds only approximately).

• Energy tomorrow equals the energy produced today: this is a standard method used for time series and in particular in weather forecasting.

• Pure weather forecast based prediction: we propose to

use the weather forecast as the predicted energy class.

This approach corresponds to the natural way we

would address the problem: “If today is going to be

sunny and on a sunny day we produce on average

energy E then today we should observe energy E.”
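The distance measure and the three baselines can be sketched as below. The function names are ours and the four-day energy series is invented for illustration; only the per-class averages (449.6 and 70.2 Watt-hr) are taken from Table I.

```python
def mean_abs_error(predicted, observed):
    """Average of the absolute differences between prediction and observation."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)

def constant_average(observed):
    """After-the-fact mean of the period, repeated for every day."""
    mean = sum(observed) / len(observed)
    return [mean] * len(observed)

def class_mean(forecast_labels, mean_energy_by_class):
    """Pure weather-forecast prediction: average energy of the forecast class."""
    return [mean_energy_by_class[c] for c in forecast_labels]

observed = [400.0, 100.0, 400.0, 100.0]           # invented daily energies
labels = ["fair", "showers", "fair", "showers"]   # invented forecasts
means = {"fair": 449.6, "showers": 70.2}          # averages from Table I

mae_constant = mean_abs_error(constant_average(observed), observed)
# "Tomorrow equals today": day i is predicted by day i-1, so the first
# day has no prediction and is left out of the comparison.
mae_persistence = mean_abs_error(observed[:-1], observed[1:])
mae_forecast = mean_abs_error(class_mean(labels, means), observed)
```

On a series that alternates from day to day, the persistence baseline does worse than the constant one, which is the behaviour reported for the real data.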

TABLE I. ENERGY BASED ON OBSERVED CURRENT CONDITIONS

Majority Weather
(April - July 2010)     Nr of Days   Watt-hr   Std Deviation
Fair (day)                      38     449.6           119.4
Partly cloudy (day)             12     396.9           136.3
Mostly cloudy (day)              8     261.6           114.8
Cloudy                          14     149.1            96.2
Showers                          5      70.2           122.6
Globally                        77     342.6

V. DATA ANALYSIS

Between end of April and beginning of July 2010, we

collected data for 77 successive days. The average energy

produced per day was 342 watt-hour with standard deviation

178 W-h. Table I shows that roughly half the days are “fair”

and half are “cloudy” or “rainy”. As expected, “fair” days tend

to produce more energy than “partly cloudy”, which are better

than “mostly cloudy”, “cloudy”, and “shower” days in that

order. This conformity of semantic and energetic descriptions

gives us confidence that our model, and in particular the

majority aggregation process, are suitable. The variability of

the daily energy production is rather large, but more or less

constant for each category.

The accuracy of the weather prediction for the studied

period, using the standard set of categories, was 60% (of

correct prediction at sunrise for the day). This surprisingly

small proportion can be explained by two phenomena: aversion

to risk in the prediction and mismatch of categories. In Table

II, which shows the number of forecast weather conditions, we

can observe a shift towards an increased number of rainy days

(predictions). We observed 5 “shower” days, but 37 “showers”

or “thunderstorms” predictions. This discrepancy may come

from the aversion to risk of the weather forecaster. In fact, if it

should rain for only an hour in a day the weather forecast will be

“rainy day”. But our majority observation, suitable for the

energy prediction, would be sunny day, with consequent

category mismatch. Moreover, by comparing labels in Tables I

and II, we notice that the number and labeling of categories

differ in the two sets, thus more-or-less guaranteeing

mismatches. Some labels appear in only one of the two

sets. For instance there are no

“cloudy” predictions and no “sunny” observations. This reveals an

even more profound and structural problem: class boundaries

are fuzzy. In fact, if we predict “mostly cloudy” and we

observe “partly cloudy” it will be considered a mismatch. New

weather classes could be created by grouping labels, as for

instance “cloudy” with “partly cloudy” in an “overcast” class;

but preliminary work showed that the prediction accuracy does

not improve, because the descriptions then become too vague

or arbitrary.

TABLE II. ENERGY BASED ON FORECASTED CONDITIONS

Forecast at Sunrise
(April - July 2010)       Nr of Days   Watt-hr   Std Deviation
Sunny                             12     533.3            54.5
Fair (day)                        10     489.2            82.2
Partly cloudy (day)               16     381.5           151.5
Mostly cloudy (day)                2     239             103
Showers                            8      98.8           109.6
Isolated thunderstorms             7     380.3            79.7
Scattered thunderstorms           19     242.8           135.4
Thunderstorms                      3     146.7           134
Globally                          77     342.6            94.4

Improved weather forecast based prediction: To reduce the category mismatch described above, the forecast labels that can never match an observation (for instance, there is no “sunny” day observation) were manually matched to the closest observed class: sunny to fair, any type of thunderstorms to showers, etc.
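The manual matching can be written as a small lookup. Only the pairs named in the text are included; the rest of the authors' mapping is not given, so every other label is passed through unchanged, which is an assumption of this sketch. The dictionary and function names are hypothetical.

```python
FORECAST_TO_OBSERVED = {
    "sunny": "fair (day)",
    "isolated thunderstorms": "showers",
    "scattered thunderstorms": "showers",
    "thunderstorms": "showers",
}

def align_forecast(label):
    """Map a forecast label onto the closest observed label, if one is listed."""
    key = label.strip().lower()
    return FORECAST_TO_OBSERVED.get(key, key)

aligned = [align_forecast(x) for x in ("Sunny", "Scattered thunderstorms")]
```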

VI. RESULTS

Table III shows the energy prediction difference. By assuming that a solar panel produces more or less the same energy every day (constant prediction, baseline), we observe an average discrepancy of 152 W-hr compared with what is really observed. If we use the naïve model that assumes that tomorrow's energy is equal to what was observed today, the difference between predicted and observed energy increases to 170 W-hr. This indicates that the energy tends to change rather quickly from day to day and that the constant assumption is a good baseline that is not easy to beat.

If we focus our attention on the improved (with manual

match of fuzzy classes) method based only on the weather

forecast, we observe a reduction of 12% with respect to the

constant average estimation.

We used the fuzzy decision trees to predict the energy. In

this approach, we use the Salammbô software [15] to build a

FDT from the whole dataset. From a training set, the

Salammbô software provides us with a FDT with fuzzy set

values that label vertices going from a node associated with a

numerical attribute.

Numerical attributes are automatically discretized (as a

fuzzy partition) by means of the software, at each step of

selection of an attribute to build a node of the tree. Attributes to

build nodes of the FDT are selected by means of a

discrimination measure [18]. In this experiment, we use the

gradual discrimination measure introduced in [19]. The

predicted energy class has been discretized into 4 intervals, from

0 (0 to 180 W-hr) to 3 (greater than 500 W-hr). The classification of

an example by means of the FDT provides a set of membership

degrees, one for each interval that defines a class. In order to obtain

the predicted energy of the example, median values of each

interval weighted by the corresponding membership degrees

are aggregated to provide a predicted energy.
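A minimal sketch of this last step, assuming the aggregation is a membership-weighted average of one representative value per class. The interval bounds (0 to 180, 180 to 370, 370 to 500, above 500 W-hr) come from the text; the representative values, and in particular the value 550 chosen for the open-ended class 3, are assumptions, as is the normalization by the sum of the degrees.

```python
# Midpoints of the bounded intervals; 550.0 for class 3 is an assumed value.
CLASS_CENTRE_WH = {0: 90.0, 1: 275.0, 2: 435.0, 3: 550.0}

def predicted_energy_wh(memberships):
    """memberships maps class index -> membership degree given by the FDT."""
    total = sum(memberships.values())
    if total == 0.0:
        raise ValueError("no class has a non-zero membership degree")
    weighted = sum(CLASS_CENTRE_WH[c] * d for c, d in memberships.items())
    return weighted / total

estimate = predicted_energy_wh({1: 0.4, 2: 0.6})
```

An example belonging partly to class 1 and partly to class 2 thus receives an energy between the two class centres instead of jumping to one of them.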

The FDT constructed from the whole training set (77

examples) is composed of 38 paths, with a maximum of 7

nodes on a path, and an average number of 5.1 nodes on a

path. Some instances of paths are:

• If the majority weather at sunset is mostly cloudy, and if the temperature max is lower¹ than 20 then the predicted energy ranges from 370 to 500 (class 2).

• If the majority weather at sunset is cloudy or showers, and if the temperature min is greater² than 9 and the weather at sunrise is fair then the predicted energy ranges from 180 to 370 (class 1).

We recall that a path in a FDT is equivalent to a fuzzy rule: the premise of the rule is composed of the attribute values that pertain to the path, and the conclusion of the rule is the value of the class present in the leaf of the path.

We investigate the validity of this approach by means of a

leave-one-out experiment with the whole collected data set.

Results are presented in Table III.
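The leave-one-out protocol can be sketched generically as below. The learner used here (mean energy per weather label, with the overall mean as fallback) is only a stand-in chosen to keep the example short; it is not the fuzzy decision tree, and the four examples are invented.

```python
def fit_class_means(train):
    """Stand-in learner (assumed): mean energy per weather label."""
    sums, counts = {}, {}
    for label, energy in train:
        sums[label] = sums.get(label, 0.0) + energy
        counts[label] = counts.get(label, 0) + 1
    overall = sum(e for _, e in train) / len(train)
    return {k: sums[k] / counts[k] for k in sums}, overall

def predict_class_mean(model, label):
    means, overall = model
    return means.get(label, overall)  # unseen label: fall back to overall mean

def leave_one_out_mae(examples, fit, predict):
    """Train on all examples but one, test on the held-out one, then average."""
    errors = []
    for i, (x, y) in enumerate(examples):
        model = fit(examples[:i] + examples[i + 1:])
        errors.append(abs(predict(model, x) - y))
    return sum(errors) / len(errors)

toy = [("fair", 450.0), ("fair", 470.0), ("cloudy", 150.0), ("cloudy", 130.0)]
loo_error = leave_one_out_mae(toy, fit_class_means, predict_class_mean)
```

Any learner exposing the same `fit` and `predict` pair could be plugged into `leave_one_out_mae` in place of the stand-in.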

With a crisp use and a crisp output of the FDT, the FDT produces a single class as output. In that case, we can observe (column “Crisp”) that the prediction is worse than the baseline one.

The accuracy of energy prediction can be further improved by taking into account the fuzzy classification provided by the FDT. The use of the FDT with either min-max t-norms or Lukasiewicz t-norms to aggregate the membership degrees of the vertices on paths from the root to the leaves (see [15]) provides an important improvement of the prediction. The min-max weighting scheme provides excellent results, reaching a 30% improvement compared to the baseline, with an average energy difference of 106 W-hr. Good results are also obtained by means of the Lukasiewicz weighting scheme, which provides a 26% improvement compared to the baseline, with an average energy difference of 112 W-hr.

¹ “Lower than 20” is a fuzzy set deduced automatically during the construction of the FDT. It is a piecewise linear membership function with a support equal to (-∞, 21] and a kernel equal to (-∞, 19].

² “Greater than 9” is a fuzzy set deduced automatically during the construction of the FDT. It is a piecewise linear membership function with a support equal to [7, +∞) and a kernel equal to [11, +∞).
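The two fuzzy sets described in these footnotes can be written out directly from their kernels and supports. The function names are ours, and a straight line between the kernel and support bounds is what "piecewise linear" is taken to mean here.

```python
def lower_than_20(x):
    """Kernel (-inf, 19], support (-inf, 21]: 1 up to 19, falling to 0 at 21."""
    if x <= 19.0:
        return 1.0
    if x >= 21.0:
        return 0.0
    return (21.0 - x) / 2.0

def greater_than_9(x):
    """Kernel [11, +inf), support [7, +inf): 0 up to 7, rising to 1 at 11."""
    if x >= 11.0:
        return 1.0
    if x <= 7.0:
        return 0.0
    return (x - 7.0) / 4.0

degree_at_20 = lower_than_20(20.0)   # 0.5
degree_at_9 = greater_than_9(9.0)    # 0.5
```

The nominal thresholds 20 and 9 are exactly the points where the membership degree is 0.5, which is also where the alpha-cut of level 0.5 places the crisp boundary.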

TABLE III. AVERAGE ENERGY DIFFERENCE BETWEEN THE DIFFERENT PREDICTION MODELS, COMPARED TO BASELINE (BEST CONSTANT PREDICTION)

| Prediction model | Best constant (baseline) | Today equals tomorrow | Improved weather forecast | Fuzzy decision trees: crisp | Fuzzy decision trees: min-max | Fuzzy decision trees: Lukasiewicz norms |
| --- | --- | --- | --- | --- | --- | --- |
| Average energy difference (watt-hr) | 152 | 170 | 134 | 223 | 106 | 112 |
| Compared to baseline | -- | worse | -12% | worse | -30% | -26% |
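The "compared to baseline" row of Table III follows from the average energy differences by simple arithmetic, as the short check below shows. The dictionary keys are shortened labels chosen here; the numbers are those of the table.

```python
baseline = 152  # best constant prediction, in watt-hr

average_difference = {
    "today equals tomorrow": 170,
    "improved weather forecast": 134,
    "FDT crisp": 223,
    "FDT min-max": 106,
    "FDT Lukasiewicz": 112,
}


def relative_change(error, reference):
    """Change in percent relative to the reference; negative means a smaller error."""
    return 100.0 * (error - reference) / reference


changes = {name: relative_change(value, baseline)
           for name, value in average_difference.items()}

for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
```

Positive values correspond to the entries marked "worse" in the table.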

VII. CONCLUSIONS AND FUTURE WORK

Using the weather forecast service improves the prediction of energy production. It improves not only on any fixed prediction (based on the average of other studies), but also on a naïve sequential approach.

Since the weather forecast is wrong forty percent of the time (based on the predictor's own categories), it is necessary either to manually add coherence by realigning the fuzzy categories or to use a machine learning algorithm (here, fuzzy decision trees) to automatically discover the underlying rules. These rules can be used in a second step to set up efficient controllers, for instance fuzzy Takagi-Sugeno ones. It is important to point out, however, that without such a study any controller would perform poorly, due to the complex relationship between weather class, weather forecast and energy production.

Future work should focus on comparing the performance of this approach with other regression algorithms, such as neural networks, although the problem of the symbolic weather classes remains a challenge. The interest would be not only to compare the performance with a dedicated black box, but also to address the challenge of incorporating knowledge in these types of systems, improving the overall performance. Another possibility is to test prediction techniques that include temporal evolution, for example Markov models. Improved prediction models could take advantage of available data sources not incorporated into this first attempt at analysis, e.g., the recorded images of the sky and the locally measured temperature, which would allow correcting the national versus local measurement bias.

Other future work might focus on more complex but more practical setups, for instance sun-tracking panels, integration with storage batteries, etc. We believe that sun tracking will not dramatically change the conclusions of this work, though of course it will improve absolute collection efficiency. Storage batteries are obviously advantageous in that they give the system designer control over several time scales that are otherwise only in nature's hands, but with these additional handles come additional complexity and uncertainty.

REFERENCES

[1] Yahoo! Weather RSS Feed [Online]. Available: http://developer.yahoo.com/weather/ (accessed: 2010, Jan)

[2] Dropbox Documentation [Online]. Available: https://www.dropbox.com/about (accessed: 2010, Jan)

[3] Arduino Duemilanove Datasheet [Online]. Available: http://www.arduino.cc/en/Main/ArduinoBoardDuemilanove (accessed: 2010, Jan)

[4] Peder Bacher, Henrik Madsen, Henrik Aalborg Nielson, "Online short-term solar power forecasting", Informatics and Mathematical Modelling, Richard Pedersens Plads, Technical University of Denmark, Denmark, 22 May 2009.

[5] Lin Phyo Naing, Srinivasan, D., "Estimation of solar power generating capacity", IEEE 11th International Conference on Probabilistic Methods Applied to Power Systems (PMAPS), 14-17 June 2010, Singapore.

[6] Hong-Tzer Yang, Jian-Tang Liao, Xiang-He Su, "A fuzzy-rule based power restoration approach for a distribution system with renewable energies", FUZZ-IEEE 2011: 2448-2453.

[7] Davide Caputo, Francesco Grimaccia, Marco Mussetta, Riccardo Enrico Zich, "Photovoltaic plants predictive model by means of ANN trained by a hybrid evolutionary algorithm", IJCNN 2010: 1-6.

[8] Francesco Grimaccia, Marco Mussetta, Riccardo Enrico Zich, "Neuro-fuzzy predictive model for PV energy production based on weather forecast", FUZZ-IEEE 2011: 2454-2457.

[9] Irwan Purnama, Yu-Kang Lo, Huang-Jen Chiu, "A fuzzy control maximum power point tracking photovoltaic system", FUZZ-IEEE 2011: 2432-2439.

[10] Esram, T., Chapman, P.L., "Comparison of Photovoltaic Array Maximum Power Point Tracking Techniques", IEEE Transactions on Energy Conversion, Vol. 22 (2), pp. 439-449, 2007.

[11] Chung-Yuen Won, Duk-Heon Kim, Sei-Chan Kim, Won-Sam Kim and Hack-Sung Kim, "A new maximum power point tracker of photovoltaic arrays using fuzzy controller", 25th Annual IEEE Power Electronics Specialists Conf. (PESC'94), pp. 396-403, Taipei, Taiwan, Jun 1994.

[12] Kyohei Kurohane, Tomonobu Senjyu, Atsushi Yona, Naomitsu Urasaki, Tomonori Goya, Toshihisa Funabashi, "A Hybrid Smart AC/DC Power System", IEEE Trans. Smart Grid 1(2): 199-204 (2010).

[13] Yuan, Y. & Shaw, M. Induction of Fuzzy Decision Trees. Fuzzy Sets and Systems, 1995, 69, 125-139.

[14] Janikow, C. Z. Fuzzy Decision Trees: Issues and Methods. IEEE Transactions on Systems, Man and Cybernetics, 1998, 28, 1-14.

[15] Marsala, C. & Bouchon-Meunier, B. An Adaptable System to Construct Fuzzy Decision Trees. Proc. of the NAFIPS'99, 1999, 223-227.

[16] Quinlan, J. R. Induction of Decision Trees. Machine Learning, 1986, 1, 86-106.

[17] Breiman, L.; Friedman, J.; Olshen, R. & Stone, C. Classification And Regression Trees. Chapman and Hall, 1984.

[18] Marsala, C. & Bouchon-Meunier, B. Ranking Attributes to Build Fuzzy Decision Trees: a Comparative Study of Measures. IEEE World Congress on Computational Intelligence, 2006, 1777-1783.

[19] Marsala, C. Gradual Fuzzy Decision Trees to Help Medical Diagnosis. IEEE World Congress on Computational Intelligence, 2012, Brisbane, Australia, June 2012 (to appear).
