A Survey Review on Concept Drift

Varsh Patel1, Drashti Vashi2
1,2 (U. and P. U. Patel Department of Computer engineering, CSPIT, CHARUSAT,
Changa 388421, Gujarat, India. Email: hacvarsh11@gmail.com1 drashtivashi@gmail.com2)
——————— ———————————————————— ———————
Abstract: Changes in a continuously evolving data stream are termed concept drift. It is necessary to address the problems caused by concept drift and to adapt to concept changes. This can be achieved by designing supervised or unsupervised techniques in such a way that concept changes are taken into account and useful knowledge is extracted. Artificial and real datasets with different concept drifts, and the applications, are discussed briefly in the paper.
Keywords: Concept Drift detectors, data stream mining, bagging and ensemble classifiers.
——————— ———————————————————— ———————
I. Introduction
Presently, with modern technologies, every task is automated. A huge amount of data is generated every second. Examples of such applications include web mining, network monitoring, sensor networks, financial applications and telecommunications data management [14]. The data needs to be gathered and processed to extract unknown, useful and interesting knowledge. But it is impossible to extract that knowledge manually, owing to the volume and speed of the data gathered.
Concept drift occurs when the concept about which data is being collected shifts from time to time after a minimum stability period. This problem of concept drift needs to be taken into account to mine data with an acceptable accuracy level. Examples of concept drift include fraud detection, spam detection, customer preferences for e-marketing, and climate change prediction.
This paper is organized as follows. Section II gives an overview of concept drift, covering the issues, the need to adapt to concept drift and the types of concept drift. Section III explains different methods of detecting concept drift. Section IV discusses some statistical tests for concept drift. Section V gives an overview of available datasets based on the type of drift they contain and, finally, discusses different real-world applications of concept drift.
II. Overview
Issues of Concept Drift:
Concept drift has gained importance in machine learning as well as in data mining tasks. Today, data is organized in the form of data streams rather than static databases. Also, the concepts and data distributions may change over a long period of time.
Need for concept drift adaptation:
In dynamically changing or non-stationary environments, the data distribution can change over time, yielding the phenomenon of concept drift [4]. Concept drifts can be adapted to quickly by storing concept descriptions, so that they can be re-examined and reused later. Hence, adaptive learning is required to deal with data in non-stationary environments. When concept drift is detected, the current model needs to be updated to maintain accuracy.
Types of concept drift:
Depending on the relation between the input data and the target variable, concept drift takes different forms. Concept drift between time point $t_0$ and time point $t_1$ can be defined as
$$\exists X : p_{t_0}(X, y) \neq p_{t_1}(X, y) \qquad (1)$$
where $p_{t_0}$ denotes the joint distribution at time $t_0$ between the set of input variables X and the target variable y.
Kelly et al. presented the three ways in which concept drift may occur [3]:
the prior probabilities of classes, p(y), might change over time;
the class-conditional probability distributions, p(X|y), might change;
the posterior probabilities, p(y|X), might change.
Concept drift may be classified in terms of the speed of change and the reason of change [4], as shown in Figure 1. When 'a set of examples has legitimate class labels at one time and has different legitimate labels at another time', it is real drift, i.e. reason of change [20], which refers to changes in p(y|X).
Fig 1: Types of drift: circles represent instances; different colors represent different classes[4]
Fig 2: Patterns of concept change [4]
When 'the target concepts remain the same but the data distribution changes' [6], it is virtual drift, which refers to changes in p(X).
In terms of speed of change, a drift can be sudden or abrupt, when the concept switches from one to another (refer to Figure 2) [4]. The concept change can be incremental, consisting of many intermediate concepts in between. Drift may be gradual; the change is not abrupt, but goes back to the previous pattern for some time. Concept drift handling algorithms should not confuse true drift with an outlier (blip) or noise, which refers to an anomaly. A recurring drift is when new concepts that were not seen before, or previously seen concepts, reoccur after some time.
Detecting concept changes:
The ways to detect concept drift are as given below:
Concept drift is monitored by checking the data's probability distribution, as it changes with time.
One can judge whether concept drift has happened by monitoring and tracking the relevance between various sample characteristics or attributes.
Concept drifts lead to changes in features of classification models.
Classification accuracy can be taken into account while detecting concept drift on a given data stream. Recall, precision and F-measure are some of the accuracy indicators of classification.
The arrival timestamp of a single sample or block sample can be taken as an additional input attribute, to determine the occurrence of concept drift. It keeps a check on whether the classification rule has become outdated.
III. Concept Drift Detectors
This section discusses algorithms that allow detecting concept drift, known as concept drift detectors. They alert the base learner that the model should be rebuilt or updated.
DDM: The Drift Detection Method (DDM), proposed by Gama et al., uses the binomial distribution [14]. For each point i in the sequence that is being sampled, the error rate is the probability of misclassifying ($p_i$), with standard deviation ($s_i$) given by eq. 2:
$$s_i = \sqrt{p_i (1 - p_i) / i} \qquad (2)$$
The values of $p_i$ and $s_i$ are stored when $p_i + s_i$ reaches its minimum value during the process, i.e. $p_{min}$ and $s_{min}$. These values are used to calculate a warning level condition presented in eq. 3 and an alarm level condition presented in eq. 4:
$$p_i + s_i \geq p_{min} + \alpha \cdot s_{min} \quad \text{(warning level)} \qquad (3)$$
$$p_i + s_i \geq p_{min} + \beta \cdot s_{min} \quad \text{(alarm level)} \qquad (4)$$
Beyond the warning level, the examples are stored in anticipation of a possible change of context. Beyond the alarm level, the concept drift is supposed to be true; the model induced by the learning method is reset, as are $p_{min}$ and $s_{min}$, and a new model is learnt using the examples stored since the warning level was triggered. DDM works best on data streams with sudden drift, as gradually changing concepts can pass without triggering the alarm level.
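The warning and alarm logic of eqs. 2 to 4 can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' reference implementation: the class name `DDM`, the defaults α = 2 and β = 3, the 30-sample warm-up and the synthetic error stream are all assumptions made for the example.

```python
import math


class DDM:
    """Minimal sketch of the Drift Detection Method (eqs. 2-4)."""

    def __init__(self, alpha=2.0, beta=3.0, min_samples=30):
        self.alpha = alpha              # warning multiplier (assumed default)
        self.beta = beta                # alarm multiplier (assumed default)
        self.min_samples = min_samples  # warm-up length (assumed default)
        self.reset()

    def reset(self):
        self.i = 0
        self.errors = 0
        self.p_min = math.inf
        self.s_min = math.inf

    def update(self, error):
        """error is 1 for a misclassification, 0 otherwise."""
        self.i += 1
        self.errors += error
        p = self.errors / self.i
        s = math.sqrt(p * (1 - p) / self.i)           # eq. 2
        if self.i < self.min_samples:
            return "stable"
        if p + s < self.p_min + self.s_min:           # track the minimum of p + s
            self.p_min, self.s_min = p, s
        if p + s >= self.p_min + self.beta * self.s_min:    # eq. 4
            self.reset()
            return "drift"
        if p + s >= self.p_min + self.alpha * self.s_min:   # eq. 3
            return "warning"
        return "stable"


# Synthetic error stream (assumption): 10% errors for 300 samples, then 50%.
stream = [1 if (i <= 300 and i % 10 == 0) or (i > 300 and i % 2 == 0) else 0
          for i in range(1, 601)]

detector = DDM()
first_drift = None
for i, err in enumerate(stream, start=1):
    if detector.update(err) == "drift":
        first_drift = i
        break
```

In this toy stream the alarm fires a few dozen samples after the error rate jumps; a base learner would be rebuilt at that point from the examples stored since the warning level.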
EDDM:
Baena-García et al. proposed a modification of DDM called EDDM [16]. The same warning-alarm mechanism was used, but instead of using the classifier's error rate, the distance-error-rate was proposed. They denote $p'_i$ as the average distance between two consecutive errors and $s'_i$ as its standard deviation. Using these values, the new warning and alarm conditions are given by eq. 5 and eq. 6.
$$(p'_i + 2 \cdot s'_i) / (p'_{max} + 2 \cdot s'_{max}) < \alpha \quad \text{(warning level)} \qquad (5)$$
$$(p'_i + 3 \cdot s'_i) / (p'_{max} + 3 \cdot s'_{max}) < \beta \quad \text{(alarm level)} \qquad (6)$$
The values of $p'_i$ and $s'_i$ are stored when $p'_i + 2 \cdot s'_i$ reaches its maximum value (obtaining $p'_{max}$ and $s'_{max}$). EDDM works better than DDM for slow gradual drift, but is more sensitive to noise. Another drawback is that it considers the thresholds and searches for concept drift only when a minimum of 30 errors have occurred.
ADWIN:
Bifet et al. proposed this method, which uses sliding windows of variable size that are recomputed online according to the rate of change observed from the data in these windows [13]. The window (W) is dynamically enlarged when there is no apparent change in the context, and shrunk when a change is detected. Additionally, ADWIN provides rigorous guarantees of its performance, in the form of limits on the rates of false positives and false negatives. ADWIN works only for one-dimensional data. For n-dimensional raw data, a separate window must be maintained for each dimension, which results in handling more than one window.
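The grow-and-shrink behaviour of the window can be illustrated with a heavily simplified sketch. It is not the published algorithm: the real ADWIN stores the window in a compressed bucket structure, whereas this version keeps every value in a Python list, tests every split point, and drops the whole older part at the first split that shows a change. The cut threshold is the Hoeffding-style bound commonly given for ADWIN; the names `adwin_cut` and `SimpleAdwin`, the value δ = 0.002 and the test stream are assumptions made for the example, and inputs are assumed to lie in [0, 1].

```python
import math


def adwin_cut(window, delta):
    """Return the first split index whose two sub-windows differ, else None."""
    n = len(window)
    total = sum(window)
    head_sum = 0.0
    for split in range(1, n):
        head_sum += window[split - 1]
        n0, n1 = split, n - split
        mean0 = head_sum / n0
        mean1 = (total - head_sum) / n1
        m = 1.0 / (1.0 / n0 + 1.0 / n1)   # harmonic-mean style sample size
        eps_cut = math.sqrt(math.log(4.0 * n / delta) / (2.0 * m))
        if abs(mean0 - mean1) >= eps_cut:
            return split
    return None


class SimpleAdwin:
    """Simplified adaptive window: grows while stable, shrinks on change."""

    def __init__(self, delta=0.002):
        self.delta = delta
        self.window = []

    def update(self, x):
        self.window.append(x)
        changed = False
        while True:
            cut = adwin_cut(self.window, self.delta)
            if cut is None:
                break
            self.window = self.window[cut:]   # drop the older sub-window
            changed = True
        return changed


# Synthetic stream (assumption): the mean jumps from 0 to 1 after 200 values.
stream = [0.0] * 200 + [1.0] * 200
adwin = SimpleAdwin()
detections = [i for i, x in enumerate(stream, start=1) if adwin.update(x)]
```

After the jump, only a handful of new values are needed before the older part of the window is discarded, and the window then grows again while the stream stays stable.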
The Paired Learners:
The Paired Learners method, proposed by Stephen Bach et al., uses two learners: stable and reactive [17]. The stable learner predicts based on all of its experience, while the reactive one predicts based on a window of recent examples. The method uses the interplay between these two learners and their accuracy differences to cope with concept drift. The reactive learner can be implemented in two different ways: by rebuilding the learner with the last w (window size) examples, or by using a retractable learner that can unlearn examples.
Exponentially weighted moving average for Concept Drift Detection (ECDD):
Ross et al. proposed a drift detection method based on the Exponentially Weighted Moving Average (EWMA) [15], used for identifying an increase in the mean of a sequence of random variables. In EWMA, the probability of incorrectly classifying an instance before the change point and the standard deviation of the stream are known. In ECDD, the values of success and failure probability (1 and 0) are computed online, based on the classification accuracy of the base learner in the actual instance, together with an estimator of the expected time between false positive detections.
Statistical Test of Equal Proportions (STEPD):
STEPD, proposed by Nishida et al., assumes that 'the accuracy of a classifier for recent W examples will be equal to the overall accuracy from the beginning of the learning if the target concept is stationary; and a significant decrease of recent accuracy suggests that the concept is changing' [18]. A chi-square test is performed by computing a statistic, and its value is compared to the percentile of the standard normal distribution to obtain the observed significance level. If this value is less than a significance level, then the null hypothesis is rejected, assuming that a concept drift has occurred. Warning and drift thresholds are also used, similar to the ones presented by DDM, EDDM, PHT and ECDD.
DOF:
The approach proposed by Sobhani et al. detects drifts by processing data chunk by chunk: the nearest neighbor in the previous batch is computed for each instance in the current batch, and their corresponding labels are compared. A distance map is created, associating the index of the instance in the previous batch and the label computed by its nearest neighbor; the degree of drift is computed based on the distance map. The average and standard deviation of all degrees of drift are computed and, if the current value is away from the average by more than s standard deviations, a concept drift is raised, where s is a parameter of the algorithm [10]. This algorithm is more effective for problems with well separated and balanced classes.
IV. Statistical Tests For Concept Drift:
The design of a change detector is a compromise between detecting true changes and avoiding false alarms. This is accomplished by carrying out statistical tests that verify whether the running error or class distribution remains constant over time.
The CUSUM test:
The cumulative sum algorithm [24] is a change detection algorithm that raises an alarm when the mean of the input data is significantly different from zero. The CUSUM input $\epsilon_t$ can be any filter residual, for instance, the prediction error from a Kalman filter. The CUSUM test is as follows:
$$g_0 = 0$$
$$g_t = \max(0, g_{t-1} + \epsilon_t - \upsilon)$$
$$\text{if } g_t > h \text{ then alarm and } g_t = 0 \qquad (7)$$
The CUSUM test is memoryless, and its accuracy depends on the choice of parameters υ and h.
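The recursion in eq. 7 translates almost directly into code. The sketch below is only an illustration: the function name `cusum`, the parameter values υ = 0.5 and h = 4, and the synthetic residual sequence are assumptions chosen for the example.

```python
def cusum(residuals, drift, threshold):
    """One-sided CUSUM test (eq. 7); returns the indices where an alarm fires."""
    g = 0.0
    alarms = []
    for t, eps in enumerate(residuals):
        g = max(0.0, g + eps - drift)   # accumulate evidence, never below zero
        if g > threshold:
            alarms.append(t)
            g = 0.0                     # restart after an alarm
    return alarms


# Synthetic residuals (assumption): the mean jumps from 0 to 1 at index 50.
residuals = [0.0] * 50 + [1.0] * 50
cusum_alarms = cusum(residuals, drift=0.5, threshold=4.0)
```

A larger υ makes the test ignore small shifts, while a larger h delays the alarm but reduces false alarms.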
Page Hinkley test: It is a sequential analysis technique that computes the observed values and their mean up to the current moment. The Page-Hinkley test [5] is given as:
$$g_0 = 0, \quad g_t = g_{t-1} + \epsilon_t - \upsilon$$
$$G_t = \min(g_t)$$
$$\text{if } g_t - G_t > h \text{ then alarm and } g_t = 0 \qquad (8)$$
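A minimal sketch of eq. 8 follows. The function name `page_hinkley`, the parameter values and the synthetic residuals are assumptions for illustration. The sketch also resets the running minimum together with the cumulative sum after an alarm; the text only states that the sum is reset, so this is an added assumption, made so that a single change does not keep the alarm raised indefinitely.

```python
def page_hinkley(residuals, drift, threshold):
    """Page-Hinkley test (eq. 8); returns the indices where an alarm fires."""
    g = 0.0       # cumulative sum of (residual - drift)
    g_min = 0.0   # smallest value of g seen so far
    alarms = []
    for t, eps in enumerate(residuals):
        g += eps - drift
        g_min = min(g_min, g)
        if g - g_min > threshold:
            alarms.append(t)
            g = 0.0
            g_min = 0.0   # assumption: restart the minimum as well
    return alarms


# Synthetic residuals (assumption): the mean jumps from 0 to 1 at index 50.
residuals = [0.0] * 50 + [1.0] * 50
ph_alarms = page_hinkley(residuals, drift=0.5, threshold=4.0)
```

Unlike CUSUM, the sum is allowed to go negative; the alarm depends on how far it has climbed above its running minimum.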
The Geometric moving average test:
The Geometric Moving Average (GMA) test [25] is as below:
$$g_0 = 0$$
$$g_t = \lambda g_{t-1} + (1 - \lambda) \epsilon_t$$
$$\text{if } g_t > h \text{ then alarm and } g_t = 0 \qquad (9)$$
The forgetting factor λ is used to give more or less weight to the last data arrived. The threshold h is used to tune
the sensitivity and false alarm rate of the detector.
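The effect of the forgetting factor can be seen in a short sketch of eq. 9. The function name `gma`, the values λ = 0.5 and h = 0.9, and the synthetic residuals are assumptions made for the example.

```python
def gma(residuals, lam, threshold):
    """Geometric moving average test (eq. 9); returns the alarm indices."""
    g = 0.0
    alarms = []
    for t, eps in enumerate(residuals):
        g = lam * g + (1 - lam) * eps   # exponentially weighted average
        if g > threshold:
            alarms.append(t)
            g = 0.0                     # restart after an alarm
    return alarms


# Synthetic residuals (assumption): the mean jumps from 0 to 1 at index 50.
residuals = [0.0] * 50 + [1.0] * 50
gma_alarms = gma(residuals, lam=0.5, threshold=0.9)
```

With λ = 0.5 the average reaches 0.5, 0.75, 0.875 and 0.9375 on the first four post-change samples, so the alarm fires on the fourth; a λ closer to 1 would react more slowly.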
The Statistical test:
CUSUM and GMA are methods that deal with numeric sequences. A statistical test is a procedure for deciding whether a hypothesis about a quantitative feature of a population is true or false. We test a hypothesis by drawing a random sample from the population in question and calculating an appropriate statistic on its items.
To detect change, we need to compare two sources of data, and decide if the hypothesis $H_0$ that they come from the same distribution is true. Otherwise, a hypothesis test will reject $H_0$ and a change is detected. The simplest way for hypothesis testing is to study the difference of the two estimates, from which a standard hypothesis test can be formulated. Writing $\hat{\mu}_0$ and $\hat{\mu}_1$ for the estimates obtained from the two sources, with variances $\sigma_0^2$ and $\sigma_1^2$:
$$\hat{\mu}_0 - \hat{\mu}_1 \in N(0, \sigma_0^2 + \sigma_1^2), \text{ under } H_0$$
or, to make a $\chi^2$ test,
$$(\hat{\mu}_0 - \hat{\mu}_1)^2 / (\sigma_0^2 + \sigma_1^2) \in \chi^2(1), \text{ under } H_0$$
The Kolmogorov-Smirnov test (non-parametric) is another statistical test to compare two populations. The KS-
test has the advantage of making no assumption about the distribution of data.
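As a rough illustration of the χ² form of the parametric test, the sketch below compares the means of two samples using only the Python standard library. It assumes that the two estimates are sample means and that their variances are estimated as the sample variance divided by the sample size; the function name `mean_shift_test` and the toy samples are hypothetical. The p-value uses the identity that a χ²(1) tail probability equals erfc(sqrt(x / 2)).

```python
import math
from statistics import mean, variance


def mean_shift_test(sample0, sample1):
    """Chi-square test (1 degree of freedom) on the difference of two means."""
    mu0, mu1 = mean(sample0), mean(sample1)
    var0 = variance(sample0) / len(sample0)   # variance of the first mean
    var1 = variance(sample1) / len(sample1)   # variance of the second mean
    stat = (mu0 - mu1) ** 2 / (var0 + var1)
    p_value = math.erfc(math.sqrt(stat / 2.0))  # P(chi2(1) > stat)
    return stat, p_value


# Toy samples (assumption): one identical source and one shifted source.
reference = [0, 1] * 50
same_stat, same_p = mean_shift_test(reference, [0, 1] * 50)
shift_stat, shift_p = mean_shift_test(reference, [2, 3] * 50)
```

A small p-value, as for the shifted sample, leads to rejecting $H_0$ and reporting a change; the identical sample gives a statistic of zero and no detection.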
V. Datasets With Concept Drift
Artificial datasets provide the ground truth of the data; however, real datasets are more interesting as they correspond to real-world applications where the algorithms' usability is tested [22].
5.1 Real datasets:
Forest Covertype, obtained from US Forest Service (USFS) Region 2 Resource Information System (RIS) data, contains 581,012 instances and 54 attributes.
Poker-Hand consists of 1,000,000 instances and 11 attributes.
Electricity dataset, collected from the Australian New South Wales Electricity Market, contains 45,312 instances.
Airlines dataset contains 539,383 examples described by seven attributes.
Ozone level detection data set consists of 2,534 entries and is highly unbalanced (2% or 5% positives depending on the criteria of "ozone days").
5.2 Artificial datasets:
The artificial datasets allow us to analyze how the methods deal with the types of drift included in the datasets, as it is known in advance when the drifts begin and end. For abrupt or sudden drifts, Stagger, Gauss and Mixed2 can be used. The Waveform, LED generator and Circles datasets are best suited for gradual drifts. The Hyperplane dataset works well for both gradual and incremental drift. The radial basis function (RBF) generator can also be used for incremental drift, and blips can also be incorporated.
Uses and Applications
This section describes various real-life problems [11, 12] in different domains related to the concept drifts in the data generated from these real domains.
Fig 3: Applications of Real-domain concept drift
Monitoring and control often employs unsupervised learning, which detects abnormal behaviour. In monitoring and control applications the data volumes are large and the data needs to be processed in real time.
Personal assistance and information applications mainly organize and/or personalize the flow of information. The class labels are usually "soft" and the costs of mistakes are relatively low.
Decision support includes diagnostics and evaluation of creditworthiness. Decision support and diagnostics applications usually involve a limited amount of data. Decisions are not required to be made in real time, but high accuracy is critical in these applications and the costs of mistakes are large.
Artificial intelligence applications include a wide spectrum of moving and stationary systems, which interact with a changing environment. The objects learn how to interact with the environment and, as the environment is changing, the learners need to be adaptive.
VI. Conclusion
This paper gives an overview of the problem of concept drift. It summarizes the need, types and reasons for concept change. The various concept drift detection methods, viz. DDM, EDDM, Paired Learners, ECDD, ADWIN, STEPD and DOF, are discussed, along with the approach each adopts to detect concept change. To check whether concept drift has occurred, statistical tests such as the CUSUM, Page-Hinkley and GMA tests are explained. Various classifier approaches, specifically ensemble classifiers, provide high accuracy in case of concept change. The ensemble classifiers SEA, AWE, ACE, ADE, HOT, ASHT and AUE adapt according to the drift that occurs, yielding good classifier accuracy. Finally, the applications and the datasets, real and artificial, available for different concept drifts can be used to test the adaptability of any algorithm handling concept drift.
In future, we can enhance the classification performance of the ensemble algorithms discussed above, by adapting them to various drifts and diversity.
References
[1]. P. M. Goncalves, Silas G. T. de Carvalho Santos, Roberto S. M. Barros, Davi C. L. Vieira (2014), "Review: A comparative study on concept drift detectors", An International Journal: Expert Systems with Applications, 8144-8156.
[2]. L. L. Minku and Xin Yao (2011), "DDD: A New Ensemble Approach For Dealing With Concept Drift", IEEE TKDE, Vol. 24, pp. 619-633.
[3]. M. G. Kelly, D. J. Hand, and N. M. Adams (1999), "The Impact of Changing Populations on Classifier Performance", In Proc. of the 5th ACM SIGKDD Int. Conf. on Knowl. Disc. and Dat. Mining (KDD), ACM, 367-371.
[4]. J. Gama, I. Zliobaite, A. Bifet, M. Pechenizkiy, A. Bouchachia (2014), "A Survey on Concept Drift Adaptation", ACM Computing Surveys, Vol. 46, No. 4, Article 44.
[5]. H. Mouss, D. Mouss, N. Mouss, L. Sefouhi (2004), "Test of Page-Hinkley, an Approach for Fault Detection in an AgroAlimentary Production System", 5th Asian Control Conference, IEEE Computer Society, vol. 2, pp. 815-818.
[6]. S. Delany, P. Cunningham, A. Tsymbal, and L. Coyle (2005), "A Case-based Technique for Tracking Concept Drift in Spam filtering", Knowledge-Based Sys. 18, 4-5, 187-195.
[7]. D. Brzezinski and J. Stefanowski (2011), "Accuracy updated ensemble for data streams with concept drift", Proc. 6th HAIS Int. Conf. Hybrid Artificial Intelligent Syst., II, pp. 155-163.
[8]. W. N. Street and Y. Kim (2001), "A streaming ensemble algorithm (SEA) for large-scale classification", in Proc. 7th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, pp. 377-382.
[9]. Ludmila I. Kuncheva (2004), "Classifier ensembles for changing environments", Multiple Classifier Systems, Lecture Notes in Computer Science, Springer, vol. 3077, pages 1-15.
[10]. P. Sobhani and H. Beigy (2011), "New drift detection method for data streams", Adaptive and Intelligent Systems, Lecture Notes in Computer Science, Vol. 6943, pp. 88-97.
[11]. D. Brzezinski, J. Stefanowski (2011), "Mining data streams with concept drift", Poznan University of Technology, Faculty of Computing Science and Management, Institute of Computing Science.
[12]. I. Žliobaite (2010), "Adaptive Training Set Formation", Doctoral dissertation, Physical sciences, informatics (09P), Vilnius University.
[13]. A. Bifet (2009), "Adaptive Learning and Mining for Data Streams and Frequent Patterns", Doctoral Thesis.
[14]. J. Gama, P. Medas, G. Castillo and Pedro Rodrigues (2004), "Learning with Drift Detection", Lecture Notes in Computer Science, Vol. 3171, pp. 286-295.
[15]. G. J. Ross, N. M. Adams, D. Tasoulis, D. Hand (2012), "Exponentially weighted moving average charts for detecting concept drift", International Journal Pattern Recognition Letters, 191-198.
[16]. M. Baena-Garcia, J. Campo-Avila, R. Fidalgo, A. Bifet, R. Gavaldà and R. Morales-Bueno (2006), "Early Drift Detection Method", IWKDDS, pp. 77-86.
[17]. S. H. Bach and M. A. Maloof (2008), "Paired Learners for Concept Drift", Eighth IEEE International Conference on Data Mining, pp. 23-32.
[18]. K. Nishida (2008), "Learning and Detecting Concept Drift", A Dissertation: Doctor of Philosophy in Information Science and Technology, Graduate School of Information Science and Technology, Hokkaido University.
[19]. D. Brzezinski, J. Stefanowski (2012), "From Block-based Ensembles to Online Learners In Changing Data Streams: If- and How-To", ECML PKDD Workshop on Instant Interactive Data Mining, pp. 60965.
[20]. J. Kolter and M. A. Maloof (2007), "Dynamic Weighted Majority: An Ensemble Method for Drifting Concepts", Journal of Machine Learning Research 8, 2755-2790.
[21]. P. B. Dongre, L. G. Malik (2014), "A Review on Real Time Data Stream Classification and Adapting To Various Concept Drift Scenarios", IEEE International Advance Computing Conference (IACC), pp. 533-537.
[22]. D. Brzezinski, J. Stefanowski (2014), "Reacting to Different Types of Concept Drift: The Accuracy Updated Ensemble Algorithm", IEEE Transactions On Neural Networks And Learning Systems, Vol. 25, pp. 81-94.
[23]. D. Brzezinski, J. Stefanowski (2014), "Combining block-based and online methods in learning ensembles from concept drifting data streams", An International Journal: Information Sciences 265, 50-67.
[24]. E. S. Page (1954), "Continuous inspection schemes", Biometrika, 41(1/2):100-115.
[25]. S. W. Roberts (2000), "Control chart tests based on geometric moving averages", Technometrics, 42(1):97-101.
[26]. R. Elwell and R. Polikar (2011), "Incremental learning of concept drift in nonstationary environments", IEEE Trans. Neural Netw., vol. 22, no. 10, pp. 1517-1531.
[27]. A. Bifet, G. Holmes, B. Pfahringer, R. Kirkby, and R. Gavaldà (2009), "New ensemble methods for evolving data streams", in Proc. 15th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, pp. 139-148.
