On the Log-Normal Distribution of Network Traffic

March 2002
Physica D Nonlinear Phenomena 167:72-85

DOI:10.1016/S0167-2789(02)00431-1

Authors:

Ioannis Antoniou

Aristotle University of Thessaloniki

Victor V. Ivanov

Joint Institute for Nuclear Research

Valery V. Ivanov

Joint Institute for Nuclear Research

Petr V Zrelov

Joint Institute for Nuclear Research

Physica D 167 (2002) 72–85

On the log-normal distribution of network trafﬁc

I. Antoniou (a,b,*), V.V. Ivanov (a,c), Valery V. Ivanov (c,d), P.V. Zrelov (c)

(a) International Solvay Institutes for Physics and Chemistry, CP-231, ULB, Bd. du Triomphe, 1050 Brussels, Belgium

(b) Department of Mathematics, Aristotle University of Thessaloniki, 54006 Thessaloniki, Greece

(c) Laboratory of Information Technologies, Joint Institute for Nuclear Research, 141980 Dubna, Russia

(d) University Scientific Center, Joint Institute for Nuclear Research, 141980 Dubna, Russia

Received 19 November 2001; accepted 14 March 2002

Communicated by H. Müller-Krumbhaar

Abstract

A detailed analysis of traffic measurements shows that the aggregation of these measurements forms a statistical distribution, which is approximated with high accuracy by the log-normal distribution. The inter-arrival times and packet sizes, contributing to the formation of network traffic, can be considered as independent. Applying the wavelet transform to traffic measurements, we demonstrate the multiplicative character of traffic series. This result confirms that the scheme, developed by Kolmogorov [Dokl. Akad. Nauk SSSR 31 (1941) 99] for the homogeneous fragmentation of grains, applies also to network traffic.

PACS: 47.20.Ky; 12.40.Ee

Keywords: Internet; Trafﬁc; Flow; Log-normal; Self-similarity; Wavelet

1. Introduction

Within the global framework of the information society, fast, reliable and safe data exchange between local and terrestrial wide-area computer networks has become a priority issue. Evidence of traffic complexity appears in many forms, such as the heavy-tailed distributions, long-range correlations and self-similarity found in traffic measurements at different time scales [2–7]. The complexity revealed by the traffic measurements has led to the suggestion that the traffic of information cannot be analyzed within the framework of available models [8,9]. Moreover, the performance of computer networks crucially depends on the traffic assessment.

* Corresponding author. Fax: +32-2-650-50-28. E-mail address: iantonio@vub.ac.be (I. Antoniou).

In this context, a major challenge for the emerging high-speed integrated-services communication networks is to elaborate a reliable model that can realistically capture the salient features of network traffic. Such a model may serve as the basis for the development of methods and tools for quality assessment, providing more efficient control and management of information flow in the Internet [10,11].

We develop here a background traffic model, which can be used for the analysis and control of network traffic. In our study we use the traffic measurements obtained at the input of the Dubna University [12] local area network (LAN), which includes approximately 200–250 interconnected computers. We describe in Section 2 the data acquisition system of this LAN, realized on the basis of a standard IBM PC. In Section 3 we analyze the influence of the aggregation process on the form of the statistical distribution of traffic flows. In Section 4 we analyze the reasons for traffic log-normality using real traffic measurements and model data.

In Section 5, after recalling a "small paper" by Kolmogorov [1] on the log-normal distribution, we demonstrate, using wavelets, the multiplicative structure of traffic measurements, which confirms the applicability of Kolmogorov's scheme [1] to network traffic.

Fig. 1. Scheme of a data acquisition system.

2. Data acquisition system

Two protocols are used in the "Dubna" LAN. The NetBEUI protocol is used only for internal exchanges, and TCP/IP for external communications. The network traffic was measured on the external side of the input gateway of the LAN. The data acquisition system is based on an open mode driver [13] (see Fig. 1).

Under standard conditions the network adapter of a computer is in carrier-detection mode (main harmonic 4–6 MHz). Once the bits of a packet preamble appear on the cable, the network adapter switches to bit and byte synchronization with the transmitter and starts receiving the first bytes of the frame header. As soon as the MAC address of the frame's receiver has been extracted from the first bytes taken by the adapter, the network adapter compares it with its own. If the addresses do not match, the network adapter stops recording the frame's bytes into its internal buffer, clears the buffer and waits until the next packet appears.

In order to receive and analyze all the packets transmitted over the network, it is necessary to switch the adapter to a free mode, in which all frames are recorded in the buffer. This operation is executed through the instructions of the NDIS driver.

The free mode driver records the accepted packets in the preliminary capture buffer and sets the packet-received flag. Then the packet-receiving module is activated and the packet-type field is analyzed in order to extract TCP/IP packets from the whole stream.
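As an illustration of the type-field check described above, the following minimal sketch selects IPv4 frames by their EtherType and keeps only the IP header. It assumes untagged Ethernet II framing; the function names are hypothetical, and the code is not taken from the acquisition system of [13], whose NDIS-based driver is not reproduced here.

```python
import struct

ETHERTYPE_IPV4 = 0x0800  # EtherType value assigned to IPv4


def ethertype(frame: bytes) -> int:
    """Return the 16-bit type field (bytes 12-13) of an Ethernet II frame."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet II header")
    # Layout: 6 bytes destination MAC, 6 bytes source MAC, 2 bytes type (big-endian).
    (etype,) = struct.unpack_from("!H", frame, 12)
    return etype


def is_ipv4(frame: bytes) -> bool:
    """True if the type field marks the payload as IPv4."""
    return ethertype(frame) == ETHERTYPE_IPV4


def ip_header(frame: bytes) -> bytes:
    """Keep only the IPv4 header of an IPv4 frame and drop the data block."""
    if len(frame) < 15:
        raise ValueError("frame carries no IP header")
    ihl_words = frame[14] & 0x0F  # IP header length in 32-bit words
    return frame[14:14 + 4 * ihl_words]
```

Only the headers returned by `ip_header` would need to be stored together with a timestamp; the payload is discarded, in line with the recording scheme described in the next paragraph.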

After identification it is possible to separate and delete the data block and to record the headers in the SQL server database. The headers are recorded together with the time data with a frequency up to 10 kHz. Although the recording is buffered, saving the packet headers requires enormous server resources, because data are permanently written to the hard disk in small portions. That is why this mode is switched on only when required, at the instruction of the management system.

The system also provides control over the external traffic of the LAN on the basis of controlling the records in the router table. Initial information on the legal IP addresses is saved in the database of the LAN computers, from which data on legal addresses are loaded into an array in main memory. Users who do not participate in forming the external traffic are not taken into account when calculating the number of transferred and received bytes. In order to decrease the number of sessions in which information on the external traffic is recorded in the database, a timer for unloading the buffer and a timer for changing the current date have been introduced into the system.

The recorded traffic data correspond approximately to 20 h of measurements (1,600,000 records with a frequency up to 10 kHz, which corresponds to 1 ms bin size). The part of this series corresponding approximately to 1 h of measurements and aggregated with different bin sizes is presented in Fig. 2. The contribution of the NetBEUI traffic has been estimated at around one to six packets per second during daily working hours. This is negligibly small compared to the TCP/IP traffic, so we may neglect the influence of non-IP traffic on the TCP/IP traffic.

3. Aggregation of trafﬁc measurements

In order to reconstruct and identify the dynami-

cal system underlying the trafﬁc measurements, we

applied [14] a nonlinear time series analysis and a

feed-forward layered neural network to trafﬁc data.

We demonstrated that the corresponding dynamical

system has reliable values for the time lag and the

embedding dimension. This permitted us to success-

fully apply a feed-forward layered neural network for

the identiﬁcation and reconstruction of the dynamical

system. We found that the trained neural network re-

produces the statistical distribution of packet sizes of

trafﬁc measurements aggregated with 1s bin size. This

distribution looks like the log-normal distribution.

Having traffic data measured at high frequency (each arriving packet has been recorded independently, see Section 2), we were able to analyze the influence of the aggregation bin on the form of the packet size distribution. Fig. 3 shows the packet size distribution for raw traffic measurements, while Figs. 4–6 present the distributions for measurements aggregated with bin sizes 10 ms, 100 ms and 1 s, respectively.
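Aggregation with a given bin size can be sketched as follows, assuming that each record is a (timestamp in seconds, packet size in bytes) pair and that bins without packets are kept as zeros. Both the record layout and the function name are assumptions of this illustration, not the format of the recorded data.

```python
from collections import defaultdict


def aggregate(records, bin_size):
    """Sum packet sizes over consecutive time bins of width bin_size (seconds).

    records: iterable of (timestamp_seconds, packet_size_bytes) pairs.
    Returns one byte count per bin, from the first occupied bin to the last;
    empty bins in between are reported as 0 (an assumption of this sketch).
    """
    totals = defaultdict(int)
    for timestamp, size in records:
        totals[int(timestamp // bin_size)] += size
    if not totals:
        return []
    first, last = min(totals), max(totals)
    return [totals.get(k, 0) for k in range(first, last + 1)]
```

Histogramming the values returned for different `bin_size` arguments gives distributions of the kind compared in this section.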

One can clearly see that for aggregation with small bin sizes the packet size distributions have a rather chaotic and nonsystematic character. However, when the aggregation bin size approaches 1 s (see Fig. 6) the distribution assumes a stable form that does not change with further increase of the aggregation bin (see, for example, Fig. 7, which corresponds to aggregation with a bin size of 10 s).

Fig. 2. Trafﬁc measurements aggregated with different bin sizes: 0.1, 1 and 10 s.

Fig. 3. Packet size distribution for trafﬁc measurements.

The distributions in Figs. 6 and 7 are well approximated by the log-normal function [15]:

$$f(x) = \frac{A}{\sqrt{2\pi}\,\sigma x}\exp\left[-\frac{1}{2\sigma^{2}}(\ln x-\mu)^{2}\right], \qquad (1)$$

where x is the variable, σ and µ are the parameters of the log-normal distribution and A is the normalizing multiplier.

The fitting procedure was carried out with the help of the MINUIT package [16] within the well-known Physics Analysis Workstation (PAW), see details in [17]. The MINUIT package is conceived as a tool to find the minimum value of a multi-parameter function and to analyze the shape of the function around the minimum [16]. Its principal application is statistical analysis, working on chi-square or log-likelihood functions, to compute the best-fit parameter values and uncertainties, including correlations between the parameters. It is especially suited to handle difficult problems.

Fig. 4. Packet size distribution for traffic measurements aggregated with bin size 10 ms.

Fig. 5. Packet size distribution for traffic measurements aggregated with bin size 100 ms.

In Table 1 we present the results of fitting the packet size distributions aggregated with different bin sizes by the log-normal distribution (1). Here χ² is the calculated value of the minimized function FCN, usually defined as [16]:

$$\chi^{2}(a) = \sum_{i=1}^{N}\frac{[x_i - y_i(a)]^{2}}{e_i^{2}}, \qquad (2)$$

where a is the parameter vector, e_i² = x_i is the square of the error on the individual observations, and N is the number of channels in the fitting histogram.

Fig. 6. Packet size distribution for traffic measurements aggregated with bin size 1 s: fitting curve corresponds to the function (1).

Fig. 7. Packet size distribution for traffic measurements aggregated with bin size 10 s: fitting curve corresponds to the function (1).

Table 1
Results of fitting of the packet size distributions aggregated with different bin sizes by the function (1)

Bin (s)   σ               µ              A (×10^8)      ν    χ²
1         0.927 ± 0.003   8.85 ± 0.01    1.04 ± 0.01    47   1447
2         0.896 ± 0.004   9.62 ± 0.01    1.25 ± 0.01    47   987
3         0.906 ± 0.005   10.06 ± 0.01   1.14 ± 0.01    47   667
4         0.891 ± 0.006   10.40 ± 0.01   1.15 ± 0.01    47   667
5         0.888 ± 0.006   10.62 ± 0.01   1.19 ± 0.01    47   348
10        0.843 ± 0.007   11.36 ± 0.01   1.33 ± 0.02    46   225

Table 2
Results of fitting of the packet size distributions aggregated with different bin sizes by the function (3)

Bin (s)   σ (×10^4)      µ               B (×10^8)     ν    χ²
1         0.92 ± 0.01    1.476 ± 0.007   1.13 ± 0.01   47   8242
2         1.89 ± 0.01    1.489 ± 0.010   1.09 ± 0.01   47   5085
3         2.90 ± 0.03    1.500 ± 0.013   0.99 ± 0.01   47   3459
4         4.32 ± 0.06    1.432 ± 0.018   0.99 ± 0.01   47   2771
5         5.65 ± 0.10    1.400 ± 0.019   1.02 ± 0.01   47   2167
10        13.01 ± 0.25   1.390 ± 0.020   1.13 ± 0.02   46   1151
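Functions (1) and (2) translate directly into code. The sketch below reads x_i as the observed content of channel i and y_i(a) as the value of the model at the channel centre. Evaluating the model at the channel centre and skipping empty channels are assumptions of this sketch, and the minimization itself (done with MINUIT in the text) is not shown.

```python
import math


def lognormal(x, A, sigma, mu):
    """Log-normal function of Eq. (1); A is the normalizing multiplier."""
    return (A / (math.sqrt(2.0 * math.pi) * sigma * x)
            * math.exp(-((math.log(x) - mu) ** 2) / (2.0 * sigma ** 2)))


def chi_square(centers, counts, model, params):
    """Eq. (2) with e_i^2 = x_i, where x_i is the content of channel i.

    centers: channel centres; counts: observed channel contents;
    model(x, *params): fitted function, e.g. lognormal.
    Channels with zero content are skipped to avoid division by zero
    (a choice made for this sketch only).
    """
    total = 0.0
    for center, x_i in zip(centers, counts):
        if x_i > 0:
            y_i = model(center, *params)
            total += (x_i - y_i) ** 2 / x_i
    return total
```

A minimizer would call `chi_square` repeatedly while varying `params = (A, sigma, mu)`.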

One can see that the fitting curves corresponding to the log-normal distribution approximate the experimental distributions with reliable accuracy over all regions of the analyzed distributions.

Feldmann and Whitt [18] noticed that Internet traffic measurements can be well described by the Pareto and Weibull distributions. We checked, therefore, the correspondence of our experimentally obtained distributions to the Weibull distribution [15]:

$$f(x) = B\,\frac{\eta}{\sigma}\left(\frac{x}{\sigma}\right)^{\eta-1}\exp\left[-\left(\frac{x}{\sigma}\right)^{\eta}\right], \quad x \ge 0, \qquad (3)$$

which, for suitably chosen parameters σ and η, has a form similar to the analyzed real distributions. Here B is the normalizing multiplier.

In Table 2 we present the results of fitting the packet size distributions aggregated with different bin sizes by the Weibull distribution (3). One can see that the corresponding fitting curves approximate the experimental distributions significantly worse than the curves obtained for the log-normal distribution (see Figs. 8 and 9).

Fig. 8. Packet size distribution for traffic measurements aggregated with bin size 1 s: fitting curve corresponds to the function (3).

Fig. 9. Packet size distribution for traffic measurements aggregated with bin size 10 s: fitting curve corresponds to the function (3).

As we mentioned above, the fitting curves corresponding to the log-normal distribution approximate the experimental distributions with reliable accuracy over all regions of the analyzed distributions. However, as can be seen from Table 1, they did not pass the χ²-test.

Table 3
Results of fitting of daily part of packet size distributions aggregated with different bin sizes by the function (1)

Bin (s)   ν    χ²      α (%)
1         47   49.84   32.30
2         47   44.76   52.51
3         47   41.53   65.98

The main reason is that the distributions presented in Figs. 6–9 are based on the whole set of data, which corresponds approximately to 20 h of measurements. But the traffic series, as well as the corresponding statistical distributions, behave differently depending on whether the measurements were made during working hours or not. We therefore tested the correspondence of the experimental distributions to the null hypothesis (1) by applying the χ² goodness-of-fit criterion to the part of the daily traffic shown in Fig. 2. The results of this analysis are presented in Table 3.

Here αis the probability (%) that the observed

chi-square will exceed the value χ2by chance even

for a correct model, see, for instance, [15,19]. These

results show that the hypothesis (1) can be accepted

with a high probability (see also Fig. 10). At the same

time it must be noted (see Figs. 6 and 7) that the inﬂu-

ence of the inactive period of LAN does not change

signiﬁcantly the fundamental form of the statistical

distribution of trafﬁc data.
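A tail probability such as α can be computed from χ² and the number of degrees of freedom through the regularized incomplete gamma function. The following sketch uses the standard series and continued-fraction expansions; the function names are hypothetical, and this is not the routine used to produce Table 3.

```python
import math


def _gamma_p_series(s, x, eps=1e-12, max_iter=10000):
    """Regularized lower incomplete gamma function P(s, x), power series."""
    term = 1.0 / s
    total = term
    n = s
    for _ in range(max_iter):
        n += 1.0
        term *= x / n
        total += term
        if abs(term) < abs(total) * eps:
            break
    return total * math.exp(-x + s * math.log(x) - math.lgamma(s))


def _gamma_q_contfrac(s, x, eps=1e-12, max_iter=10000):
    """Regularized upper incomplete gamma function Q(s, x), continued fraction
    evaluated with the modified Lentz method."""
    tiny = 1e-300
    b = x + 1.0 - s
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, max_iter + 1):
        an = -i * (i - s)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h * math.exp(-x + s * math.log(x) - math.lgamma(s))


def chi2_sf(chi2, dof):
    """Probability that a chi-square variable with dof degrees of freedom
    exceeds the value chi2."""
    if chi2 <= 0.0:
        return 1.0
    s, x = dof / 2.0, chi2 / 2.0
    if x < s + 1.0:
        return 1.0 - _gamma_p_series(s, x)   # series converges quickly here
    return _gamma_q_contfrac(s, x)           # continued fraction elsewhere
```

Multiplying the returned value by 100 expresses it in percent, the unit used for α.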

We conclude, therefore, that

• the aggregation of traffic measurements forms (starting from some threshold value of the aggregation window) a statistical distribution, which does not change its form with further increase of the aggregation window;

• this distribution is approximated with high accuracy by the log-normal distribution.

4. What are the reasons for the log-normality of network traffic?

Let us try to understand the reasons which may cause the above observed statistical distribution. We know that this distribution is formed by two dynamical processes, related to (1) the inter-arrival time intervals, T_int, and (2) the packet sizes, P_s.

The statistical distributions corresponding to each of these random variables are presented in Figs. 3 and 11.

Fig. 10. Packet size distribution for daily traffic measurements aggregated with bin size 1 s: fitting curve corresponds to the function (1).

The distribution of inter-arrival time intervals versus the packet sizes is presented in Fig. 12. This distribution shows that the variables T_int and P_s can be considered to be independent. In order to check the correlation between the variables T_int and P_s statistically, we applied Pearson's correlation coefficient and Fisher's z-transformation, see details in [19, p. 636]. Both approaches gave a result near zero, which shows that the variables T_int and P_s can be considered as uncorrelated.
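The two statistics mentioned above can be sketched in a few lines. The function names are hypothetical, and the normal approximation with standard error 1/√(n−3) for the transformed coefficient is the textbook form, assumed here rather than taken from [19].

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient of two equally long samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fisher_z(r):
    """Fisher's z-transformation, z = 0.5 * ln((1 + r) / (1 - r))."""
    return math.atanh(r)


def z_score(r, n):
    """Approximately standard-normal score for the hypothesis of zero
    correlation, using the standard error 1 / sqrt(n - 3)."""
    return fisher_z(r) * math.sqrt(n - 3)
```

Applied to the paired samples of T_int and P_s, a coefficient and a score both close to zero would be consistent with the absence of linear correlation.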

The part (around 92% of intervals) of the distribu-

tion of inter-arrival time intervals for Tint <150ms,

superimposed by the ﬁtting curve corresponding to the

exponential distribution, is presented in Fig. 13.

At present there are various traffic models based on Poisson and Poisson-like processes (see [9,20] and references therein). We decided, therefore, to verify the reliability of these models using a model for the generation of traffic measurements based on the following assumptions (Model 1):

1. The inter-arrival times T_int follow the exponential distribution with parameters shown in Fig. 13.

2. The corresponding packet sizes P_s follow the statistical distribution presented in Fig. 3.

Fig. 11. Distribution of inter-arrival time intervals for traffic measurements.

Fig. 12. Distribution of inter-arrival time intervals vs. packet sizes.

Fig. 13. Distribution of inter-arrival time intervals for intervals <150 ms.

In order to generate the random variables Tint and

Ps, distributed according to any function FUNC (in our

case, the exponential distribution) or one-dimensional

distribution (histogram), we used the following sub-

routines from CERNLIB [21]:

1. FUNPRE, which calculates the percentiles of

FUNC(X) between Xlow and Xhigh using a combi-

nation of trapezoidal and Gaussian integration, and

FUNRAN, which generates random variables be-

tween Xlow and Xhigh with probabilities according

to FUNC(X).

2. HISRAN, which generates random variables be-

tween Xlow and Xhigh according to a one-dimen-

sional distribution, supplied by HISPRE in the

form of a histogram; a uniform random number

generated by RNDM is transformed to the user’s

distribution using the cumulative distribution con-

structed from the histogram.
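The histogram-based generation described in item 2 amounts to inverse-transform sampling. The sketch below illustrates that idea in Python; it is not the CERNLIB interface, the function names are hypothetical, and values are assumed to be spread uniformly inside each histogram bin.

```python
import bisect
import math
import random


def sample_exponential(rate, rng):
    """Inverse-transform sample from an exponential distribution."""
    u = rng.random()                  # uniform number in [0, 1)
    return -math.log(1.0 - u) / rate


def make_histogram_sampler(edges, counts):
    """Build a sampler for a one-dimensional histogram.

    edges: ascending bin edges, len(counts) + 1 values;
    counts: non-negative bin contents.
    """
    total = float(sum(counts))
    cdf = []
    running = 0.0
    for c in counts:
        running += c
        cdf.append(running / total)   # cumulative distribution at bin ends

    def draw(rng):
        u = rng.random()
        i = min(bisect.bisect_right(cdf, u), len(counts) - 1)
        lower = cdf[i - 1] if i > 0 else 0.0
        width = cdf[i] - lower
        frac = (u - lower) / width if width > 0 else 0.0
        # Map the uniform number linearly into the selected bin.
        return edges[i] + frac * (edges[i + 1] - edges[i])

    return draw
```

A Model 1 style series could then be built by accumulating `sample_exponential` values into arrival times and attaching to each arrival a packet size drawn from a sampler built on the histogram of Fig. 3.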

According to the above described algorithm we generated a file of model data, which was then aggregated with different sizes of the aggregation window, starting from the window of 1 s. Fig. 14 shows the distribution of packet sizes aggregated with 1 s.

Fig. 14. Distribution of packet sizes aggregated with 1 s bin: Model 1.

We see that this distribution is very far from the distribution presented in Fig. 6. Moreover, it can be approximated with reliable accuracy by a Gaussian distribution (see Fig. 14). Thus, our result gives additional confirmation of the inconsistency of Poisson-like models. Taking into account the large inter-arrival values, T_int ≥ 150 ms, and generating the arrival times (using HISRAN) according to the one-dimensional histogram presented in Fig. 13 does not change the shape of the distribution in Fig. 14.

Thus, if we model the traffic measurements by superposing the dynamical processes related to the inter-arrival times T_int and the packet sizes P_s, we do not obtain the desired result.

In order to improve our model, we prepared a spe-

cial data ﬁle based on real trafﬁc measurements under

the following assumptions (Model 2):

1. The inter-arrival times are the same as in real trafﬁc

data.

2. All packet sizes are set equal to 1.

These data were again aggregated with different sizes of the aggregation window. The aggregation of this data file gives the values of packet rates. Figs. 15 and 16 show the distributions of packet rates for the aggregation windows 1 and 10 s, respectively, and they clearly follow the log-normal form (1).

Fig. 15. Distribution of packet rates aggregated with 1 s bin: Model 2.

Fig. 16. Distribution of packet rates aggregated with 10 s bin: Model 2.

Although Figs. 15 and 16 are qualitatively satisfactory, they do not match precisely the real distributions in Figs. 6 and 7. In order to improve our model further, we generated a new data file with the following assumptions (Model 3):

1. The inter-arrival times are the same as in real traffic data.

2. The corresponding packet sizes follow the statistical distribution shown in Fig. 3.

Fig. 17. Distribution of packet sizes aggregated with 1 s bin: Model 3.

These data were again aggregated with different

sizes of the aggregation window. Figs. 17 and 18 show

the packet size distributions, superimposed by the ﬁt-

ting curve corresponding to the function (1), for the

aggregation windows 1 and 10s, respectively. One can

clearly see that these distributions are quite close to the

distributions of real trafﬁc measurements presented in

Figs. 6 and 7.

The results of this section are summarized as follows:

• The variables T_int and P_s can be considered as independent.

• The dynamical process related to the inter-arrival times plays the key role in the formation of the observed packet size distribution.

• The independence of the variables T_int and P_s and the form of the packet size distribution permit us to consider the dynamical process corresponding to traffic series as the superposition of independent processes corresponding to the packets contributing to the network traffic (see, for example, Fig. 3).

Fig. 18. Distribution of packet sizes aggregated with 10 s bin: Model 3.

5. Self-similarity and power law of traffic measurements

The log-normal distribution has been ﬁrst observed,

to our knowledge, by Lucas et al. [4] for the empirical

probability distributions of packet arrivals aggregated

at 100ms. Later they developed the background traf-

ﬁc model, or (M, P, S) model [22], which realistically

generated the aggregated trafﬁc ﬂows for a large cam-

pus network. The log-normal distributions for packet

arrivals have been observed at different stream scales

[22]. Similar inter-arrival time distributions for chan-

nel arrivals have been observed in cellular telephony

[23].

It has long been observed that in a large variety of physical phenomena, where self-similar processes take place, the logarithms of dynamical variables are normally distributed. This holds for grain sizes in crust fragmentation [24], for energy released in seismic events [25,26], for the distribution of topographic contours, tree rings, leaves, rivers, see, for example, [27].

The theoretical explanation of the appearance of the log-normal distribution in nature was first given, to our knowledge, by Kolmogorov in 1941 in a "small paper" [1] not well known in the Western literature. Kolmogorov proposed a general scheme of a random process of homogeneous fragmentation or subdivision of grains, for which the distribution of the logarithms of the grain sizes comes arbitrarily close to the Gaussian distribution as the fragmentation process is continued indefinitely.

A simplified explanation of Kolmogorov's result [25, p. 206] is the following. Suppose that we have a big rock which crumbles into sand. If the environmental stresses are the same whatever the size of the rock, the probability that a given piece of rock is fragmented into n_i smaller rocks is independent of the stage i of the fragmentation process. Therefore, if we start out with a single rock (n_0 = 1), in the next stage we have n_1 smaller rocks, in the next stage each of these smaller rocks is fragmented into n_2 still-smaller rocks, and so on. As the n_i are independent random variables, the number of grains at the kth stage of fragmentation must be

$$N_k = \prod_{i=1}^{k} n_i = n_1 n_2 \cdots n_k, \qquad (4)$$

$$\ln N_k = \sum_{i=1}^{k} \ln n_i. \qquad (5)$$

The grain sizes S_k are inversely proportional to the number of grains N_k. Applying a variant of the central limit theorem, Kolmogorov found that the logarithms of the grain sizes are normally distributed [1], i.e. the distribution of grain sizes is log-normal.
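Relations (4) and (5) are easy to explore numerically. In the sketch below each n_i is drawn uniformly from {2, ..., 5}, and 50 stages and 5000 realizations are used; all of these choices are assumptions made only for illustration. The skewness of ln N_k should come out close to zero, as expected for a nearly Gaussian variable, while N_k itself is strongly skewed.

```python
import math
import random
import statistics


def log_grain_count(stages, rng, n_min=2, n_max=5):
    """ln N_k for one realization of Eq. (4), computed through Eq. (5).

    Each n_i is drawn independently and uniformly from {n_min, ..., n_max};
    this distribution is an assumption of the sketch.
    """
    return sum(math.log(rng.randint(n_min, n_max)) for _ in range(stages))


def skewness(values):
    """Sample skewness (third standardized moment)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return sum(((v - mean) / std) ** 3 for v in values) / len(values)


rng = random.Random(2002)
logs = [log_grain_count(50, rng) for _ in range(5000)]
counts = [math.exp(v) for v in logs]

print("skewness of ln N_k:", skewness(logs))
print("skewness of N_k:   ", skewness(counts))
```

Since the grain sizes S_k are inversely proportional to N_k, ln S_k differs from −ln N_k only by a constant, so the same near-symmetry carries over to the logarithms of the grain sizes.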

The basic feature of log-normality is the power law or self-similarity. Let X and Y be two random variables. Then if X is log-normal and if

$$Y = aX^{d}, \qquad (6)$$

Y is also log-normal. The parameter a is called the scale factor and the exponent d is the fractal dimension. Power laws such as (6) are known as self-similarity relations. Conversely, if both X and Y are known to be log-normal, there must exist a self-similarity relation, such as (6), between them. Kolmogorov invoked this property to deduce that, if the distribution of grain sizes of sand is log-normal, so are the grain volumes and the fractions by weight retained in sieves of different mesh size.
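The first statement can be checked in one line, assuming a > 0 and d ≠ 0 (conditions not spelled out in the text). If ln X is normally distributed with mean µ and variance σ², taking logarithms of (6) gives

```latex
\ln Y = \ln a + d\,\ln X \;\sim\; \mathcal{N}\!\left(\ln a + d\mu,\; d^{2}\sigma^{2}\right),
```

so Y is log-normal with parameters ln a + dµ and |d|σ.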

In [35] the wavelet transform has been applied to the

self-similar stochastic processes, which Kolmogorov

used in his theory of turbulence [28]. For such pro-

cesses, after suitable re-scaling, the wavelet transform

at a given position becomes a stationary random func-

tion of the logarithm of the scale argument in the

transform [35]. The re-scaling depends on the scaling

component.

Unfortunately, the approach of Vergassola and Frisch [35] cannot be directly applied to network traffic measurements, because they have a significantly more complex, multi-fractal structure [29–31]. However, the wavelet transform, being a very powerful technique for extracting specific information from given data [19,32,33], may provide additional information necessary for understanding the reasons for traffic log-normality.

5.1. Wavelet analysis of trafﬁc measurements

The signal f representing the network traffic data is examined by the wavelet transform with the help of the so-called wavelet ψ with zero average:

$$\int_{-\infty}^{\infty}\psi(t)\,dt = 0.$$

The corresponding wavelet basis is

$$\psi_{a,b}(t) = \frac{1}{\sqrt{a}}\,\psi\!\left(\frac{t-b}{a}\right),$$

where a ∈ (0, ∞) and b ∈ (−∞, ∞) are the scale and translation parameters, respectively. The wavelet transform W(a, b) of f is (see, for example, [34]):

$$W(a,b) = \frac{1}{\sqrt{a}}\int_{-\infty}^{\infty} f(t)\,\psi^{*}\!\left(\frac{t-b}{a}\right)dt. \qquad (7)$$

The parameter b shifts the wavelet so that local information about f at time t = b is contained in W(a, b). The scale parameter a controls the size of the region of influence; for a → 0 the wavelet transform 'zooms' in on t = b.

Fig. 19. Shade plot of the CWT coefficients for traffic measurements aggregated with 1 s window.

Fig. 20. Shade plot of the CWT coefficients for model traffic series (Model 2) aggregated with 1 s window.
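A direct discretization of (7) can be sketched as follows. The Mexican-hat (Ricker) wavelet is used here purely for simplicity, without its usual normalization constant; it is not the biorthogonal spline wavelet used for Fig. 19, and the function names and the unit sampling step are assumptions of this illustration.

```python
import math


def ricker(t):
    """Mexican-hat (Ricker) wavelet: a real wavelet with zero average."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)


def cwt(signal, scales, dt=1.0):
    """Discretized Eq. (7): W(a, b) = a^(-1/2) * sum_t f(t) psi((t - b) / a) dt.

    Returns one row per scale; row[j] is W(a, b) at b = j * dt.
    The wavelet is real, so the complex conjugate in Eq. (7) has no effect.
    The cost is quadratic in the signal length, which is acceptable for a sketch.
    """
    n = len(signal)
    rows = []
    for a in scales:
        norm = dt / math.sqrt(a)
        row = []
        for j in range(n):
            acc = 0.0
            for i, f in enumerate(signal):
                acc += f * ricker((i - j) * dt / a)
            row.append(norm * acc)
        rows.append(row)
    return rows
```

Plotting the rows as an image, with b along the horizontal axis and the scale a along the vertical axis, gives a shade plot of the coefficients.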

The wavelet transform makes it possible to focus on localized signal structures through a zooming procedure that progressively reduces the scale parameter. It has been shown (see, for instance, [36]) that the local signal regularity is characterized by the decay of the wavelet transform amplitude across scales. Singularities and edges are identified by following the wavelet transform local maxima at fine scales.

All these features appear in complex signals like

multi-fractals. The wavelet transform takes advan-

tage of multi-fractal self-similarities, in order to

compute the distribution of the singularities of the

signals.

In order to reveal the self-similarity of traffic measurements at different scales, we applied the continuous wavelet transform (CWT) to traffic measurements aggregated with a 1 s window. Fig. 19 shows the shade plot of the CWT, based on the biorthogonal spline wavelets, of the time series analyzed. The self-similar, multi-fractal character of traffic measurements is clearly shown in the tree-like fragmentation structure. The shade plot corresponding to data generated in accordance with Model 2 demonstrates similar behavior (see Fig. 20).

Figs. 19 and 20 clearly demonstrate the multiplicative character of traffic measurements. This result is in agreement with formula (4) and confirms the applicability of Kolmogorov's scheme to the description of network traffic.

6. Conclusion

Self-similarity, multi-fractal behavior and long-range dependence have been observed in a large number of Internet traffic measurements. Our work, based on the detailed analysis of traffic measurements obtained at the input of a medium-size LAN, demonstrates that the reason for these effects may be a simple aggregation of real data. In fact we show that the aggregation of traffic measurements forms (starting from some threshold value of the aggregation window, which provides the necessary conditions for fulfillment of the central limit theorem requirements) a statistical distribution, which does not change its form with further increase of the aggregation window. This distribution is approximated with high accuracy by the log-normal distribution. We found that the two stochastic variables contributing to network traffic, namely the inter-arrival times T_int and packet sizes P_s, can be considered as independent. In addition, the form of the packet size distribution suggests that the dynamical process underlying the traffic series is the superposition of a few independent processes corresponding to the packets contributing to the network traffic. Applying the CWT to traffic measurements, we show that the information traffic has a multiplicative character. This confirms the applicability of Kolmogorov's scheme [1] to the description of network traffic. This problem is under study, but it is beyond the scope of the current paper.

The above obtained results highlighted the statis-

tical character of the network trafﬁc. However, the

question concerning the log-normality of the trafﬁc

measurements is still open. The dynamical processes

underlying the inter-arrival time series need further

study.

Acknowledgements

We thank Yu. Kryukov for the preparation of trafﬁc

data. We are grateful to Prof. I. Prigogine and Prof.

V.G. Kadyshevsky for encouragement and support.

This work has been done in the frame of the study

contract no. 16263-2000-06 F1SC ISP BE between

the International Solvay Institutes for Physics and

Chemistry (Brussels, Belgium) and the Joint Research

Center of the European Commission (Ispra, Italy).

References

[1] A.N. Kolmogorov, Über das logarithmisch normale

Verteilungsgesetz der Dimensionen der Teilchen bei

Zerstückelung, Dokl. Akad. Nauk SSSR 31 (1941) 99–101.

[2] W.E. Leland, M.S. Taqqu, W. Willinger, D.V. Wilson, On

the self-similar nature of ethernet trafﬁc (extended version),

IEEE/ACM Trans. Network. 2 (1) (1994) 1–15.

[3] W. Willinger, M.S. Taqqu, W.E. Leland, D.V. Wilson,

Self-similarity in high-speed packet trafﬁc: analysis and

modeling of ethernet trafﬁc measurements, Stat. Sci. 10 (1)

(1995) 67–85.

[4] M.T. Lucas, D.E. Wrege, B.J. Dempsey, A.C. Weaver,

Statistical characterization of wide-area self-similar network

trafﬁc, University of Virginia Technical Report CS97-04, 9

October 1996.

[5] M.E. Crovella, A. Bestavros, Self-similarity in world web

trafﬁc: evidence and possible causes, IEEE/ACM Trans.

Network. 5 (6) (1997) 835–846.

[6] V. Misra, W.-B. Gong, A hierarchical model for teletrafﬁc,

Department of Electrical and Computer Engineering,

University of Massachusetts, Amherst, MA, 1998.

[7] J.M. Peha, Protocols can make trafﬁc appear self-similar, in:

Proceedings of the 1997 IEEA/ACM/SCS Communication

Networks and Distributed Systems Modeling and Simulation

Conference, 1997.

[8] A. Erramilli, P. Pruthi, W. Willinger, Recent developments in

fractal trafﬁc modelling, in: Proceedings of the International

Teletrafﬁc Seminar, St. Petersburg, 26 June–2 July 1995.

[9] D.L. Jagerman, B. Melamed, W. Willinger, Stochastic

modeling of trafﬁc processes, Technical Report, 1996.

[10] T. Tuan, K. Park, Multiple time scale congestion control

for self-similar network trafﬁc, Network Systems Lab,

Department of Computer Sciences, Purdue University, West

Lafayette, IN, USA, Elsevier, Amsterdam, Preprint, submitted

for publication.

[11] S. Ata, M. Murata, H. Miyahara, Analysis of network trafﬁc

and its application to design of high-speed routers, IEICE

Trans. Inform. Syst. E83-D (2000) 988–995.

[12] The State University, Dubna, http://www.uni-dubna.ru.

[13] P.V. Vasiliev, V.V. Ivanov, V.V. Korenkov, Yu.A. Kryukov,

S.I. Kuptsov, System for acquisition, analysis and control of

network trafﬁc for the JINR local network segment: the Dubna

University example, JINR Communications, D11-2001-266,

JINR, Dubna, Russia, 2001.

[14] P. Akritas, P.G. Akishin, I. Antoniou, A.Yu. Bonushkina, I.

Drossinos, V.V. Ivanov, Yu.L. Kalinovsky, V.V. Korenkov, P.V.

Zrelov, Nonlinear analysis of network trafﬁc, Chaos, Solitons

and Fractals 14 (2002) 595–606.

[15] W.T. Eadie, D. Dryard, F.E. James, M. Roos, B. Sadoulet,

Statistical Methods in Experimental Physics, North-Holland,

Amsterdam, 1971.

[16] F. James, M. Roos, MINUIT—function minimization and

error analysis, CERN Program Library D506, 1988.

[17] R. Brun, O. Couet, C. Vandoni, P. Zanarini, PAW—Physics

Analysis Workstation, CERN Program Library Q121, 1989.

[18] A. Feldmann, W. Whitt, Fitting mixtures of exponentials to

long-tail distributions to analyze network performance models

(AT&T laboratory research), in: Proceedings of the IEEE

INFOCOM’97, Kobe, Japan, April 1997.

[19] W.H. Press, S.A. Teukolsky, W.T. Vetterling, B.P. Flannery,

Numerical Recipes in C: The Art of Scientiﬁc Computing,

2nd Edition, Cambridge University Press, Cambridge, 1992.

[20] V. Paxson, S. Floyd, Wide-area trafﬁc: the failure of Poisson

modeling, IEEE/ACM Trans. Network. 3 (3) (1995) 226–244.

[21] General Information, Program Library, CERN Computer

Centre, 1989.

[22] M.T. Lucas, B.J. Dempsey, D.E. Wrege, A.C. Waver,

(M, P, S)—an efﬁcient background trafﬁc model for

wide-area network simulation, Technical Report, Department

of Computer Science, University of Virginia, 1997.

[23] J.I. Sánchez, F. Barceló, J. Jordán, Inter-arrival time

distribution for channel arrivals in cellular telephony, in:

Proceedings of the Fifth International Workshop on Mobile

Multimedia Communication (MoMuc’98), 12–14 October

1998, Berlin.

[24] N.K. Razumovsky, On a distribution character of metals

contents in ore ﬁelds, Dokl. Akad. Nauk SSSR 28 (1940)

815–817 (in Russian).

[25] C. Lomnitz, Fundamentals of Earthquake Prediction, Wiley,

New York, 1994.

[26] V.I. Keilis-Borok, Symptoms of instability in a system of

earthquake-prone faults, Physica D 77 (1994) 193–199.

[27] J. Aitchison, J.A.C. Brown, The Lognormal Distribution,

Cambridge University Press, Cambridge, 1957, 176 pp.

[28] A.N. Kolmogorov, The local structure of the turbulence in

incompressible viscous ﬂuid for very large Reynolds numbers,

Dokl. Akad. Nauk SSSR 30 (1941) 301.

[29] M.S. Taqqu, V. Teverovsky, W. Willinger, Is network trafﬁc

self-similar or multifractal?, Fractal 5 (1997) 63–73.

[30] G. Taubes, Fractals reemerge in the new math of the Internet,

Science 281 (1998) 1947–1948.

[31] R.H. Riedi, M.S. Crouse, V.J. Riberio, R.G. Baraniuk, A

multifractal wavelet model with application to network trafﬁc,

IEEE Trans. Inform. Theory 45 (1999) 6992–10018.

[32] C.K. Chui, An Introduction to Wavelets, Academic Press,

New York, 1992, pp. 1–18.

[33] I. Daubechies, Wavelets, SIAM, Philadelphia, 1992.

[34] A.K. Louis, P. Maab, A. Rieder, Wavelets: Theory and

Applications, Wiley, 1997.

[35] M. Vergassola, U. Frisch, Wavelet transforms of self-similar

processes, Physica D 54 (1991) 58–64.

[36] S. Mallat, A Wavelet Tour of Signal Processing, Academic

Press, New York, 1999.
