Article

The Spectral Analysis of Point Processes

Authors:
M. S. Bartlett

Abstract

The spectral analysis of stationary point processes in one dimension is developed in some detail as a statistical method of analysis. The asymptotic sampling theory previously established by the author for a class of doubly stochastic Poisson processes is shown to apply also for a class of clustering processes, the spectra of which are contrasted with those of renewal processes. The analysis is given for two illustrative examples, one of an artificial Poisson process, the other of some traffic data. In addition to testing the fit of a clustering model to the latter example, the analysis of these two examples is used where possible to check the validity of the sampling theory.
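For readers who want to reproduce the kind of check performed with the paper's artificial Poisson example, the following is a minimal sketch (not code from the paper; rate, window and frequency grid are illustrative) of the raw periodogram of a one-dimensional point process. For a homogeneous Poisson process of rate λ, the Bartlett spectrum is flat at λ/(2π), so the periodogram ordinates should scatter around that level.

```python
# Minimal sketch: periodogram of a 1-D point process from its event times.
# For a homogeneous Poisson process of rate lam, the Bartlett spectrum is
# flat at lam / (2*pi), giving a simple sanity check of the estimator.
import numpy as np

rng = np.random.default_rng(0)
lam, T = 5.0, 200.0                            # illustrative rate and window
times = np.sort(rng.uniform(0.0, T, rng.poisson(lam * T)))

omegas = 2.0 * np.pi * np.arange(1, 200) / T   # Fourier frequencies, omitting 0
J = np.exp(-1j * omegas[:, None] * times[None, :]).sum(axis=1)
I = np.abs(J) ** 2 / (2.0 * np.pi * T)         # periodogram ordinates

print(I.mean(), lam / (2.0 * np.pi))           # both close to about 0.80
```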


... In order to estimate the Hawkes and the Poisson components of N, we propose to leverage the spectral analysis of point processes, recently advocated by Cheysson and Lang (2022) for inference of aggregated Hawkes processes. It consists in considering, for a multivariate point process, its matrix-valued spectral density function, denoted f : R → C^{d×d}, which is related to second-order measures (Bartlett, 1963). Given some observed times (T_k^{N_1})_{k≥1}, ... ...
... The spectral analysis of point processes was introduced in Bartlett (1963) and extended to 2-dimensional point processes in Bartlett (1964). Subsequent research works focusing on the theoretical properties of the Bartlett spectrum include Daley (1971); Daley and Vere-Jones (2003); Tuan (1981) for temporal settings and Renshaw (1996, 2001); Rajala et al. (2023) for spatial contexts. ...
... , N_d). This is an extension of the Bartlett spectrum introduced by Bartlett (1963) for the analysis of univariate point processes. Let S be the space of real functions on R with rapid decay (Daley and Vere-Jones, 2003, Chapter 8.6.1): ...
Preprint
Classic estimation methods for Hawkes processes rely on the assumption that observed event times are indeed a realisation of a Hawkes process, without considering any potential perturbation of the model. However, in practice, observations are often altered by some noise, the form of which depends on the context. It is then necessary to model the alteration mechanism in order to infer such a noisy Hawkes process accurately. While several models exist, we consider, in this work, the observations to be the indistinguishable union of event times coming from a Hawkes process and from an independent Poisson process. Since standard inference methods (such as maximum likelihood or Expectation-Maximisation) are either unworkable or numerically prohibitive in this context, we propose an estimation procedure based on the spectral analysis of second-order properties of the noisy Hawkes process. Novel results include sufficient conditions for identifiability of the ensuing statistical model with exponential interaction functions, for both univariate and bivariate processes. Although we mainly focus on the exponential scenario, other types of kernels are investigated and discussed. A new estimator based on maximising the spectral log-likelihood is then described, and its behaviour is numerically illustrated on synthetic data. Besides not requiring knowledge of the source of each observed time (Hawkes or Poisson process), the proposed estimator is shown to perform accurately in estimating both processes.
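Because the Bartlett spectra of independent superposed processes add, the spectral log-likelihood approach sketched in this abstract can be illustrated compactly. The sketch below assumes an exponential Hawkes kernel h(t) = α e^{−βt}, for which the Hawkes spectral density is m/(2π)·|1 − α/(β + iω)|^{−2} with mean rate m = μ/(1 − α/β), and a Whittle-type objective; the parameterisation and optimiser are illustrative, not the authors' implementation.

```python
# Hedged sketch of spectral (Whittle) estimation for a noisy Hawkes process:
# observed events are the union of a Hawkes process (baseline mu, kernel
# h(t) = alpha*exp(-beta*t)) and an independent Poisson process of rate lam0.
import numpy as np
from scipy.optimize import minimize

def noisy_hawkes_spectrum(omega, mu, alpha, beta, lam0):
    m = mu / (1.0 - alpha / beta)            # mean intensity of the Hawkes part
    h_hat = alpha / (beta + 1j * omega)      # Fourier transform of the kernel
    f_hawkes = m / (2.0 * np.pi) / np.abs(1.0 - h_hat) ** 2
    return f_hawkes + lam0 / (2.0 * np.pi)   # spectra of independent parts add

def neg_whittle(theta, omegas, I):
    mu, alpha, beta, lam0 = theta
    if mu <= 0 or lam0 <= 0 or not (0 < alpha < beta):
        return np.inf                        # alpha < beta ensures stationarity
    f = noisy_hawkes_spectrum(omegas, mu, alpha, beta, lam0)
    return np.sum(np.log(f) + I / f)         # Whittle pseudo-log-likelihood

# With I the periodogram of the merged event times at frequencies omegas:
# fit = minimize(neg_whittle, x0=np.array([1.0, 0.5, 1.5, 0.5]),
#                args=(omegas, I), method="Nelder-Mead")
```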
... It is particularly useful, for instance, to study second-order moments, as the covariance matrix of a stationary process is diagonal in the Fourier basis. In the context of point processes, this has been formalized by the covariance measure in the Fourier domain, called the Bartlett spectrum (Bartlett [9], [10,27]), and can be estimated through the Discrete Fourier Transform of observation samples, as we show in Section 2.8.2 of this work. Figure 1.1 illustrates this estimation on a distribution for which the theoretical value of the Bartlett spectrum is known. ...
... These have been extensively studied in the literature (see e.g. [96,9,27]) and constitute some of the most widely used tools to analyse point processes. ...
... Note that we have the following relation z = ρ_0(z) − ρ_π(z) − i(ρ_{π/2}(z) − ρ_{3π/2}(z)) (9). We can then write ...
Thesis
This dissertation presents a class of representations of spatial point processes. Inspired by the success of wavelet methods in signal processing, these descriptors rely on the convolution of a point process with a family of wavelet filters. From these convolutions are built sets of statistical descriptors of stationary point processes, by applying non-linear operators, followed by a spatial averaging. Much like classical summary characteristics for point processes, these statistics are designed to extract information about the process with a relatively small number of numerical values, by describing its geometry. Their goal is to describe whether the atoms of the process tend to repel each other, or cluster together, and by doing so, form possibly complex geometric shapes. By construction, these descriptors enjoy several properties that make them suitable for statistical analysis and learning tasks. To illustrate the quality of these representations as statistical descriptors, we study several problems involving statistical analysis of point processes. In a first experiment, we seek to estimate an unknown function that takes as input a point pattern, and returns a marked version of this pattern, where a numerical value is associated with each atom of the pattern. We use a wavelet-based representation of point patterns to estimate the relation between their non-marked and marked versions. We then study, in a second experiment, the ability of such representations to model the distribution of a point process, by defining a maximum entropy model based on a set of wavelet-based statistics, computed on a single observation. For these two problems, we observe that our representations lead to better performance than summary statistics commonly used in the literature on point processes. Finally, to study to what extent such representations can capture geometric structures of texture images, we define a maximum entropy model relying on similar wavelet statistics, yielding syntheses of similar visual quality to state-of-the-art models based on deep convolutional neural network representations.
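As a rough illustration of the pipeline this thesis describes (convolution with a filter bank, a non-linearity, then spatial averaging), the sketch below bins a point pattern into an image and computes modulus-mean statistics under oriented Gabor filters. The filter family, sizes and frequencies are stand-ins for the thesis's wavelet descriptors, not its actual construction.

```python
# Illustrative sketch: filter-bank summary statistics of a binned point pattern.
import numpy as np
from scipy.signal import fftconvolve

def gabor_filter(size, freq, theta):
    """A simple oriented Gabor filter standing in for the wavelet family."""
    ax = np.arange(size) - size // 2
    x, y = np.meshgrid(ax, ax)
    u = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * (size / 6.0) ** 2))
    return envelope * np.exp(1j * 2.0 * np.pi * freq * u)

def wavelet_statistics(pattern_image, freqs=(0.1, 0.2), n_theta=4):
    stats = []
    for f in freqs:
        for k in range(n_theta):
            filt = gabor_filter(33, f, np.pi * k / n_theta)
            resp = fftconvolve(pattern_image, filt, mode="same")
            stats.append(np.abs(resp).mean())   # non-linearity + spatial mean
    return np.array(stats)

# pattern_image would be a 2-D histogram (binning) of the point locations.
```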
... However, Williamson (2001) argues that simple renewal processes are limited in their ability to jointly model packet-level burstiness and the interaction between flows. Hohn et al. (2003) address this concern by suggesting the use of the Bartlett-Lewis renewal process, a branching renewal process with applications in many areas including modelling storm patterns, vehicular traffic, and computer failures (Onof and Wheater, 1993; Bartlett, 1963; Lewis, 1964a,b). A substantive discussion of the original formulations of the process in Bartlett (1963) and Lewis (1964a) is provided in Daley and Vere-Jones (2002). ...
... The Bartlett-Lewis model also has natural extensions into higher dimensions (Bartlett, 1964; Daley and Vere-Jones, 2002). ...
... Observed traffic is then obtained by superimposing the main process of leading packets and all its subsidiary processes of non-leading packets. In some contexts such as the analysis of clustered vehicular traffic by Bartlett (1963), the main process is not directly observable. ...
Preprint
The substantial growth of network traffic speed and volume presents practical challenges to network data analysis. Packet thinning and flow aggregation protocols such as NetFlow reduce the size of datasets by providing structured data summaries, but conversely this impedes statistical inference. Methods which aim to model patterns of traffic propagation typically do not incorporate the packet thinning and summarisation process into the analysis, and are often simplistic, e.g. method-of-moments. As a result, they can be of limited practical use. We introduce a likelihood-based analysis which fully incorporates packet thinning and NetFlow summarisation into the analysis. As a result, inferences can be made for models on the level of individual packets while only observing thinned flow summary information. We establish consistency of the resulting maximum likelihood estimator, derive bounds on the volume of traffic which should be observed to achieve required levels of estimator accuracy, and identify an ideal family of models. The robust performance of the estimator is examined through simulated analyses and an application on a publicly available trace dataset containing over 36m packets over a 1 minute period.
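The Bartlett-Lewis construction mentioned in the excerpts above (a main process of leading packets, each spawning a subsidiary train of followers) can be sketched in a few lines; the Poisson main process, exponential within-cluster gaps and geometric cluster sizes below are illustrative modelling choices, not the paper's fitted specification.

```python
# Hedged sketch of a Bartlett-Lewis-type cluster process: each "leading" point
# of a Poisson main process starts a finite subsidiary train of followers, and
# the observed process is the superposition of all points.
import numpy as np

rng = np.random.default_rng(1)

def bartlett_lewis(rate_main, rate_within, mean_cluster_size, T):
    mains = np.sort(rng.uniform(0.0, T, rng.poisson(rate_main * T)))
    points = []
    for t0 in mains:
        n_follow = rng.geometric(1.0 / mean_cluster_size) - 1   # followers
        gaps = rng.exponential(1.0 / rate_within, n_follow)
        points.extend(t0 + np.cumsum(np.concatenate(([0.0], gaps))))
    return np.array(sorted(p for p in points if p <= T))

events = bartlett_lewis(rate_main=0.5, rate_within=5.0,
                        mean_cluster_size=4.0, T=1000.0)
print(len(events))
```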
... [20]. Yet unlike random fields and time series, spectral analysis of point processes is still in its infancy, see also [5], [6], [56], and critically, the digital processing of a point process remains fully outstanding. Recent interest in Fourier features in machine-learning-based approaches for point patterns, such as [38], [44], shows the potential of using Fourier-based information as features for estimation and detection. ...
... This set of methods has more recently been discovered in machine learning [48], [73]. The corresponding theory for spatial point processes has been neglected, apart from some notable and not very recent exceptions [5], [6], [56]. For point processes, machine learning researchers have just started to discover the utility of Fourier-based methods [38], [44]. ...
Article
Full-text available
This paper determines how to define a discretely implemented Fourier transform when analysing an observed spatial point process. To develop this transform we answer four questions: first, what is the natural definition of a Fourier transform, and what are its spectral moments; second, how to calculate fourth-order moments of the Fourier transform using Campbell's theorem; third, how to implement tapering, an important component of the spectral analysis of other stochastic processes; and fourth, how to produce an isotropic representation of the Fourier transform of the process. This determines the basic spectral properties of an observed spatial point process.
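A minimal sketch of the paper's central object, a tapered discrete Fourier transform of a point pattern, is given below. The separable sine taper and its normalisation (unit integral of h² over the window) are common choices; conventions differ across papers, so treat this as indicative only.

```python
# Hedged sketch: tapered DFT of a spatial point pattern on [0, side]^2.
import numpy as np

def tapered_dft(xy, side, omegas):
    """xy: (n, 2) point locations; omegas: (m, 2) wavenumbers.
    Returns J(omega) = sum_j h(x_j) exp(-i omega . x_j) with a sine taper h."""
    u = xy / side
    h = np.prod(np.sin(np.pi * u), axis=1)   # separable sine taper
    h = h / (side / 2.0)                     # integral of h^2 over the window = 1
    phase = np.exp(-1j * (xy @ omegas.T))    # (n, m) complex phases
    return (h[:, None] * phase).sum(axis=0)

# The tapered periodogram is then |J(omega)|^2, up to a 2-D normalisation
# constant that depends on the convention adopted.
```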
... The use of second-order measures (e.g., autocovariance) to describe the temporal dependence of stationary point processes has a long history (e.g., Bartlett 1963; Hawkes 1971; Brillinger 1976). It is possible to extend the inference theory without the stationarity assumption. ...
... The autocovariance function c(ℓΔ) may remain constant (d = 0) or diminish to zero (d > 0) as the sample size n increases, and is assumed to satisfy absolute summability over lags (as implied by Σ_{ℓ>0} |K_ℓ| ≤ K). The condition encompasses finite-activity jump processes with finite covariance density (Bartlett 1963). In particular, if K_ℓ = A e^{−(f−a)|ℓ|Δ} Δ², where 0 < a < f and A > 0 (depending on a and f), the DGP represents a stationary Hawkes process with exponential kernel u(x) = a e^{−fx} (Hawkes 1971). ...
Article
Full-text available
We develop a nonparametric test for the temporal dependence of jump occurrences in the population. The test is consistent against all pairwise serial dependence, and is robust to the jump activity level and the choice of sampling scheme. We establish asymptotic normality and local power property for a rich set of local alternatives, including both self-exciting and/or self-inhibitory jumps. Simulation study confirms the robustness of the test and reveals its competitive size and power performance over existing tests. In an empirical study on high-frequency stock returns, our procedure uncovers a wide array of autocorrelation profiles of jump occurrences for different stocks in different time periods.
... Point processes gained a significant amount of attention in statistics during the 1950s and 1960s. Cox (1955) [16] introduced the notion of a doubly stochastic Poisson process (now called the Cox process) and Bartlett (1963) [8] investigated statistical methods for point processes based on their power spectral densities. Lewis (1964) [34] formulated a point process model (for computer power failure patterns) which was a step in the direction of the HP. ...
Preprint
Full-text available
In this paper, we study various new Hawkes processes. Specifically, we construct general compound Hawkes processes and investigate their properties in limit order books. With regard to these general compound Hawkes processes, we prove a Law of Large Numbers (LLN) and a Functional Central Limit Theorem (FCLT) for several specific variations. We apply several of these FCLTs to limit order books to study the link between price volatility and order flow, where the volatility in mid-price changes is expressed in terms of parameters describing the arrival rates and mid-price process.
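As a concrete companion to this abstract, here is a minimal sketch (with illustrative, uncalibrated parameters) of the two ingredients a compound Hawkes mid-price combines: Hawkes event times simulated by Ogata's thinning, and i.i.d. tick-sized price moves attached to those events.

```python
# Hedged sketch: Ogata's thinning for a univariate Hawkes process with
# exponential kernel, plus a toy compound mid-price built on its events.
import numpy as np

def simulate_hawkes(mu, alpha, beta, T, rng):
    """Events of a Hawkes process with intensity mu + sum_i alpha*exp(-beta*(t-t_i))."""
    t, events = 0.0, []
    while True:
        exc = alpha * np.exp(-beta * (t - np.array(events))).sum() if events else 0.0
        lam_bar = mu + exc                 # valid bound: intensity decays between events
        t += rng.exponential(1.0 / lam_bar)
        if t >= T:
            return np.array(events)
        exc = alpha * np.exp(-beta * (t - np.array(events))).sum() if events else 0.0
        if rng.uniform() * lam_bar <= mu + exc:
            events.append(t)

rng = np.random.default_rng(2)
events = simulate_hawkes(mu=1.0, alpha=0.8, beta=2.0, T=100.0, rng=rng)  # alpha/beta < 1
# toy compound mid-price: each arrival moves the price by one random tick
midprice = 100.0 + np.cumsum(rng.choice([-0.01, 0.01], size=len(events)))
```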
... A formal test for verifying stationarity is Bartlett's B test, which makes it possible to determine whether a series is stationary or not (Bartlett, 1963). The following hypotheses are stated with respect to the PIB (GDP) series: H0: the series is stationary; H1: the series is not stationary. Figure 2 shows the results of the test. ...
Article
Full-text available
This article aims to analyse the effect of workers' remittances on the Gross Domestic Product (GDP) and the trade deficit in Honduras. The study follows a quantitative approach, using annual time-series data from 1990 to 2019 obtained from the Central Bank of Honduras (BCH) and the World Bank (WB). A vector autoregression (VAR) model is specified. The results are presented through impulse response function (IRF) figures and a Granger-causality table for the VAR model, which reveal a positive response of GDP and a negative response of the trade deficit in the short and medium term to the shock produced by the inflow of workers' remittances. Remittances were also found to be predictors of the variability of GDP at the 0.1 significance level and of the trade deficit at the 0.05 significance level.
... Remark: The measure C* employed in the proof goes by the name of covariance measure in the literature; see [14, §9.5, Eq. (9.5.12)] for a principled introduction. The covariance density and complete covariance density for point processes had already been introduced on an informal basis [4] before Brémaud's rigorous definition of point processes with stochastic intensity. The measure C* has the covariance density (with respect to the Lebesgue measure) C e^{A(t−s)} Σ C^T, and the complete covariance density C e^{A(t−s)} Σ C^T + δ(t−s)µ. ...
Preprint
Full-text available
Cellular processes are open systems, situated in a heterogeneous context, rather than operating in isolation. Chemical reaction networks (CRNs) whose reaction rates are modelled as external stochastic processes account for the heterogeneous environment when describing the embedded process. A marginal description of the embedded process is of interest for (i) fast simulations that bypass the co-simulation of the environment, (ii) obtaining new process equations from which moment equations can be derived, (iii) the computation of information-theoretic quantities, and (iv) state estimation. It has been known since Snyder's and related works that marginalization over a stochastic intensity turns point processes into self-exciting ones. While the Snyder filter specifies the exact history-dependent propensities in the framework of CRNs in a Markov environment, it was recently suggested to use approximate filters for the marginal description. By regarding the chemical reactions as events, we establish a link between CRNs in a linear random environment and Hawkes processes, a class of self-exciting counting processes widely used in event analysis. The Hawkes approximation can be obtained via a moment closure scheme or as the optimal linear approximation under the quadratic criterion. We show the equivalence of both approaches. Furthermore, we use martingale techniques to provide results on the agreement of the Hawkes process and the exact marginal process in their second-order statistics, i.e., covariance and auto-/cross-correlation. We introduce an approximate marginal simulation algorithm and illustrate it in case studies. AMS subject classifications 37M05, 60G35, 60G55, 60J28, 60K37, 62M15
... Estimation of ρ is treated nonparametrically. As in simpler contexts, such as Bartlett (1963) or for events along a time axis (Cox, 1965), smoothing is required to achieve acceptable estimation variance. This entails some form of weighted averaging of nearby points, ideally with tapered weights for decreasing proximity. ...
Article
Full-text available
This paper is concerned with nonparametric estimation of the intensity function of a point process on a Riemannian manifold. It provides a first-order asymptotic analysis of the proposed kernel estimator for Poisson processes, supplemented by empirical work to probe the behaviour in finite samples and under other generative regimes. The investigation highlights the scope for finite-sample improvements by allowing the bandwidth to adapt to local curvature.
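Stripped of the manifold and curvature-adaptive bandwidth aspects that are the paper's contribution, the underlying kernel intensity estimator takes one line; the Gaussian kernel and fixed bandwidth below are illustrative.

```python
# Minimal sketch: kernel intensity estimation for a point process on the line,
# lambda_hat(x) = sum_i K_h(x - t_i) with a Gaussian kernel K_h.
import numpy as np

def kernel_intensity(event_times, grid, bandwidth):
    diffs = (grid[:, None] - event_times[None, :]) / bandwidth
    K = np.exp(-0.5 * diffs ** 2) / (np.sqrt(2.0 * np.pi) * bandwidth)
    return K.sum(axis=1)

# Usage: grid = np.linspace(0.0, T, 500)
#        lam_hat = kernel_intensity(times, grid, bandwidth=0.5)
```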
... Modeling time series of counts has received considerable and growing attention since the 1950s [1][2][3][4][5] and over recent decades (see [6][7][8][9][10]). It is known that some well-known discrete distributions, such as the Poisson and negative binomial (NB), can only deal with overdispersion; however, the generalized Poisson (GP) and double Poisson (DP) distributions can treat both overdispersion and underdispersion. ...
Article
Full-text available
Crime is a negative phenomenon that affects the daily life of the population and its development. When modeling crime data, assumptions on either the spatial or the temporal relationship between observations are necessary if any statistical analysis is to be performed. In this paper, we structure space–time dependency for count data by considering a stochastic difference equation for the intensity of the space–time process rather than placing structure on a latent space–time process, as Cox processes would do. We introduce a class of spatially correlated self-exciting spatio-temporal models for count data that capture both dependence due to self-excitation, as well as dependence in an underlying spatial process. We follow the principles in Clark and Dixon (2021) but consider a generalized additive structure on spatio-temporally varying covariates. A Bayesian framework is proposed for inference of model parameters. We analyze three distinct crime datasets in the city of Riobamba (Ecuador). Our model fits the data well and provides better predictions than other alternatives.
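A toy, purely temporal sketch of a self-exciting count recursion of the kind the abstract describes is shown below; the linear form λ_t = μ + a·y_{t−1} + b·λ_{t−1} is a standard illustration and not the authors' spatio-temporal specification with covariates.

```python
# Hedged sketch: simulate a self-exciting count process driven by a
# stochastic difference equation for its intensity (INGARCH-style).
import numpy as np

rng = np.random.default_rng(3)
mu, a, b, n = 0.5, 0.3, 0.4, 500       # a + b < 1 gives a stationary regime
lam = np.empty(n)
y = np.empty(n, dtype=int)
lam[0] = mu / (1.0 - a - b)            # start at the stationary mean intensity
y[0] = rng.poisson(lam[0])
for t in range(1, n):
    lam[t] = mu + a * y[t - 1] + b * lam[t - 1]   # self-excitation via past counts
    y[t] = rng.poisson(lam[t])
```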
... describes how we generate randomized versions of our networks. The spectral bipartivity values computed on these networks are samples drawn from the reference distribution of this measure, as in a Monte Carlo test (see Bartlett, 1963; Clifford, 1989, 1991; discussion by Barnard, G. A., p. 294). For statistical comparison, we use the two-sample one-tailed Kolmogorov-Smirnov (KS) test (Smirnov, 1948; Dodge, 2008; Virtanen et al., 2020). ...
Article
Full-text available
Production networks are integral to economic dynamics, yet disaggregated network data on inter-firm trade is rarely collected and often proprietary. Here we situate company-level production networks within a wider space of networks that are different in nature, but similar in local connectivity structure. Through this lens, we study a regional and a national network of inferred trade relationships reconstructed from Dutch national economic statistics and re-interpret prior empirical findings. We find that company-level production networks have so-called functional structure, as previously identified in protein-protein interaction (PPI) networks. Functional networks are distinctive in their over-representation of closed squares, which we quantify using an existing measure called spectral bipartivity. Shared local connectivity structure lets us ferry insights between domains. PPI networks are shaped by complementarity, rather than homophily, and we use multi-layer directed configuration models to show that this principle explains the emergence of functional structure in production networks. Companies are especially similar to their close competitors, not to their trading partners. Our findings have practical implications for the analysis of production networks and give us precise terms for the local structural features that may be key to understanding their routine function, failure, and growth.
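The Monte Carlo testing logic referenced above (going back to Barnard's remark in the discussion of Bartlett, 1963) reduces to a few lines; spectral_bipartivity and randomise below are hypothetical stand-ins for the paper's actual statistic and rewiring procedure.

```python
# Sketch of a Monte Carlo significance test: compare an observed statistic
# with the same statistic computed on randomised versions of the network.
import numpy as np

def monte_carlo_pvalue(observed_stat, null_stats):
    """One-sided p-value with the +1 correction for exact Monte Carlo tests."""
    null_stats = np.asarray(null_stats)
    return (1 + np.sum(null_stats >= observed_stat)) / (1 + len(null_stats))

# Hypothetical usage (spectral_bipartivity and randomise are placeholders):
# stat_obs = spectral_bipartivity(G)
# stat_null = [spectral_bipartivity(randomise(G)) for _ in range(999)]
# p = monte_carlo_pvalue(stat_obs, stat_null)
```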
... Point processes gained a significant amount of attention in statistics during the 1950s and 1960s. Cox (1955) introduced the notion of a doubly stochastic Poisson process (called the Cox process now) and Bartlett (1963) investigated statistical methods for point processes based on their power spectral densities. Lewis (1964) formulated a point process model (for computer power failure patterns) which was a step in the direction of the HP. ...
Article
Full-text available
In this paper, we study various new Hawkes processes. Specifically, we construct general compound Hawkes processes and investigate their properties in limit order books. With regard to these general compound Hawkes processes, we prove a Law of Large Numbers (LLN) and a Functional Central Limit Theorems (FCLT) for several specific variations. We apply several of these FCLTs to limit order books to study the link between price volatility and order flow, where the volatility in mid-price changes is expressed in terms of parameters describing the arrival rates and mid-price process.
... Among the different types of point processes, clustering point processes attracted much interest from mathematicians and statisticians. Typical clustering processes include the Neyman-Scott process (Neyman and Scott 1953, 1958), which has been used to describe the distribution of locations of galaxies in the universe, and the Bartlett-Lewis process to model the rainfall process (Bartlett 1963; Lewis 1964). Many spatiotemporal/temporal clustering point processes can be categorized as a Hawkes self-exciting process (Hawkes 1971a, b; Hawkes and Oakes 1974). ...
Article
Full-text available
The Hawkes self-exciting model has become one of the most popular point-process models in many research areas in the natural and social sciences because of its capacity for investigating the clustering effect and positive interactions among individual events/particles. This article discusses a general nonparametric framework for the estimation, extensions, and post-estimation diagnostics of Hawkes models, in which we use the kernel functions as the basic smoothing tool.
... Figure 5 illustrates the results of the disentanglement in the frequency domain. Namely, here we present spectra of point processes (Bartlett measure) [44]. As expected, spectral peaks induced by respiration are enhanced in the R-component and suppressed in the NR-component. ...
Article
Full-text available
We develop a technique for the multivariate data analysis of perturbed self-sustained oscillators. The approach is based on the reconstruction of the phase dynamics model from observations and on a subsequent exploration of this model. For the system, driven by several inputs, we suggest a dynamical disentanglement procedure, allowing us to reconstruct the variability of the system's output that is due to a particular observed input, or, alternatively, to reconstruct the variability which is caused by all the inputs except for the observed one. We focus on the application of the method to the vagal component of the heart rate variability caused by a respiratory influence. We develop an algorithm that extracts purely respiratory-related variability, using a respiratory trace and times of R-peaks in the electrocardiogram. The algorithm can be applied to other systems where the observed bivariate data can be represented as a point process and a slow continuous signal, e.g. for the analysis of neuronal spiking. This article is part of the theme issue ‘Coupling functions: dynamical interaction mechanisms in the physical, biological and social sciences’.
... The most accurate earthquake forecasting models [e.g., Marzocchi (2018)] describe the seismicity as a combination of two main sources: a constant external driving force (the tectonic plate loading) and an intrinsic self-exciting process [the occurrence of an event increases the frequency of the next events; Kagan (1973); Ogata (1998); Gerstenberger et al. (2005)]. In such models, earthquake occurrence is approximated by a multidimensional Poisson cluster process [Bartlett (1963); Lewis (1964)], where each point of the Poisson process is replaced by a cluster of points. More specifically, events within each cluster are modeled as branching point processes, namely as family trees where each event at any given generation may initiate its own offspring. ...
Thesis
Full-text available
In this work, we aim to contribute to the understanding of the following key points: 1. The relationship between the magnitude of a triggered event and the properties of the seismicity preceding it; 2. The role of the tectonic environment on the earthquake clustering properties; 3. The (often ignored) presence of potential sources of bias in the b-value estimation that could severely affect any claim about the use of the b-value as a precursor of large earthquakes. This dissertation is organized in three chapters, one for each listed point. Chapter I and Chapter II consist of manuscripts published in Geophysical Journal International and the Bulletin of the Seismological Society of America, respectively. Chapter III is a document that has been partly included in a publication in Geophysical Journal International. In Chapter I, we focus on the magnitude-independence assumption (the magnitude of any earthquake is independent of the past), which stands behind the most common earthquake forecasting models. The reliability of this assumption, which severely limits the capability to forecast large earthquakes with high probabilities, has been questioned by several authors, who found evidence for correlated magnitudes and/or different seismicity patterns before earthquakes of different magnitudes. Our goal is to contribute to this discussion by empirically investigating the validity of the magnitude-independence assumption through a comprehensive and rigorous analysis. Specifically: i) we implement a metric-based correlation (inspired by the nearest-neighbor method proposed by Baiesi and Paczuski (2004) and elaborated by Zaliapin et al. (2008)) to identify the precursory seismicity, avoiding the use of space-time-magnitude windows for the identification of foreshocks, mainshocks and aftershocks; ii) we consider different instrumental catalogs and multiple synthetic (ETAS) catalogs; iii) we carefully consider spatiotemporal variations of the magnitude of completeness when statistically comparing the frequency-magnitude distribution of background and triggered earthquakes; iv) we statistically analyze different space-time-magnitude features of the seismicity which anticipate a triggered event. Our findings are in agreement with the magnitude-independence assumption which stands behind the most common earthquake forecast models. We find only one departure from the expected model: larger events tend to nucleate at a greater distance from the ongoing sequence. This result, which confirms the findings of a previous independent study, could hopefully be used to improve the current forecasting models. We also notice that the reliability of the magnitude-independence assumption may depend on the spatial scale considered, as we identify possible departures in small areas, which could reflect different ways of releasing seismic energy locally. Finally, we show that some significant departures from the magnitude-independence assumption do not survive when considering spatiotemporal variations of the magnitude of completeness. In Chapter II, we contribute to answering the following question: do earthquake cluster properties change depending on the tectonic environment under consideration? The common answer to this question is positive, and it is generally based on discrepancies observed among a limited number of seismic sequences pertaining to different tectonic environments.
We contribute to this discussion by analyzing the clustering properties in three areas related to distinct tectonic regimes: Italy, Southern California and Japan. Specifically, we investigate this aspect by i) adopting the nearest-neighbor method [Baiesi and Paczuski (2004); Zaliapin et al. (2008)] to identify all the sequences of triggered events; ii) implementing a comprehensive statistical analysis of several features of the seismicity recorded in different instrumental catalogs. We demonstrate that sequences of triggered events are characterized by comparable distributions across the three regions, though the latter are characterized by a different dominant fault type (compressional, extensional or strike-slip). We conclude that, at least for active seismic crustal regions, the tectonic regime does not seem to play a key role in affecting the spatiotemporal clustering properties. These findings have two important implications: first, they suggest that calibrating earthquake forecast models based on the specific tectonic regime may not be necessary, at least in the case of active seismic crustal regions; second, they support the use of common declustering models, avoiding the need to re-calibrate them according to the tectonic environment. As regards the background seismicity, it shows some significant departures from the hypothesis of a stationary Poisson process when looking at short time scales (1-year time intervals). This could suggest transient perturbations of the background seismicity rate due to undetected seismicity or to localized physical processes, whose effects seem to decrease on longer time scales. In Chapter III, we focus on the b-value estimation biases. The main motivation is that its variability has often been used to advocate its use as an earthquake precursor, and/or as a stress indicator. We first provide a broad review of the most common procedures for identifying temporal variations, which have been linked to the preparatory phase of large earthquakes. As a matter of fact, many studies claim that a decrease of the b-value occurs before large earthquakes, a feature which is generally explained in terms of increased stress. We then argue that several factors, which are very often neglected in seismology, may induce bias in the b-value estimation and lead to apparent variations that do not have any real physical meaning. To show that, we i) analyze three theoretical biases: the Jensen inequality, the normal approximation for the b-value error estimate, and the correlation between the b-value and the maximum foreshock magnitude in the sequence; and ii) quantify (by numerical simulations) the bias induced by an improper magnitude of completeness selection, which affects the analysis of real seismic catalogs.
We show that: i) the Jensen inequality and the normal approximation for the b-value error estimate influence, to different degrees, the b-value and its uncertainty for catalogs with fewer than 100 data; ii) the maximum magnitude in the dataset may introduce a bias in the b-value estimate, by yielding a systematic underestimation of the b-value when a large earthquake is included; this bias is particularly severe for small datasets (N ≤ 500); iii) it is very likely to estimate a significantly low b-value merely as a consequence of the incompleteness of the catalog, and goodness-of-fit tests, such as the Lilliefors test, could fail to spot incomplete recording for small datasets, making this bias very difficult to recognize; iv) assuming, for any space-time subset of a catalog, the same completeness threshold as for the whole catalog could induce strong errors. Our analysis casts doubt on many claims in the literature about significant b-value variations.
... The second term on the right side of the above equality is a Poisson shot noise process, which has been widely applied to e.g. bunching in traffic [5], computer failure times [28], earthquake aftershocks [36], insurance [26], finance [32] and workload input models [29]. When the correlation function E[ψ(kt, η)ψ(ks, η)] is regularly varying as k → ∞ for any t, s ≥ 0, Klüppelberg and Mikosch [27] proved the weak convergence of the normalized Poisson shot noise process to a self-similar Gaussian process, which is a Brownian motion when the shot shape function is light-tailed, i.e., E[ψ(t, η)] = C + o(1/√t). ...
Preprint
This paper establishes a functional law of large numbers and a functional central limit theorem for marked Hawkes point measures and their corresponding shot noise processes. We prove that the normalized random measure can be approximated in distribution by the sum of a Gaussian white noise process plus an appropriate lifting map of a correlated one-dimensional Brownian motion. The Brownian motion results from the self-exciting arrivals of events. We apply our limit theorems for Hawkes point measures to analyze the population dynamics of budding microbes in a host.
... The spectral domain provides a rich environment for representing this second order structure and is based on the fact that stationary stochastic processes can be considered a composite of subprocesses operating at different frequencies. The spectral density matrix of a stationary point process is the Fourier transform of its covariance density matrix (Bartlett, 1963), namely ...
Preprint
Wavelets provide the flexibility to detect and analyse unknown non-stationarity in stochastic processes. Here, we apply them to multivariate point processes as a means of characterising correlation structure within and between multiple event data streams. To provide statistical tractability, a temporally smoothed wavelet periodogram is developed for both real- and complex-valued wavelets, and shown to be equivalent to a multi-wavelet periodogram. Under certain regularity assumptions, the wavelet transform of a point process is shown to be asymptotically normal. The temporally smoothed wavelet periodogram is then shown to be asymptotically Wishart distributed with tractable centrality matrix and degrees of freedom computable from the multi-wavelet formulation. Distributional results extend to wavelet coherence; a time-scale measure of inter-process correlation. The presented theory and methodology are verified through simulation and applied to neural spike train data.
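A rough sketch of the estimator this preprint develops, a temporally smoothed (here Morlet) wavelet periodogram of a binned spike train, is given below; the wavelet parameters and boxcar smoothing window are illustrative choices, not the paper's multi-wavelet construction.

```python
# Hedged sketch: temporally smoothed wavelet periodogram of binned counts.
import numpy as np
from scipy.signal import fftconvolve

def morlet(scale, n=None, w0=6.0):
    """Complex Morlet-type wavelet sampled at unit spacing."""
    n = n or int(10 * scale)
    t = np.arange(-n // 2, n // 2) / scale
    return np.exp(1j * w0 * t) * np.exp(-0.5 * t ** 2) / np.sqrt(scale)

def smoothed_wavelet_periodogram(counts, scales, smooth_len=50):
    box = np.ones(smooth_len) / smooth_len
    out = []
    for s in scales:
        w = fftconvolve(counts - counts.mean(), morlet(s), mode="same")
        out.append(fftconvolve(np.abs(w) ** 2, box, mode="same"))  # time smoothing
    return np.array(out)   # (n_scales, n_time) scalogram estimate
```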
... Put differently, we want the estimator to optimally separate neighboring peaks from each other, as well as from the noise. These random instances of f at delays other than τ contribute to the background spectrum according to the power spectrum of the rate function of the emitting point process, λ(ω), combined with that of the feature, giving (39). If the point process is homogeneous, λ is equal to the constant squared intensity [4]. The same argument applies for any WSS noise, whether Gaussian or not; the denominator always contains the combined total power of the signal and all other additive processes contributing to the error in the peak location. ...
Article
Full-text available
Like the ordinary power spectrum, higher-order spectra (HOS) describe signal properties that are invariant under translations in time. Unlike the power spectrum, HOS retain phase information from which details of the signal waveform can be recovered. Here we consider the problem of identifying multiple unknown transient waveforms which recur within an ensemble of records at mutually random delays. We develop a new technique for recovering filters from HOS whose performance in waveform detection approaches that of an optimal matched filter, requiring no prior information about the waveforms. Unlike previous techniques of signal identification through HOS, the method applies equally well to signals with deterministic and non-deterministic HOS. In the non-deterministic case, it yields an additive decomposition, introducing a new approach to the separation of component processes within non-Gaussian signals having non-deterministic higher moments. We show a close relationship to minimum-entropy blind deconvolution (MED), which the present technique improves upon by avoiding the need for numerical optimization, while requiring only numerically stable operations of time shift, element-wise multiplication and averaging, making it particularly suited for real-time applications. The application of HOS decomposition to real-world signals is demonstrated with blind denoising, detection and classification of normal and abnormal heartbeats in electrocardiograms.
... We remark that although spectral techniques have become a prominent tool for the analysis of time series data and certain advantages exist, these techniques have not yet been widely studied or applied to spatial point processes, and the number of methodological and applied contributions remains limited. The content presented here can be understood as a straightforward extension of the spectral analysis of point events in the temporal domain, as described by Bartlett (1963) and Brillinger (1972), to the two-dimensional spatial case first presented by Bartlett (1964). Further contributions to the spectral analysis of spatial point processes can be found in the papers of Mugglestone (1990), Mugglestone and Renshaw (1996a,b, 2001), Renshaw (1997, 2002), Renshaw and Ford (1983, 1984) and Saura and Mateu (2006), which serve as fundamental references in this section. ...
Preprint
Full-text available
This paper is concerned with the joint analysis of multivariate mixed-type spatial data, where some components are point processes and some are of lattice-type by nature. After a survey of statistical methods for marked spatial point and lattice processes, the class of multivariate spatial hybrid processes is defined and embedded within the framework of spatial dependence graph models. In this model, the point and lattice sub-processes are identified with nodes of a graph whereas missing edges represent conditional independence among the components. This finally leads to a general framework for any type of spatial data in a multivariate setting. We demonstrate the application of our method in the analysis of a multivariate point-lattice pattern on crime and ambulance service call-out incidents recorded in London, where the points are the locations of different pre-classified crime events and the lattice components report different aggregated incident rates at ward level.
... Figure 5 illustrates the results of the disentanglement in the frequency domain. Namely, here we present spectra of point processes (Bartlett measure) [43]. As expected, spectral peaks induced by respiration are enhanced in the R-component and suppressed in the NR-component. ...
Preprint
Full-text available
We develop a technique for the multivariate data analysis of perturbed self-sustained oscillators. The approach is based on the reconstruction of the phase dynamics model from observations and on a subsequent exploration of this model. For the system, driven by several inputs, we suggest a dynamical disentanglement procedure, allowing us to reconstruct the variability of the system's output that is due to a particular observed input, or, alternatively, to reconstruct the variability which is caused by all the inputs except for the observed one. We focus on the application of the method to the vagal component of the heart rate variability caused by a respiratory influence. We develop an algorithm that extracts purely respiratory-related variability, using a respiratory trace and times of R-peaks in the electrocardiogram. The algorithm can be applied to other systems where the observed bivariate data can be represented as a point process and a slow continuous signal, e.g. for the analysis of neuronal spiking.
... Put differently, we want the estimator to optimally separate neighboring peaks from each other, as well as from the noise. These random instances of f at delays other than τ contribute to the background spectrum according to the power spectrum of the rate function of the emitting point process, λ(ω), combined with that of the feature, giving (39). If the point process is homogeneous, λ is equal to the constant squared intensity [4]. The same argument applies for any WSS noise, whether Gaussian or not; the denominator always contains the combined total power of the signal and all other additive processes contributing to the error in the peak location. ...
Preprint
Full-text available
Like the ordinary power spectrum, higher-order spectra (HOS) describe signal properties that are invariant under translations in time. Unlike the power spectrum, HOS retain phase information from which details of the signal waveform can be recovered. Here we consider the problem of identifying multiple unknown transient waveforms which recur within an ensemble of records at mutually random delays. We develop a new technique for recovering filters from HOS whose performance in waveform detection approaches that of an optimal matched filter, requiring no prior information about the waveforms. Unlike previous techniques of signal identification through HOS, the method applies equally well to signals with deterministic and non-deterministic HOS. In the non-deterministic case, it yields an additive decomposition, introducing a new approach to the separation of component processes within non-Gaussian signals having non-deterministic higher moments. We show a close relationship to minimum-entropy blind deconvolution (MED), which the present technique improves upon by avoiding the need for numerical optimization, while requiring only numerically stable operations of time shift, element-wise multiplication and averaging, making it particularly suited for real-time applications. The application of HOS decomposition to real-world signals is demonstrated with blind denoising, detection and classification of normal and abnormal heartbeats in electrocardiograms.
... L: Yes, I believe the first discussion of Monte Carlo tests says, "suppose you could repeat this ten times..." (Barnard, 1963, in discussion of Bartlett, 1963). I believe Julian Besag in 1975 points out that MCMC could work, but seemed impractical (Besag, 1975). ...
Article
Thomas A. Louis received his BA in Mathematics from Dartmouth College in 1966 and his Ph.D. in Mathematical Statistics from Columbia University in 1972. He served as a NIH Postdoctoral Fellow at Imperial College, London, from 1972-1973 and has held faculty positions at Boston University, Harvard School of Public Health, the University of Minnesota and the Bloomberg School of Public Health at Johns Hopkins University. In addition, he served as a Senior Statistical Scientist at the RAND Corporation, and as Associate Director for Research and Methodology and Chief Scientist at the U.S. Census Bureau. Tom has served as President of both the Eastern North American Region of the International Biometric Society and as President of the International Biometric Society. He is a Fellow of the American Statistical Association, the American Association for the Advancement of Science and the Institute of Mathematical Statistics. As of January 2018, Tom is Emeritus Professor, Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University. In addition to his many statistical accomplishments, Tom is a strong advocate for professional development and a life-long lover of time on the water.
... Cox [32] introduced the idea of the doubly stochastic Poisson process, and it was studied in detail by Bartlett [10]. A doubly stochastic Poisson process is a generalization of a Poisson process where the arrival rate itself is a stochastic process. ...
Thesis
Most manufacturing models to date have assumed independence of all random variables in the system. In practice, autocorrelation effects are present in production-line time series. In this thesis, we extend this literature by studying autocorrelation in machine times to failure in detail. Our work focuses on the practical aspects of detecting and modelling autocorrelated uptimes, as well as including them in simulations. We apply a practical procedure to detect autocorrelation in uptimes. The procedure has very mild assumptions and compensates for the number of machines it is applied to, ensuring that the probability of a Type I error is kept low. We then provide two ways to model autocorrelated times to failure. The first is to use ARMA models including GARCH terms. We also provide a method based on the Markov-Modulated Poisson Process, a special case of the Markov Arrival Process. For both methods discussed above, we provide diagnostic plots and a quantitative way to select the most appropriate model for a given series of uptimes. This allows us to automatically select an appropriate model. Finally, to enable Ford to use our methods in simulation, we provide a way to generate simulated uptimes from each of our models.
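As one concrete way to implement the kind of autocorrelation screening described (not necessarily the thesis's own procedure), a Ljung-Box test can be applied to each machine's uptime series, with a Bonferroni correction across machines to keep the family-wise Type I error low:

```python
# Hedged sketch: flag machines whose uptime series show autocorrelation.
import numpy as np
from statsmodels.stats.diagnostic import acorr_ljungbox

def flag_autocorrelated(uptime_series_list, lags=10, alpha=0.05):
    m = len(uptime_series_list)
    flags = []
    for x in uptime_series_list:
        res = acorr_ljungbox(np.asarray(x), lags=[lags], return_df=True)
        pval = float(res["lb_pvalue"].iloc[0])
        flags.append(pval < alpha / m)      # Bonferroni across the m machines
    return flags
```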
Article
Measuring the influence of scientists and their activities on science and society is important and indeed essential for many studies. Despite substantial efforts devoted to exploring measures and patterns of the influence of an individual scientific enterprise, it remains unclear how to quantify the mutual impact of multiple scientific activities. This work quantifies the relationship between scientists' interactive activities and their influences with different patterns in the AMiner dataset. Specifically, inflation treatment and field normalization are introduced to process the big data of paper citations as the scientist's influence, and then the evolution of the influence is investigated for scientific activities in the citation and cooperation patterns through the Hawkes process. The results show that elite scientists have higher individual and interaction influences than ordinary scientists in all patterns found in the study, with permutation tests verifying the significance of the new findings. Moreover, the study compares the patterns found in the two largest disciplines, i.e., STEM and Humanities, revealing the higher value of individual influence in STEM than in Humanities. Furthermore, it is found that the opposite trend of STEM and Humanities in the cooperation pattern suggests different cooperation habits of scientists in different disciplines. Overall, this investigation provides a feasible approach to addressing the scientific influence issue and deepening the quantitative understanding of the mutual influence of multiple scientific activities in science and society.
Article
This article proposes two exploratory methods for analyzing event patterns. Events in this article refer to zero‐dimensional objects in the spatiotemporal dimension, such as crimes, earthquakes, and traffic accidents. One method detects the peaks in event patterns, evaluates the degree of event concentration at the peaks, and visualizes its spatial variation. Another method evaluates the similarity between different event patterns and visualizes its spatial variation. The methods help us understand events' properties, consider their underlying mechanisms, and permit us to prevent events if they represent undesirable phenomena such as crimes and traffic accidents. The proposed methods are applied to analyze the population distribution in the central area of Tokyo in May 2019. The application revealed the spatial variation of population peaks in this area and the differences in population patterns between different types of days.
Article
A spectral theory for constituents of macroscopically homogeneous random microstructures is developed and provided with a sound mathematical basis, where the spectrum obtained by Fourier methods corresponds to the angular intensity distribution of X-rays scattered by this constituent. It is shown that the fast Fourier transform applied to three-dimensional images of microstructures obtained by micro-tomography is a powerful tool of image processing. The applicability of this technique is demonstrated in the analysis of images of porous media.
Chapter
With essential background and core concepts outlined in Chap. 2, we now turn to discussing Hawkes processes, including their useful immigration–birth representation and briefly touching on generalisations.
Article
We explore the vibration isolation requirements imposed by Newtonian noise on the cryogenic shielding for next-generation gravitational-wave observatories relying on radiative cooling. Two sources of Newtonian noise from the shield arrays are analyzed: the tidal coupling from the motion of a shield segment to its nearby test mass’s displacement, and the effect of density fluctuations due to heterogeneous boiling of cryogenic liquids in the vicinity of the test masses. It was determined that the outer shields require no additional vibration isolation from ground motion to mitigate the Newtonian noise coupling to levels compatible with the LIGO Voyager design. Additionally, it was determined that the use of boiling nitrogen as the heat sink for the cryogenic arrays is unlikely to create enough Newtonian noise to compromise the detector performance for either Voyager or Cosmic Explorer phase 2. However, the inherent periodicity of the nucleation cycle might acoustically excite structural modes of the cryogenic array which could contaminate the signals from the interferometer through other means. This last effect could be circumvented by using a single-phase coolant to absorb the heat from the cryogenic shields.
Article
Full-text available
Wavelets provide the flexibility to analyse stochastic processes at different scales. Here, we apply them to multivariate point processes as a means of detecting and analysing unknown non-stationarity, both within and across data streams. To provide statistical tractability, a temporally smoothed wavelet periodogram is developed and shown to be equivalent to a multi-wavelet periodogram. Under a stationary assumption, the distribution of the temporally smoothed wavelet periodogram is demonstrated to be asymptotically Wishart, with the centrality matrix and degrees of freedom readily computable from the multi-wavelet formulation. Distributional results extend to wavelet coherence; a time-scale measure of inter-process correlation. This statistical framework is used to construct a test for stationarity in multivariate point-processes. The methodology is applied to neural spike train data, where it is shown to detect and characterize time-varying dependency patterns.
Article
The nonhomogeneous Poisson process (NHPP) and the renewal process (RP) are two stochastic point process models that are commonly used to describe the pattern of repeated occurrence data. An inhomogeneous Gamma process (IGP) is a point process model that generalizes both the NHPP and a particular RP, commonly referred to as a Gamma renewal process (GRP), which has interarrival times that are independent and identically distributed gamma random variables with unit scale parameter and shape parameter K > 0. This paper focuses on a particular class of the IGP which has a periodic or almost periodic baseline intensity function and a shape parameter K ∈ ℕ. This model deals with point events that show a pattern of periodicity or almost periodicity. Consistent estimators of unknown parameters are constructed mainly by the Bartlett periodogram. Simulation results that support theoretical findings are provided.
Preprint
This paper contributes to the multivariate analysis of marked spatio-temporal point process data by introducing different partial point characteristics and extending the spatial dependence graph model formalism. Our approach yields a unified framework for different types of spatio-temporal data including both, purely qualitatively (multivariate) cases and multivariate cases with additional quantitative marks. The proposed graphical model is defined through partial spectral density characteristics, it is highly computationally efficient and reflects the conditional similarity among sets of spatio-temporal sub-processes of either points or marked points with identical discrete marks. The paper considers three applications, two on crime data and a third one on forestry.
Article
The second-order statistical properties of time point processes (PP) are described by the time coincidence function (CF) and the frequency Bartlett spectrum (BS). For PPs recorded by pulses appearing at random time instants, as in photodetection experiments, the CF can be measured by various physical devices, showing in particular the famous bunching effect of photons. On the other hand, for PPs recorded by the intervals between successive points (lifetimes), especially for renewal processes, there is no usual procedure for the estimation of the CF, and the aim of this paper is to describe an approach to this problem. The starting point is a mathematical relation between the CF and the set of probability density functions of the lifetimes of any order of the PP. As a consequence, the CF can be obtained by processing the results of the multiple normalized histograms of these lifetimes. In the relatively rare cases where the mathematical expression of the CF is known in closed form, the correct behavior of the procedure is verified by an experimental analysis of simulated data. The method is extended in order to verify the relationship between the CF and the BS.
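The histogram idea described in this abstract can be sketched directly: for a renewal process, the coincidence function at lag τ involves the sum over k of the densities of k-fold sums of lifetimes, so pooling normalised histograms of the k-th order gaps t_{i+k} − t_i estimates it (bin width and maximum order are illustrative choices, and edge effects are ignored).

```python
# Hedged sketch: estimate a coincidence/renewal density from pooled
# histograms of k-th order lifetimes of a sorted event-time array.
import numpy as np

def coincidence_function(times, max_lag, bin_width, max_order=50):
    edges = np.arange(0.0, max_lag + bin_width, bin_width)
    counts = np.zeros(len(edges) - 1)
    for k in range(1, max_order + 1):
        gaps = times[k:] - times[:-k]            # k-th order lifetimes
        counts += np.histogram(gaps, bins=edges)[0]
    # normalise by the number of reference points and the bin width
    return counts / (len(times) * bin_width), edges
```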
Article
Full-text available
Exploiting the fact that most arrival processes exhibit cyclic behaviour, we propose a simple procedure for estimating the intensity of a nonhomogeneous Poisson process. The estimator is the super-resolution analogue to Shao (2010) and Shao and Lii [J. R. Stat. Soc. Ser. B. Stat. Methodol. 73 (2011) 99–122], which is a sum of p sinusoids where p and the amplitude and phase of each wave are not known and need to be estimated. This results in an interpretable yet flexible specification that is suitable for use in modelling as well as in high resolution simulations. Our estimation procedure sits in between classic periodogram methods and atomic/total variation norm thresholding. Through a novel use of window functions in the point process domain, our approach attains super-resolution without semidefinite programming. Under suitable conditions, finite sample guarantees can be derived for our procedure. These resolve some open questions and expand existing results in the spectral estimation literature.
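The classical periodogram step that such cyclic-intensity estimators refine can be sketched as follows; the grid search below locates the dominant frequency only, without the paper's windowing and super-resolution machinery.

```python
# Hedged sketch: locate the dominant cycle frequency of a periodic Poisson
# intensity by maximising the point-process periodogram over a grid.
import numpy as np

def dominant_frequency(times, T, n_freq=2000, f_max=5.0):
    freqs = np.linspace(1.0 / T, f_max, n_freq)
    phases = np.exp(-2j * np.pi * freqs[:, None] * times[None, :])
    I = np.abs(phases.sum(axis=1)) ** 2          # unnormalised periodogram
    return freqs[np.argmax(I)]

# Usage: f_hat = dominant_frequency(times, T) for event times on [0, T].
```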
Conference Paper
In neuroscience, it is of key importance to assess how neurons interact with each other, as evidenced via their firing patterns and rates. We here introduce a method of smoothing the wavelet periodogram (scalogram) in order to reduce variance in spectral estimates and allow analysis of time-varying dependency between neurons at different scale levels. Previously, such smoothing methods have only received analysis in the setting of regular real-valued (Gaussian) time series. However, in the context of neuron firing, observations may be modelled as a point process which, when binned or aggregated, gives rise to an integer-valued time series. In this paper we propose an analytical asymptotic distribution for the smoothed wavelet spectra, and then contrast this, via synthetic experiments, with the finite-sample behaviour of the spectral estimator. We generally find good alignment with the asymptotic distribution; however, this may break down if the level of smoothing or the scale under analysis is very small. To conclude, we demonstrate how the spectral estimator can be used to characterize real neuron-firing dependency, and how such relationships vary over time and scale.
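A self-contained sketch of the pipeline studied here may be useful: spikes are binned into an integer-valued series, a complex Morlet transform produces the scalogram, and the squared moduli are smoothed along time. The wavelet parameters, bin width, and smoothing span are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative spike train: Poisson spikes whose rate oscillates at 2 Hz.
fs, T = 200.0, 30.0                           # bins/s and record length (s)
t = np.arange(int(fs * T)) / fs
counts = rng.poisson(20.0 * (1 + 0.8 * np.sin(2 * np.pi * 2.0 * t)) / fs)

def morlet(scale, w0=6.0):
    """Complex Morlet wavelet at a given scale (seconds), sampled at fs."""
    tt = np.arange(-4 * scale, 4 * scale, 1 / fs)
    return np.exp(1j * w0 * tt / scale - (tt / scale) ** 2 / 2) / np.sqrt(scale)

scales = [0.2, 0.5, 1.0, 2.0]                 # assumed scales (seconds)
x = counts - counts.mean()
scalogram = np.array([np.abs(np.convolve(x, morlet(s), mode="same")) ** 2
                      for s in scales])

# Smoothing the raw (chi-squared-like) scalogram along time reduces its
# variance, at the price of the finite-sample effects the paper analyses
# when the span or the scale is small.
m = 51
smoothed = np.array([np.convolve(row, np.ones(m) / m, mode="same")
                     for row in scalogram])
print(smoothed.shape)                         # (n_scales, n_bins)
```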
Article
In recent years, with the rapid development of the mobile app ecosystem, the number and categories of mobile apps have grown tremendously. However, the global prevalence of mobile apps also leads to fierce competition, and as a result many apps disappear. To thrive in this competitive app market, it is vital for app developers to understand the popularity evolution of their mobile apps and to inform strategic decision-making for better mobile app development. It is therefore significant and necessary to model and forecast the future popularity evolution of mobile apps. The popularity evolution of mobile apps is usually a long-term process, affected by various complex factors; however, existing works lack the capability to model such factors. To better understand popularity evolution, in this paper we forecast the popularity evolution of mobile apps by incorporating complex factors, i.e., exogenous stimuli and endogenous excitations. Specifically, we propose a model based on the Multivariate Hawkes Process (MHP), a self-exciting point process driven by exogenous stimuli, to model the exogenous stimuli and endogenous excitations simultaneously. Extensive experimental studies on a real-world dataset from an app store demonstrate that the MHP outperforms state-of-the-art methods in popularity evolution forecasting.
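The general shape of the model class used here (though not the paper's fitted specification) is easy to write down: a multivariate Hawkes conditional intensity with an exogenous baseline and exponential mutual excitation. All parameter values below are invented.

```python
import numpy as np

def mhp_intensity(t, events, mu, alpha, beta):
    """Conditional intensity lambda_i(t) of a multivariate Hawkes process:

        lambda_i(t) = mu_i + sum_j sum_{t_k in events[j], t_k < t}
                      alpha[i, j] * beta * exp(-beta * (t - t_k))

    mu models the exogenous stimuli, alpha the endogenous excitations.
    """
    lam = mu.astype(float).copy()
    for j, ts in enumerate(events):
        past = ts[ts < t]
        lam += alpha[:, j] * (beta * np.exp(-beta * (t - past))).sum()
    return lam

# Invented example: two event streams (e.g. downloads and reviews).
mu = np.array([0.5, 0.3])
alpha = np.array([[0.3, 0.1],
                  [0.2, 0.2]])   # spectral radius < 1, so the process is stable
beta = 1.5
events = [np.array([0.4, 1.1, 2.0]), np.array([0.9, 1.7])]
print(mhp_intensity(2.5, events, mu, alpha, beta))
```

Forecasting popularity then amounts to estimating mu, alpha, and beta from historical event data and propagating the fitted intensity forward.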
Article
Summary: The sequence of intervals between events in a stationary point process is considered as a process in continuous time. Expressions are given for a covariance function and the corresponding spectrum. Some examples are given to illustrate that the properties of the intervals as a process in continuous time are not the same as those of the intervals as a sequence. The methods developed are used to analyse a process consisting of times of R peaks in an electrocardiogram record. A remarkable similarity between this analysis and one based on the counting process is noted.
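A small simulation makes the abstract's distinction concrete: with i.i.d. intervals, the interval sequence is serially uncorrelated, yet the interval length viewed as a step function of continuous time is positively autocorrelated, because long intervals occupy more time. The gamma intervals and grid step below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

intervals = rng.gamma(0.5, 2.0, size=100_000)   # i.i.d., mean 1, overdispersed
events = np.cumsum(intervals)

# (a) Intervals as a sequence: lag-1 serial correlation (near 0 here).
r_seq = np.corrcoef(intervals[:-1], intervals[1:])[0, 1]

# (b) Intervals as a process in continuous time: X(t) = length of the
# interval covering t, sampled on a fine grid, correlated at lag 1.
grid = np.arange(0.0, events[-2], 0.05)
Xt = intervals[np.searchsorted(events, grid, side="right")]
lag = 20                                         # 20 * 0.05 = 1 time unit
r_cont = np.corrcoef(Xt[:-lag], Xt[lag:])[0, 1]
print(f"sequence lag-1 r = {r_seq:.3f}; continuous-time r at lag 1 = {r_cont:.3f}")
```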
Article
The properties of non‐homogeneous branching Poisson processes are derived, using some particular conditional distributional results for the non‐homogeneous Poisson process. The properties derived include the Laplace‐Stieltjes transforms of the cumulant generating functions of the number of events by time x and of the state of the process at time x.
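To fix notation, a standard form of the result for Poisson cluster processes (in the style of Daley and Vere-Jones, with symbols that are ours rather than the paper's): if cluster centres arrive as a nonhomogeneous Poisson process with rate λ(u), the count N(0, x] has probability generating function

\mathbb{E}\!\left[ z^{N(0,x]} \right]
  = \exp\!\left\{ \int \lambda(u)\,\bigl( g_u(z;x) - 1 \bigr)\, du \right\},
  \qquad |z| \le 1,

where g_u(z; x) is the generating function of the number of points in (0, x] contributed by a cluster centred at u. Taking logarithms gives the cumulant generating function, whose Laplace–Stieltjes transforms in x are the objects derived in the paper.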
Article
Past studies have shown that crime events are often clustered. This study proposes a spatiotemporal Hawkes‐type point process model, which includes a background component with daily and weekly periodicity, and a clustering component that is triggered by previous events. We generalize the non‐parametric stochastic reconstruction method so that we can estimate each component in the background rate and the triggering response that appears in the model conditional intensity: the background rate includes a daily and a weekly periodicity, a separable spatial component and a long‐term background trend. Two relaxation coefficients are introduced to stabilize and secure the estimation process. This model is used to describe the occurrences of violence or robbery cases in Castellón, Spain, over two years. The results show that robbery crime is highly influenced by daily life rhythms, revealed by its daily and weekly periodicity, and that about 3% of such crimes can be explained by clustering. Further diagnostic analysis shows that the model could be improved by considering the following ingredients: the daily occurrence patterns differ between weekends and working days; and, in the city centre, robbery activity shows different temporal patterns, in both weekly periodicity and long‐term trend, from other suburban areas.
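A compact sketch of how such a model can be simulated (Ogata-style thinning with a daily and weekly periodic background and exponential triggering) is given below; the periods are real-world constants, but every parameter value and name is an invented illustration, not the fitted Castellón model.

```python
import numpy as np

rng = np.random.default_rng(5)

# Invented parameters: background with 24 h and 168 h cycles; exponential
# triggering kernel a*exp(-b*s) with branching ratio a/b = 0.03.
a, b = 0.015, 0.5
T = 24.0 * 7 * 8                              # eight weeks, in hours

def mu(t):
    return 0.2 * (1 + 0.5 * np.cos(2 * np.pi * t / 24)
                    + 0.3 * np.cos(2 * np.pi * t / 168))

def lam(t, events):
    return mu(t) + (a * np.exp(-b * (t - events[events < t]))).sum()

# Ogata thinning: between events the excitation only decays, so sup(mu)
# plus the excitation at the current time bounds the future intensity.
mu_max = 0.2 * (1 + 0.5 + 0.3)
events, t = np.array([]), 0.0
while t < T:
    bound = mu_max + (a * np.exp(-b * (t - events))).sum()
    t += rng.exponential(1.0 / bound)
    if t < T and rng.uniform() < lam(t, events) / bound:
        events = np.append(events, t)
print(len(events), "events over", int(T), "hours")
```

With a branching ratio of 0.03, about 3% of simulated events are offspring of earlier ones, mirroring the clustering fraction the paper reports.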
Book
Intended for a second course in stationary processes, Stationary Stochastic Processes: Theory and Applications presents the theory behind the field’s widely scattered applications in engineering and science. In addition, it reviews sample function properties and spectral representations for stationary processes and fields, including a portion on stationary point processes. Features:
• Presents and illustrates the fundamental correlation and spectral methods for stochastic processes and random fields
• Explains how the basic theory is used in special applications like detection theory and signal processing, spatial statistics, and reliability
• Motivates mathematical theory from a statistical model-building viewpoint
• Introduces a selection of special topics, including extreme value theory, filter theory, long-range dependence, and point processes
• Provides more than 100 exercises with hints to solutions and selected full solutions
The book covers key topics such as ergodicity, crossing problems, and extremes, and includes many exercises and examples to illustrate the theory. Precise in mathematical details without being pedantic, Stationary Stochastic Processes: Theory and Applications is for the student with some experience with stochastic processes and a desire for deeper understanding without getting bogged down in abstract mathematics.
Article
Full-text available
In economic and financial time series we sometimes observe sudden and large price jumps. Although these events are relatively rare, they have significant impacts not only on a given financial market but also on several other markets and the wider macroeconomy. Using simultaneous Hawkes-type multivariate point process (SHPP) models, it is possible to analyze the causal effects of large events in the sense of Granger non-causality (GNC) from one market to other markets, as well as instantaneous Granger non-causality (IGNC). We investigate the Tokyo financial market and other major markets, and apply GNC tests to investigate the interdependence of large events among markets. Several important empirical findings emerge concerning financial markets and the wider macroeconomy.
Article
Suppose events, planned as regular, in fact occur at instants displaced from the planned arithmetic progression by independent random deviations. Assume these deviations have a common continuous distribution, with variance so large that the realized order of the events differs from the planned order. The paper discusses statistical properties of the time intervals between such events, and compares some of the theoretical formulae with some known empirical results.
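A quick simulation of this setting, with assumed values, shows both the reordering and the behaviour of the realised intervals:

```python
import numpy as np

rng = np.random.default_rng(6)

# Planned spacing delta; deviations with sd sigma much larger than delta.
delta, n, sigma = 1.0, 100_000, 3.0
disp = delta * np.arange(n) + rng.normal(0.0, sigma, size=n)

swap_frac = np.mean(np.diff(disp) < 0)        # adjacent planned pairs swapped
gaps = np.diff(np.sort(disp))                 # realised inter-event intervals
print(f"adjacent pairs out of planned order: {swap_frac:.2f}")
print(f"realised intervals: mean {gaps.mean():.4f}, sd {gaps.std():.4f}")
```

When sigma >> delta the realised intervals look nearly exponential (mean and standard deviation both close to delta); locally the displaced process resembles a Poisson process.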
Article
The paper is concerned with the lengths of intervals in a stationary point process. Relations are given between the various probability functions, and moments are considered. Two different random variables are introduced for the lengths of intervals, according to whether the measurement is made from an arbitrary event or beginning at an arbitrary time, and their properties are compared. In particular, new properties are derived for the correlation coefficients between the lengths of successive intervals. Examples are given. A theorem is proved, giving conditions under which two independent stationary point processes with independent intervals may be superposed, giving a new point process which also has independent intervals. Mention is made of the application to the theory of binary random processes and to the zeros of a Gaussian process.
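The superposition theorem mentioned here is easy to probe numerically: pooling two independent renewal processes with i.i.d. intervals generally does not give independent intervals, as the lag-1 serial correlation of the pooled gaps shows. The Erlang intervals below are an arbitrary illustration.

```python
import numpy as np

rng = np.random.default_rng(7)

n = 200_000
t1 = np.cumsum(rng.gamma(4.0, 1.0, size=n))   # Erlang(4) renewal process
t2 = np.cumsum(rng.gamma(4.0, 1.0, size=n))   # an independent copy
gaps = np.diff(np.sort(np.concatenate([t1, t2])))

r1 = np.corrcoef(gaps[:-1], gaps[1:])[0, 1]
print(f"lag-1 interval correlation of the superposition: {r1:.3f}")
# Non-zero in general: the Poisson case, where independence survives
# superposition, is special.
```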
Article
It is frequently useful to test, on the basis of life test data, whether or not one is justified in assuming that the underlying distribution of life is exponential. This paper, which appears in two parts, describes a number of graphical and analytical procedures for testing this assumption. Part I of the paper contains descriptions of the mathematical and graphical procedures. Part II contains several worked examples.
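One of the simplest graphical checks in this spirit (offered as an assumed illustration, not necessarily one of the paper's procedures) is the exponential probability plot: ordered lifetimes against exponential quantiles should be roughly linear through the origin.

```python
import numpy as np

rng = np.random.default_rng(8)

life = np.sort(rng.exponential(2.0, size=200))        # illustrative life data
p = (np.arange(1, life.size + 1) - 0.5) / life.size
q = -np.log(1.0 - p)                                  # unit-exponential quantiles

# Under exponentiality, (q_i, x_(i)) lie near a line through the origin
# whose slope estimates the mean life; large residuals cast doubt on it.
slope = (q @ life) / (q @ q)
resid = life - slope * q
print(f"mean-life estimate from slope: {slope:.3f}")
print(f"max |residual| / slope: {np.abs(resid).max() / slope:.3f}")
```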
Article
The paper deals with a number of problems of statistical analysis connected with events occurring haphazardly in space or time. The topics discussed include: tests of randomness, components of variance, the correlation between events of different types, and a modification of the snap‐round method used in operational research.
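A standard test of randomness for events in time, given here as an assumed illustration rather than the paper's own method: conditional on their number, the events of a homogeneous Poisson process on (0, T] are i.i.d. uniform, so a Kolmogorov–Smirnov test against the uniform distribution applies directly.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)

T = 100.0
times = np.sort(rng.uniform(0, T, size=rng.poisson(50)))  # illustrative data

# Conditional-uniformity test of complete randomness.
stat, pval = stats.kstest(times / T, "uniform")
print(f"KS statistic {stat:.3f}, p-value {pval:.3f}")
```

A small p-value indicates departure from complete randomness, whether through trend, clustering, or inhibition.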
Article
The term 'point processes', referring to stochastic processes in which events occur at more or less irregular intervals and which are represented by points on the time-axis, is of comparatively recent origin, although the existence of such processes has in fact been well known for a long time. They have been discussed fairly extensively in such diverse applications as the counting of radioactive impulses, telephone calls and cases of contagious diseases. Wold (1949) developed a statistical theory for treating processes of this type, and also mentioned briefly how the events could take place in a two-dimensional or higher field. Such a generalization, from events with no time extension to those with no 'space' extension (i.e. specifically of a point character), has a suitable field of application in the ecological study of the distributional pattern of plants. If we can assume to a first approximation that the plants have the dimensions of a point, then we shall see that it is possible to discuss precisely probability relationships between the numbers of plants in different areas of the region under investigation.

The main aims of quantitative ecology are the precise description of a community of plants, with interpretations in terms of the biology of the species, and the correlation of vegetational and environmental data, and ecologists have used several methods in an attempt to achieve these aims. In most of the initial work on field sampling for ecological data, the procedure was to take 'quadrats' (sample areas small in relation to the total area of the region) scattered at random over the area, and study statistics derived from the frequency distribution of the numbers of plants per quadrat. While this approach is useful to some extent, in that any given type of distribution function may be fitted to the data, it does not necessarily furnish the kind of information required by an ecologist. It will not give any evidence of trends, or indicate the pattern of the distribution over the area or the way in which this pattern may have arisen, all factors of prime importance in the study of the structure of a plant community. We only have to cite the negative binomial distribution, which is known to arise in at least four different ways, all based on widely differing assumptions, to illustrate this point.

In recent years ecologists have become aware of the need for a more satisfactory approach to the problem, and Greig-Smith (1952) provided a potentially great advance on the statistical side when he recommended the use of a grid of contiguous quadrats over some portion or portions of the region. The advantage, of course, in arranging the quadrats in a grid is that the analysis of variance technique may be employed, either for the detection of trends, or, more importantly, for the detection of a mosaic variation in density (due to ecological causes connected with the spread of the plants) by a 'nested sampling' type of analysis of variance, associating the quadrats into successively larger blocks and comparing the component block variances. The details and applications of this method are described at length by Greig-Smith, together with the results from sampling experiments on artificial
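The nested block-variance idea described above can be sketched in a few lines; for brevity this uses a one-dimensional transect of contiguous quadrats and invented patchy counts, not Greig-Smith's grid data. A peak in the mean square against block size points to the scale of the mosaic.

```python
import numpy as np

rng = np.random.default_rng(10)

# Invented counts: 256 contiguous quadrats with patchiness at scale 8.
n = 256
patch = np.repeat(rng.gamma(2.0, 1.0, size=n // 8), 8)
counts = rng.poisson(2.0 * patch)

# Nested analysis: at each block size s, the mean square between the two
# size-s halves of each size-2s block, normalised per quadrat.
s = 1
while s < n:
    halves = counts.reshape(-1, s).sum(axis=1).reshape(-1, 2)
    ms = ((halves[:, 0] - halves[:, 1]) ** 2 / 2).mean() / s
    print(f"block size {s:3d}: mean square {ms:6.2f}")
    s *= 2
```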