
Statistical Analysis

The estimation of the intensity $\lambda$ is the most fundamental question in the statistical analysis of homogeneous Poisson processes. Several approaches are based on observing a sample of X in a certain (bounded) observation window $B \subset \RL^2$. One of these approaches considers the number of points lying in each of n randomly or systematically located sampling squares (or other subregions) of equal area, say $a^2$. Then, for observed counts $k_1,\ldots,k_n$, a natural estimator for $\lambda$ is given by $\hat{\lambda} = \sum_{i=1}^n k_i/(na^2)$. For n large and $a^2$ small, another estimate of $\lambda$ can be obtained by computing the fraction $\hat{p}$ of empty squares and solving the equation $\hat{p} = \exp(-\lambda a^2)$ for $\lambda$. We remark that further estimators for $\lambda$ have been proposed in the literature which are based on measuring the distances from sampling points to certain neighboring points of the Poisson process; see e.g. Diggle (1983, 1996), Ripley (1981).
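As a sketch, both intensity estimators can be illustrated on simulated data. All numerical values below (intensity, window size, number and side length of the sampling squares) are illustrative assumptions, not taken from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a homogeneous Poisson process with intensity lam on a square
# window [0, side]^2 (lam and side are illustrative choices).
lam, side = 2.0, 10.0
pts = rng.uniform(0.0, side, size=(rng.poisson(lam * side**2), 2))

# n randomly located sampling squares of side length a (area a^2).
n, a = 100, 1.0
lower = rng.uniform(0.0, side - a, size=(n, 2))
counts = np.array(
    [np.sum(np.all((pts >= lo) & (pts < lo + a), axis=1)) for lo in lower]
)

# Count-based estimator: lambda_hat = sum_i k_i / (n a^2).
lam_hat_count = counts.sum() / (n * a**2)

# Empty-square estimator: solve p_hat = exp(-lambda a^2) for lambda.
p_hat = np.mean(counts == 0)
lam_hat_empty = -np.log(p_hat) / a**2

print(lam_hat_count, lam_hat_empty)
```

Both estimates should be close to the true intensity 2.0; the empty-square estimator is typically more variable, since it discards the actual counts in the occupied squares.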

A more fundamental statistical question is to test the hypothesis that a given planar point pattern is a realization of a homogeneous Poisson process. A large number of such tests exist, based on different properties of the Poisson process; see Stoyan et al. (1995). As with the methods for estimating the intensity $\lambda$, one can in principle distinguish between tests based on measuring distances between points and tests based on counting points. The basis of tests using distance methods is (2.3), that is, the fact that the spherical contact distribution function H(r) of a homogeneous Poisson process is equal to its nearest-neighbor distance distribution function D(r). In this connection, one has to consider suitable estimators for H(r) and D(r). A simple unbiased estimator for H(r) is given by  
 \begin{displaymath}
\hat{H}^{(1)}(r) = \frac{\nu\left( (B \ominus b(o,r)) \cap 
\bigcup_n b(X_n,r) \right)}{\nu \left( B \ominus b(o,r)\right)}\end{displaymath} (4)
for $0\leq r \leq \frac{1}{2} \mbox{diam}(B)$, where $\bfind(A)$ denotes the indicator function and b(x,r) the circle with center x and radius r. The symbol $\ominus$ means Minkowski subtraction, i.e.,

\begin{displaymath}
B \ominus b(o,r) = \bigcap_{x \in b(o,r)} (B+x).\end{displaymath}

The estimator $\hat{H}^{(1)}(r)$ and the set $B \ominus b(o,r)$ are illustrated in Figure 2, where $\hat{H}^{(1)}(r)$ is the ratio of the shaded area to the area of $B \ominus b(o,r)$.
 
Figure 2:   The estimator $\hat{H}^{(1)}(r)$ and the set $B \ominus b(o,r)$
\begin{figure}
\begin{center}
\epsfig{file=minus.eps,height=4cm}
\end{center}
\end{figure}
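The area ratio in (2.4) can be approximated numerically on a grid, as in this sketch (square window and all parameter values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

# Poisson pattern on B = [0, side]^2 (illustrative parameters).
lam, side, r = 1.0, 10.0, 0.5
pts = rng.uniform(0.0, side, size=(rng.poisson(lam * side**2), 2))

# Approximate the areas in (2.4) on a fine grid. For a square window,
# the eroded set "B minus-sampled by b(o, r)" is simply [r, side - r]^2.
g = np.linspace(r, side - r, 150)
xx, yy = np.meshgrid(g, g)
locs = np.column_stack([xx.ravel(), yy.ravel()])

# Distance from each grid location to the nearest point of the process.
nearest = np.sqrt(
    ((locs[:, None, :] - pts[None, :, :]) ** 2).sum(axis=2)
).min(axis=1)

# H_hat^(1)(r): fraction of the eroded window covered by the union of
# the discs b(X_n, r).
H_hat = np.mean(nearest <= r)

# Poisson benchmark for comparison: H(r) = 1 - exp(-lam * pi * r^2).
H_true = 1.0 - np.exp(-lam * np.pi * r**2)
```

For a homogeneous Poisson process the grid estimate should lie close to the known spherical contact distribution value `H_true`.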

Note, however, that the estimator given in (2.4) need not be monotone in r. Another estimator for H(r), which does not have this disadvantage, goes back to an idea of Hanisch (1984):  
 \begin{displaymath}
\hat{H}(r) = \int_{\RL^2} \,
\frac{\bfind \left( x \in B \ominus b(o,\Delta(x)), \; \Delta(x) \leq r\right)}{\nu
\left( B \ominus b(o,\Delta(x)) \right)} \; dx \end{displaymath} (5)
for $0\leq r \leq \frac{1}{2} \mbox{diam}(B)$, where $\Delta(x)$ is the distance from x to its nearest neighbor in $X=\{X_1,X_2,\ldots\}$.

An asymptotically unbiased estimator for D(r), which is analogous to that in (2.4), is given by  
 \begin{displaymath}
\hat{D}^{(1)}(r) = \frac{\sum\limits_n \, \bfind \left(
X_n \in B \ominus b(o,r), \; \Delta(X_n)
\leq r \right)}{X \left( B \ominus b(o,r) \right)},\end{displaymath} (6)
which is the ratio of the number of points in $B \ominus b(o,r)$ whose nearest neighbor has distance less than or equal to r, to the total number of points in $B \ominus b(o,r)$. Again, this estimator need not be monotone in r. The following estimator for D(r), proposed by Hanisch (1984), does not have this drawback:  
 \begin{displaymath}
\hat{D}(r) = \frac{\hat{D}^{*}(r)}{\hat{D}^{*}(R)}
\qquad \mbox{ for } 0 \leq r \leq R,\end{displaymath} (7)
where

\begin{displaymath}
\hat{D}^{*}(r) = \sum_n \, \frac{\bfind \left( X_n \in
B \ominus b(o,\Delta(X_n)), \; \Delta(X_n) \leq r \right)}{
\nu \left( B \ominus b(o,\Delta(X_n))\right)}\end{displaymath}

and $R = \sup \{r > 0 \, : \, \nu (B \ominus b(o,r)) > 0 \}$.
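The simpler minus-sampling estimator $\hat{D}^{(1)}(r)$ from (2.6) is easy to compute directly, as in this sketch (square window, all parameter values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)

# Poisson pattern on B = [0, side]^2 (illustrative parameters).
lam, side, r = 1.0, 10.0, 0.6
pts = rng.uniform(0.0, side, size=(rng.poisson(lam * side**2), 2))

# Nearest-neighbor distance Delta(X_n) for every point of the pattern.
d = np.sqrt(((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=2))
np.fill_diagonal(d, np.inf)
nn = d.min(axis=1)

# Minus-sampling: keep only points in the eroded window
# [r, side - r]^2, so that all neighbors within distance r lie in B.
inner = np.all((pts >= r) & (pts <= side - r), axis=1)

# D_hat^(1)(r): fraction of these points whose nearest neighbor lies
# within distance r.
D_hat = np.mean(nn[inner] <= r)
```

For a Poisson process the result should be close to $D(r) = 1 - \exp(-\lambda \pi r^2)$, here about 0.68.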

Now, the idea for testing the Poisson hypothesis on the basis of (2.3) is to check whether the empirical counterparts $\hat{H}(r)$ and $\hat{D}(r)$ of H(r) and D(r), respectively, computed from the observed point pattern, are significantly different from each other; see also van Lieshout and Baddeley (1996), Bedford and van den Berg (1997). In Figure 3 this method is illustrated using the data from Figure 1.

 
Figure 3:  Estimators $\hat{D}(r)$ and $\hat{H}(r)$
\begin{figure}
% picture environment (plot data) omitted
\end{figure}

Note that, as usual in spatial statistics, edge effects are an important issue in the estimation of H(r) and D(r). Such effects occur if the nearest neighbor of a point lies outside of the observation window B. In (2.5) and (2.7) an edge correction based on minus-sampling is used, which means that the estimators are based on subwindows $B^* \subset B$ such that all neighbors with distance less than r lie within the observation window B. A new approach to edge-corrected estimation of D(r) has recently been presented in Floresroux and Stein (1996). Further related estimators for characteristics of planar point processes can be found e.g. in Baddeley and Gill (1997), Hansen et al. (1996, 1998); see also Jensen (1993).

Another type of test for verifying the Poisson hypothesis is based on the reduced second-order moment measure K of a stationary point process X, which is defined in the following way. Note that this method can only be used in the case of exhaustively mapped data, i.e., when a complete set of the locations of all points in the pattern is available. Assume that X has intensity $\lambda$ and consider the second moment measure  
 \begin{displaymath}
\mu^{(2)}(B_1 \times B_2) = \Exp (X(B_1)X(B_2))\end{displaymath} (8)
for $B_1,B_2 \in \calB$. Then, the reduced second-moment measure K of X is uniquely determined by the equation  
 \begin{displaymath}
\mu^{(2)}(B_1 \times B_2) = \lambda \nu (B_1 \cap B_2)
+ \lambda^2 \int_{\RL^2} \int_{\RL^2} \bfind(x \in B_1) \bfind(x+y\in B_2) \, K(dy) \, dx.\end{displaymath} (9)
If X is not only stationary but also isotropic, then K is isotropic as well, and it suffices to consider the reduced second-order moment function K(r) = K(b(o,r)). Note that $\lambda K(r)$ can be interpreted as the expected number of further points in a circle with radius r centered at the typical point of the point process. In the case that X is a homogeneous Poisson process, (2.8) and (2.9) imply that $K(r) = \pi r^2$ for all $r \geq 0$.
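For the Poisson case this identity can also be seen heuristically via Slivnyak's theorem (invoked here as an aside, it is not stated in the text): conditioning on a typical point at the origin leaves the remaining points a Poisson process with the same intensity, so

```latex
\lambda K(r)
  \;=\; \Exp\bigl( X(b(o,r)) - 1 \mid o \in X \bigr)
  \;=\; \lambda \, \nu\bigl(b(o,r)\bigr)
  \;=\; \lambda \pi r^2 ,
\qquad \mbox{hence} \quad K(r) = \pi r^2 .
```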

Assume now that a sample of X is observed in the window B, $0 < \nu(B)
< \infty$. Then, at first sight, a natural estimator of $\lambda^2 K(r)$ is

\begin{displaymath}
\tilde{\kappa} (r) = \frac{\mbox{number of point pairs in $B$\space separated by 
distance less than $r$}}{\nu(B)}.\end{displaymath}

However, this estimator has an edge-bias since points which are close to the boundary of B may have near neighbors outside B which are not taken into account. The usual way one solves this problem is to consider an edge-corrected estimator of the form

\begin{displaymath}
\hat{\kappa}(r) = \frac{1}{\nu(B)} \sum_{x,y \in X \cap B}
\bfind (0 < \vert x-y \vert \leq r) \, k(x,y)\end{displaymath}

where k(x,y) is some weighting factor; see Ohser (1983), Ripley (1988), Stein (1991, 1993), Stoyan et al. (1995). For example, a variant of $\hat{\kappa}(r)$ considered in Ripley (1988) takes $k^{-1}(x,y)$ to be the proportion of the circumference of the circle with center x and radius |x-y| which is contained in B.

Now, an estimator for K(r) can be defined by $\hat{K}(r) = \hat{\kappa}(r)/ \hat{\lambda}^2$, where $\hat{\lambda}$ is an estimator for $\lambda$.

Note that for practical purposes it is often more convenient to consider the L-function $L(r) = \sqrt{K(r)/ \pi}$ rather than K(r), since in the Poisson case L(r)=r is a simple linear function; see Figure 4. A natural test statistic for verifying the Poisson process hypothesis is then $T = \max_{r \leq r_0} \vert \hat{L}(r)-r\vert$, where $\hat{L}(r) = \sqrt{\hat{K}(r) / \pi}$ and r0 is an upper bound for the interpoint distance r. If the value of T is too large, the Poisson hypothesis is doubted.
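The test statistic T can be sketched as follows. For brevity this sketch uses the uncorrected estimator $\tilde{\kappa}(r)$ and omits the edge correction k(x,y) discussed above; the window and all numerical parameters are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)

# Poisson pattern on B = [0, side]^2 (illustrative parameters).
lam, side, r0 = 1.0, 10.0, 2.0
pts = rng.uniform(0.0, side, size=(rng.poisson(lam * side**2), 2))

# Pairwise distances; the double sum over x, y in X counts each
# unordered pair twice, matching the definition of kappa.
d = np.sqrt(((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=2))
np.fill_diagonal(d, np.inf)

lam_hat = len(pts) / side**2
r_grid = np.linspace(0.1, r0, 40)

# Uncorrected kappa_tilde(r) = (# point pairs within distance r) / nu(B).
kappa = np.array([(d <= r).sum() for r in r_grid]) / side**2

# L_hat(r) = sqrt(K_hat(r) / pi) with K_hat = kappa / lam_hat^2.
L_hat = np.sqrt(kappa / lam_hat**2 / np.pi)

# Test statistic T = max_{r <= r0} |L_hat(r) - r|.
T = np.max(np.abs(L_hat - r_grid))
```

Because the edge correction is omitted, $\hat{L}(r)$ is biased downward near r0; an edge-corrected $\hat{\kappa}(r)$ as in Ripley (1988) would bring T closer to its nominal behavior under the Poisson hypothesis.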


 
Figure 4:  Estimator $\hat{L}(r)$ and L(r)=r
\begin{figure}
% picture environment (plot data) omitted
\end{figure}

If the data are observed in a rectangle, a bivariate Cramér-von Mises type test can be applied for testing the Poisson hypothesis. Suppose that the points Xi have been observed in the window $[0,a] \times [0,b]$, a,b > 0, and are given in Cartesian coordinates $0 \leq X_i^{(1)} \leq a$ and $0 \leq X_i^{(2)} \leq b$. Then the squared distance $\omega^2$ between the empirical bivariate distribution function and the bivariate uniform distribution function on $[0,a] \times [0,b]$ can be used as a test statistic. It has the computationally convenient representation

\begin{displaymath}
\omega^2 = \frac{1}{N} \sum\limits_{i,j=1}^N \,
\{ 1 - \max (U_i,U_j) \} \{ 1 - \max (V_i,V_j) \} \, - \,
\frac{1}{2} \sum\limits_{i=1}^N \, (1-U_i^2)(1-V_i^2) \, + \, 
\frac{N}{9},\end{displaymath}

where N is the number of observed points in the rectangle $[0,a] \times [0,b]$, $U_i = X_i^{(1)}/a$ and $V_i = X_i^{(2)}/b$, $i=1,\ldots,N$. Its limiting distribution (as $N \to \infty$) under the Poisson hypothesis was derived and tabulated by Durbin (1970). Zimmermann (1993) extended this result by taking into account that the realization of $\omega^2$ depends on which corner of the rectangle one chooses as the origin; his quantity $\overline{\omega}^2$ overcomes this difficulty. See Zimmermann (1990, 1993) for its definition, its limiting distribution under the Poisson hypothesis, and selected percentiles of its distribution.
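The computational representation of $\omega^2$ translates directly into code. Under the Poisson hypothesis, given N, the points are i.i.d. uniform on the rectangle, which this sketch uses to generate test data (a, b, N are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)

# Rescaled coordinates U_i = X_i^(1)/a, V_i = X_i^(2)/b on [0, 1].
a, b, N = 2.0, 3.0, 200
U = rng.uniform(0.0, a, N) / a
V = rng.uniform(0.0, b, N) / b

# First term: (1/N) * sum_{i,j} {1 - max(U_i,U_j)} {1 - max(V_i,V_j)}.
term1 = ((1.0 - np.maximum.outer(U, U))
         * (1.0 - np.maximum.outer(V, V))).sum() / N

# Second term: (1/2) * sum_i (1 - U_i^2)(1 - V_i^2).
term2 = 0.5 * ((1.0 - U**2) * (1.0 - V**2)).sum()

omega2 = term1 - term2 + N / 9.0
```

Since $\omega^2$ is an integrated squared difference, it is always positive; under the Poisson hypothesis its values are typically small (the expectation is 5/36), and large values speak against the hypothesis.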

We finally remark that, besides the tests based on exhaustively mapped data, there are tests which do not rely on a complete set of the locations of all points. These so-called quadrat count methods concern the case where the point pattern is sampled by counting the numbers of points in several (typically rectangular or square) subregions of the observation window, and where the distributional properties of the counts under the Poisson hypothesis are used; see Stoyan et al. (1995).
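One standard quadrat-count test of this kind, named here as an example rather than taken from the text, is the index-of-dispersion test; this sketch applies it to simulated data (all parameter values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(5)

# Poisson pattern on [0, side]^2 (illustrative parameters).
lam, side = 2.0, 10.0
pts = rng.uniform(0.0, side, size=(rng.poisson(lam * side**2), 2))

# Count points in a k x k grid of square quadrats.
k = 5
counts, _, _ = np.histogram2d(
    pts[:, 0], pts[:, 1], bins=k, range=[[0.0, side], [0.0, side]]
)
counts = counts.ravel()

# Under the Poisson hypothesis the m counts are i.i.d. Poisson, so
# (m - 1) * s^2 / mean is approximately chi-square with m - 1 degrees
# of freedom; values far from m - 1 speak against the hypothesis.
m = counts.size
dispersion = (m - 1) * counts.var(ddof=1) / counts.mean()
```

Significantly large values of the dispersion index indicate clustering, significantly small values indicate regularity; both contradict the Poisson hypothesis.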


Andreas Frey
7/8/1998