Next: Transformation of Uniformly Distributed Up: Generation of Pseudo-Random Numbers Previous: Linear Congruential Generators   Contents

### Statistical Tests

• In literature numerous statistical significance tests are discussed in order to investigate characteristics of random number generators; see e.g. G.S. Fishman (1996) Monte Carlo: Concepts, Algorithms and Applications, Springer, New York.

• We only recall two such tests
• which are important for investigating characteristics of linear congruential generators (and other random number generators)
• and have already been mentioned in the courses ,,Statistik I'' and ,,Statistik II'', respectively.

• Pearson's -goodness of fit test (see Section 5.2 of the lecture notes for ,,Statistik II'') is used to check
• if the generated pseudo-random numbers can be regarded as realizations of uniformly distributed random variables
• and if we may assume the independence of these random variables.

• Another method for the generation of sequences of numbers having desirable characteristics
• is based on minimizing the Kolmogorov distance (see Section 1.5 of ,,Statistik I'' and Section 5.1 of ,,Statistics II'', respectively)

between the empirical distribution function of the ,,sample'' and the distribution function of the uniform distribution on for every natural number .
• In literature this procedure is referred to as Quasi-Monte-Carlo-Method; see e.g. H. Niederreiter (1992) Random Number Generation and Quasi-Monte-Carlo Methods, SIAM, Philadelphia.

1. -goodness of fit test of uniform distribution

The following test is considered in order to check if the pseudo-random numbers

• can be regarded as realizations of independent sampling variables that are uniformly distributed on the interval .
• The interval is divided in subintervals of equal length and
• we consider the -dimensional (hypothetical) vector of parameters and
• the test statistic where

and denotes the number of pseudo-random numbers in the interval .
• If the sampling variables are independent and uniformly distributed on the interval , by Theorem 5.5 from ,,Statistik II''
• the test statistic is asymptotically distributed.
• Thus, for sufficiently large the hypothesis is rejected if

where denotes the -quantile of the distribution with degrees of freedom.

• We will illustrate this test by the following numerical example. For , and we want to check if
• the hypothesis that the sampling variables are uniformly distributed is conformable with a sample of pseudo-random numbers. The sample has the following vector of class frequencies:

• In this case we obtain and hence

• Thus, the hypothesis of a uniform distribution on is not rejected.

Remarks

• As a generalization of the -goodness of fit test for checking the uniform distribution of some sample variables one can also check
• if for a given natural number (e.g. or ) the pseudo-random vectors can be regarded
• as realizations of independent random vectors that are uniformly distributed on .
• For this purpose the unit cube is divided into smaller cubes of equal size,
• which are of the form .
• Furthermore, we consider the -dimensional (hypothetical) vector of parameters and
• the test statistic where

and . Notice that denotes the number of pseudo-random vectors in .

2. Run Test

There are a number of other significance tests allowing to evaluate the quality of random number generators. In particular it can be verified

• if the generated pseudo-random numbers can be regarded as realizations of independent random variables having a certain distribution. In our case we consider the hypothesis of a uniform distribution on .
• The following run test checks in particular
• if the independence assumption for the sampling variables is reflected sufficiently well by the pseudo-random numbers .
• This is done by analyzing the lengths of monotonically increasing subsequences, also called runs, within the sequence of pseudo-random numbers.
• For this purpose we define the random variables by the recursion formula

 (5)

where .
• The random variables where

 and (6)

are called the runs of the sequence .

The significance test that will be constructed is based on the following property of the runs .

Theorem 3.3   The random variables introduced in are independent and identically distributed such that

 (7)

if the random variables are independent and uniformly distributed on .

Proof

• Let be independent and uniformly distributed on .
• Then for all and for arbitrary natural numbers , we get that

• This implies that the runs are independent and identically distributed.
• Furthermore, an induction argument shows that for arbitrary and

 (8)

• For , equation (8) obviously holds. By the formula of total probability we obtain

where the last equality is a consequence of the independence and -uniform distribution of .
• Assume now that (8) is true for some . Then

where the second but one equality uses the induction hypothesis.
• This shows (8) for any .
• Moreover, by (8) we can conclude that for any

Remarks

• Let us assume that sufficiently many pseudo-random numbers have been generated that are resulting in the runs according to (5) and (6).
• We choose pairwise disjoint intervals on the positive real axis such that
• the probabilities

are almost equal.
• For these probabilities we consider the -dimensional (hypothetical) vector
and
• the test statistic where

and denotes the number of run lengths belonging to class .
• According to Theorem 3.3 for large the hypothesis will be rejected if . Note that this requires the generation a sufficiently large number of pseudo-random numbers .

Next: Transformation of Uniformly Distributed Up: Generation of Pseudo-Random Numbers Previous: Linear Congruential Generators   Contents
Ursa Pantle 2006-07-20