Equidistributed sequence

In mathematics, a sequence (s₁, s₂, s₃, ...) of real numbers is said to be equidistributed, or uniformly distributed, if the proportion of terms falling in a subinterval is proportional to the length of that subinterval. Such sequences are studied in Diophantine approximation theory and have applications to Monte Carlo integration.

Definition

A sequence (s₁, s₂, s₃, ...) of real numbers is said to be equidistributed on a non-degenerate interval [a, b] if for every subinterval [c, d] of [a, b] we have

\lim _{n\to \infty }{\left|\{\,s_{1},\dots ,s_{n}\,\}\cap [c,d]\right| \over n}={d-c \over b-a}.

(Here, the notation |{s₁,...,s_n} ∩ [c, d]| denotes the number of elements, out of the first n elements of the sequence, that are between c and d.)

For example, if a sequence is equidistributed in [0, 2], since the interval [0.5, 0.9] occupies 1/5 of the length of the interval [0, 2], as n becomes large, the proportion of the first n members of the sequence which fall between 0.5 and 0.9 must approach 1/5. Loosely speaking, one could say that each member of the sequence is equally likely to fall anywhere in its range. However, this is not to say that (s_n) is a sequence of random variables; rather, it is a determinate sequence of real numbers.
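As a minimal numerical sketch of this example, the following Python snippet counts how many of the first n terms of a sequence fall in [0.5, 0.9]. The sequence s_n = 2·frac(n√2) is an assumption chosen only for illustration (it is equidistributed on [0, 2] by the equidistribution theorem discussed below), and the function name is hypothetical.

```python
import math

def proportion_in_interval(seq, c, d):
    """Fraction of the terms of the finite list `seq` lying in [c, d]."""
    return sum(1 for s in seq if c <= s <= d) / len(seq)

# Illustrative sequence (an assumption, not from the text):
# s_n = 2 * frac(n * sqrt(2)), which takes values in [0, 2).
n_terms = 100_000
seq = [2.0 * ((n * math.sqrt(2)) % 1.0) for n in range(1, n_terms + 1)]

p = proportion_in_interval(seq, 0.5, 0.9)
print(p)  # expected to be close to (0.9 - 0.5) / (2 - 0) = 0.2
```

A finite computation like this can only suggest the limiting behaviour; it does not prove equidistribution.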

Discrepancy

We define the discrepancy D_N for a sequence (s₁, s₂, s₃, ...) with respect to the interval [a, b] as

D_{N}=\sup _{a\leq c\leq d\leq b}\left\vert {\frac {\left|\{\,s_{1},\dots ,s_{N}\,\}\cap [c,d]\right|}{N}}-{\frac {d-c}{b-a}}\right\vert .

A sequence is thus equidistributed if the discrepancy D_N tends to zero as N tends to infinity.
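For a finite list of points the supremum can be evaluated exactly. The sketch below assumes [a, b] = [0, 1] and uses the closed form D_N = 1/N + max_i (i/N − x_(i)) − min_i (i/N − x_(i)), where x_(1) ≤ ... ≤ x_(N) are the points in sorted order. This identity is a standard one for the discrepancy over all subintervals, but it is not stated in the text above; the function name is hypothetical.

```python
def extreme_discrepancy(points):
    """D_N of a finite list of points in [0, 1], via the sorted-points closed form."""
    xs = sorted(points)
    n = len(xs)
    # a_i = i/N - x_(i) for the i-th smallest point (1-based index i).
    a = [(i + 1) / n - x for i, x in enumerate(xs)]
    return 1.0 / n + max(a) - min(a)

# Four evenly spaced midpoints: the worst interval is off by one point in four.
print(extreme_discrepancy([0.125, 0.375, 0.625, 0.875]))  # 0.25
# A single point: the degenerate interval [0.5, 0.5] holds all of the points.
print(extreme_discrepancy([0.5]))  # 1.0
```

The second call illustrates that D_N is never smaller than 1/N, since a one-point interval already captures a proportion 1/N of the terms.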

Equidistribution is a rather weak criterion to express the fact that a sequence fills the segment leaving no gaps. For example, the drawings of a random variable uniform over a segment will be equidistributed in the segment, but there will be large gaps compared to a sequence which first enumerates multiples of ε in the segment, for some small ε, in an appropriately chosen way, and then continues to do this for smaller and smaller values of ε. For stronger criteria and for constructions of sequences that are more evenly distributed, see low-discrepancy sequence.

Riemann integral criterion for equidistribution

Recall that if f is a function having a Riemann integral in the interval [a, b], then its integral is the limit of Riemann sums taken by sampling the function f in a set of points chosen from a fine partition of the interval. Therefore, if some sequence is equidistributed in [a, b], it is expected that this sequence can be used to calculate the integral of a Riemann-integrable function. This leads to the following criterion^[1] for an equidistributed sequence:

Suppose (s₁, s₂, s₃, ...) is a sequence contained in the interval [a, b]. Then the following conditions are equivalent:

1. The sequence is equidistributed on [a, b].
2. For every Riemann-integrable (complex-valued) function f : [a, b] → $\mathbb {C}$, the following limit holds:

\lim _{N\to \infty }{\frac {1}{N}}\sum _{n=1}^{N}f\left(s_{n}\right)={\frac {1}{b-a}}\int _{a}^{b}f(x)\,dx

Proof

First note that the definition of an equidistributed sequence is equivalent to the integral criterion whenever f is the indicator function of an interval: If f = 1_{[c, d]}, then the left hand side is the proportion of points of the sequence falling in the interval [c, d], and the right hand side is exactly

\textstyle {\frac {d-c}{b-a}}.

This means 2 ⇒ 1 (since indicator functions are Riemann-integrable), and 1 ⇒ 2 for f being an indicator function of an interval. It remains to assume that the integral criterion holds for indicator functions and prove that it holds for general Riemann-integrable functions as well.

Note that both sides of the integral criterion equation are linear in f, and therefore the criterion holds for linear combinations of interval indicators, that is, step functions.

To show it holds for f being a general Riemann-integrable function, first assume f is real-valued. Then by using Darboux's definition of the integral, we have for every ε > 0 two step functions f₁ and f₂ such that f₁ ≤ f ≤ f₂ and $\textstyle \int _{a}^{b}(f_{2}(x)-f_{1}(x))\,dx\leq \varepsilon (b-a).$ Notice that:

{\frac {1}{b-a}}\int _{a}^{b}f_{1}(x)\,dx=\lim _{N\to \infty }{\frac {1}{N}}\sum _{n=1}^{N}f_{1}(s_{n})\leq \liminf _{N\to \infty }{\frac {1}{N}}\sum _{n=1}^{N}f(s_{n})

{\frac {1}{b-a}}\int _{a}^{b}f_{2}(x)\,dx=\lim _{N\to \infty }{\frac {1}{N}}\sum _{n=1}^{N}f_{2}(s_{n})\geq \limsup _{N\to \infty }{\frac {1}{N}}\sum _{n=1}^{N}f(s_{n})

By subtracting, we see that the limit superior and limit inferior of $\textstyle {\frac {1}{N}}\sum _{n=1}^{N}f(s_{n})$ differ by at most ε. Since ε is arbitrary, we have the existence of the limit, and by Darboux's definition of the integral, it is the correct limit.

Finally, for complex-valued Riemann-integrable functions, the result follows again from linearity, and from the fact that every such function can be written as f = u + vi, where u, v are real-valued and Riemann-integrable. ∎

This criterion leads to the idea of Monte Carlo integration, where integrals are computed by sampling the function over a sequence of random variables equidistributed in the interval.
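The integral criterion can be tried out numerically. The sketch below uses a deterministic equidistributed sequence rather than random samples; the choice s_n = a + (b − a)·frac(n√2), the test function x², and the function name are all assumptions made for illustration.

```python
import math

def integrate_via_sequence(f, a, b, n_terms=100_000):
    """Estimate the integral of f over [a, b] as (b - a) times the average of f
    along the illustrative sequence s_n = a + (b - a) * frac(n * sqrt(2))."""
    total = 0.0
    for n in range(1, n_terms + 1):
        s = a + (b - a) * ((n * math.sqrt(2)) % 1.0)
        total += f(s)
    return (b - a) * total / n_terms

estimate = integrate_via_sequence(lambda x: x * x, 0.0, 3.0)
print(estimate)  # should be close to the exact value 9
```

The factor (b − a) appears because the criterion gives the average value of f, which is the integral divided by the length of the interval.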

It is not possible to generalize the integral criterion to a class of functions bigger than just the Riemann-integrable ones. For example, if the Lebesgue integral is considered and f is taken to be in L¹, then this criterion fails. As a counterexample, take f to be the indicator function of some equidistributed sequence. Then in the criterion, the left hand side is always 1, whereas the right hand side is zero, because the sequence is countable, so f is zero almost everywhere.

In fact, the de Bruijn–Post Theorem states the converse of the above criterion: If f is a function such that the criterion above holds for any equidistributed sequence in [a, b], then f is Riemann-integrable in [a, b].^[2]

Equidistribution modulo 1

A sequence (a₁, a₂, a₃, ...) of real numbers is said to be equidistributed modulo 1 or uniformly distributed modulo 1 if the sequence of the fractional parts of a_n, denoted by (a_n) or by a_n − ⌊a_n⌋, is equidistributed in the interval [0, 1].

Examples

The equidistribution theorem: The sequence of all multiples of an irrational α,

0, α, 2α, 3α, 4α, ...

is equidistributed modulo 1.^[3]

More generally, if p is a polynomial with at least one coefficient other than the constant term irrational then the sequence p(n) is uniformly distributed modulo 1.

This was proven by Weyl and is an application of van der Corput's difference theorem.^[4]

The sequence log(n) is not uniformly distributed modulo 1.^[3] This fact is related to Benford's law.
The sequence of all multiples of an irrational α by successive prime numbers,

2α, 3α, 5α, 7α, 11α, ...

is equidistributed modulo 1. This is a famous theorem of analytic number theory, published by I. M. Vinogradov in 1948.^[5]

The van der Corput sequence is equidistributed.^[6]
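The article does not define the van der Corput sequence. Under its usual definition, the n-th term in base b is obtained by writing n in base b and reflecting the digits about the radix point. A minimal sketch, with a hypothetical function name:

```python
def van_der_corput(n, base=2):
    """n-th term: reflect the base-`base` digits of n about the radix point."""
    value, denom = 0.0, 1.0
    while n > 0:
        n, digit = divmod(n, base)  # peel off the least significant digit
        denom *= base
        value += digit / denom      # it becomes the next digit after the point
    return value

print([van_der_corput(n) for n in range(1, 8)])
# [0.5, 0.25, 0.75, 0.125, 0.625, 0.375, 0.875]
```

Each new block of terms fills in the midpoints of the gaps left by the previous ones, which is why the sequence spreads so evenly over [0, 1].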

Weyl's criterion

Weyl's criterion states that the sequence a_n is equidistributed modulo 1 if and only if for all non-zero integers ℓ,

\lim _{n\to \infty }{\frac {1}{n}}\sum _{j=1}^{n}e^{2\pi i\ell a_{j}}=0.

The criterion is named after, and was first formulated by, Hermann Weyl.^[7] It allows equidistribution questions to be reduced to bounds on exponential sums, a fundamental and general method.
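The averages in Weyl's criterion are easy to evaluate for a finite number of terms. The following sketch compares multiples of √2 with multiples of 1/4; both test sequences and the function name are assumptions chosen for illustration, and a finite computation can only suggest the limit.

```python
import cmath
import math

def weyl_mean(seq, ell):
    """|(1/n) * sum_j exp(2*pi*i*ell*a_j)| over the finite list `seq`."""
    # Reducing ell * a modulo 1 first keeps the argument of exp small.
    total = sum(cmath.exp(2j * math.pi * ((ell * a) % 1.0)) for a in seq)
    return abs(total) / len(seq)

n_terms = 10_000
irrational_multiples = [k * math.sqrt(2) for k in range(1, n_terms + 1)]
quarter_multiples = [k * 0.25 for k in range(1, n_terms + 1)]

for ell in (1, 2, 3):
    print(ell, weyl_mean(irrational_multiples, ell))  # small for each ell
print(weyl_mean(quarter_multiples, 4))  # every term equals 1, so the mean is 1
```

For the multiples of 1/4 the choice ℓ = 4 makes every exponent an integer multiple of 2πi, so the average cannot tend to zero and the criterion fails.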

Sketch of proof

If the sequence is equidistributed modulo 1, then we can apply the Riemann integral criterion (described above) on the function

\textstyle f(x)=e^{2\pi i\ell x},

which has integral zero on the interval [0, 1]. This gives Weyl's criterion immediately.

Conversely, suppose Weyl's criterion holds. Then the Riemann integral criterion holds for functions f as above, and by linearity of the criterion, it holds for f being any trigonometric polynomial. By the Stone–Weierstrass theorem and an approximation argument, this extends to any continuous function f.

Finally, let f be the indicator function of an interval. It is possible to bound f from above and below by two continuous functions on the interval, whose integrals differ by an arbitrary ε. By an argument similar to the proof of the Riemann integral criterion, it is possible to extend the result to any interval indicator function f, thereby proving equidistribution modulo 1 of the given sequence. ∎

Generalizations

A quantitative form of Weyl's criterion is given by the Erdős–Turán inequality.
Weyl's criterion extends naturally to higher dimensions, assuming the natural generalization of the definition of equidistribution modulo 1:

The sequence v_n of vectors in R^k is equidistributed modulo 1 if and only if for any non-zero vector ℓ ∈ Z^k,

\lim _{n\to \infty }{\frac {1}{n}}\sum _{j=0}^{n-1}e^{2\pi i\ell \cdot v_{j}}=0.

Example of usage

Weyl's criterion can be used to easily prove the equidistribution theorem, stating that the sequence of multiples 0, α, 2α, 3α, ... of some real number α is equidistributed modulo 1 if and only if α is irrational.^[3]

Suppose α is irrational and denote our sequence by a_j = jα (where j starts from 0, to simplify the formula later). Let ℓ ≠ 0 be an integer. Since α is irrational, ℓα can never be an integer, so ${\textstyle e^{2\pi i\ell \alpha }}$ can never be 1. Using the formula for the sum of a finite geometric series,

\left|\sum _{j=0}^{n-1}e^{2\pi i\ell j\alpha }\right|=\left|\sum _{j=0}^{n-1}\left(e^{2\pi i\ell \alpha }\right)^{j}\right|=\left|{\frac {1-e^{2\pi i\ell n\alpha }}{1-e^{2\pi i\ell \alpha }}}\right|\leq {\frac {2}{\left|1-e^{2\pi i\ell \alpha }\right|}},

a finite bound that does not depend on n. Therefore, after dividing by n and letting n tend to infinity, the left hand side tends to zero, and Weyl's criterion is satisfied.
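The geometric-series bound can be checked numerically. This sketch takes α = √2 and ℓ = 1 (illustrative choices) and tracks the largest absolute value reached by the partial sums; the variable names are hypothetical.

```python
import cmath
import math

alpha, ell = math.sqrt(2), 1
z = cmath.exp(2j * math.pi * ell * alpha)
bound = 2.0 / abs(1.0 - z)  # right-hand side of the inequality

partial, largest = 0j, 0.0
for k in range(5000):
    # Add the k-th term exp(2*pi*i*ell*k*alpha) to the running partial sum.
    partial += cmath.exp(2j * math.pi * ((ell * k * alpha) % 1.0))
    largest = max(largest, abs(partial))

print(largest, bound)  # largest partial sum stays below the bound
```

Because the partial sums stay bounded while n grows, dividing by n forces the averages to zero.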

Conversely, notice that if α is rational then this sequence is not equidistributed modulo 1, because there are only a finite number of options for the fractional part of a_j = jα.
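The rational case can be made concrete with exact arithmetic. The sketch below uses α = 3/7 as an illustrative choice and collects the distinct fractional parts of jα; the function name is hypothetical.

```python
from fractions import Fraction

def fractional_parts(alpha, n_terms):
    """Set of distinct fractional parts of j * alpha for 0 <= j < n_terms."""
    return {(j * alpha) % 1 for j in range(n_terms)}

parts = fractional_parts(Fraction(3, 7), 1000)
print(sorted(parts))  # only the seven values 0, 1/7, ..., 6/7 occur

# No term ever lands in [1/50, 3/25], although that interval has length 1/10.
hits = [p for p in parts if Fraction(1, 50) <= p <= Fraction(3, 25)]
print(len(hits))  # 0
```

An interval of positive length that contains no terms at all contradicts the definition of equidistribution, whatever the number of terms considered.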

Complete uniform distribution

A sequence $(a_{1},a_{2},\dots )$ of real numbers is said to be k-uniformly distributed mod 1 if not only the sequence of fractional parts $a'_{n}:=a_{n}-[a_{n}]$ is uniformly distributed in $[0,1]$, but also the sequence $(b_{1},b_{2},\dots )$, where $b_{n}:=(a'_{n+1},\dots ,a'_{n+k})\in [0,1]^{k}$, is uniformly distributed in $[0,1]^{k}$.

A sequence $(a_{1},a_{2},\dots )$ of real numbers is said to be completely uniformly distributed mod 1 if it is $k$-uniformly distributed for each natural number $k\geq 1$.

For example, the sequence $(\alpha ,2\alpha ,\dots )$ is uniformly distributed mod 1 (or 1-uniformly distributed) for any irrational number $\alpha$ , but is never even 2-uniformly distributed. In contrast, the sequence $(\alpha ,\alpha ^{2},\alpha ^{3},\dots )$ is completely uniformly distributed for almost all $\alpha >1$ (i.e., for all $\alpha$ except for a set of measure 0).
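The failure of 2-uniform distribution for multiples of α has a simple cause: each fractional part determines the next one, since a'_{n+1} = (a'_n + α) mod 1, so consecutive pairs lie on a few line segments in the unit square. A sketch with α = √2 (an illustrative choice), counting the pairs that fall in the box [0, 0.2] × [0, 0.2]:

```python
import math

alpha = math.sqrt(2)
n_terms = 100_000

# Fractional parts a'_1, ..., a'_{n_terms + 1}.
frac = [(n * alpha) % 1.0 for n in range(1, n_terms + 2)]
# Consecutive pairs (a'_n, a'_{n+1}).
pairs = list(zip(frac[:-1], frac[1:]))

in_box = sum(1 for x, y in pairs if x <= 0.2 and y <= 0.2)
print(in_box, len(pairs))  # no pair lands in the box
```

The box has area 0.04, so a sequence of pairs uniformly distributed in the square would place about 4% of its terms there. Here a first coordinate of at most 0.2 forces the second to be about 0.414 larger, so the box stays empty.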

van der Corput's difference theorem

A theorem of Johannes van der Corput^[8] states that if for each positive integer h the sequence s_{n+h} − s_n is uniformly distributed modulo 1, then so is s_n.^[9]^[10]^[11]
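As an illustrative worked example (not taken from the text above), consider the sequence s_n = αn² with α irrational. For every positive integer h,

s_{n+h}-s_{n}=\alpha (n+h)^{2}-\alpha n^{2}=2\alpha h\,n+\alpha h^{2}.

This is a linear sequence in n whose slope 2αh is irrational, so it is uniformly distributed modulo 1 by the equidistribution theorem; adding the constant αh² only shifts the fractional parts. The difference theorem then shows that (αn²) is uniformly distributed modulo 1.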

A van der Corput set is a set H of integers such that if for each h in H the sequence s_{n+h} − s_n is uniformly distributed modulo 1, then so is s_n.^[10]^[11]

Metric theorems

Metric theorems describe the behaviour of a parametrised sequence for almost all values of some parameter α: that is, for values of α not lying in some exceptional set of Lebesgue measure zero.

For any sequence of distinct integers b_n, the sequence (b_nα) is equidistributed mod 1 for almost all values of α.^[12]
The sequence (αⁿ) is equidistributed mod 1 for almost all values of α > 1.^[13]

It is not known whether the sequences (eⁿ) or (πⁿ) are equidistributed mod 1. However, it is known that the sequence (αⁿ) is not equidistributed mod 1 if α is a PV (Pisot–Vijayaraghavan) number.

Well-distributed sequence

A sequence (s₁, s₂, s₃, ...) of real numbers is said to be well-distributed on [a, b] if for any subinterval [c, d] of [a, b] we have

\lim _{n\to \infty }{\left|\{\,s_{k+1},\dots ,s_{k+n}\,\}\cap [c,d]\right| \over n}={d-c \over b-a}

uniformly in k. Clearly every well-distributed sequence is uniformly distributed, but the converse does not hold. The definition of well-distributed modulo 1 is analogous.

Sequences equidistributed with respect to an arbitrary measure

For an arbitrary probability measure space $(X,\mu )$ , a sequence of points $(x_{n})$ is said to be equidistributed with respect to $\mu$ if the mean of point measures converges weakly to $\mu$ :^[14]

{\frac {\sum _{k=1}^{n}\delta _{x_{k}}}{n}}\Rightarrow \mu \ .

For any Borel probability measure on a separable, metrizable space, there exists a sequence that is equidistributed with respect to the measure; indeed, this follows immediately from the fact that such a space is standard.

The general phenomenon of equidistribution arises frequently in dynamical systems associated with Lie groups, for example in Margulis' solution to the Oppenheim conjecture.

Notes

^ Kuipers & Niederreiter (2006) pp. 2–3
^ http://math.uga.edu/~pete/udnotes.pdf, Theorem 8
^ ^a ^b ^c Kuipers & Niederreiter (2006) p. 8
^ Kuipers & Niederreiter (2006) p. 27
^ Kuipers & Niederreiter (2006) p. 129
^ Kuipers & Niederreiter (2006) p. 127
^ Weyl, H. (September 1916). "Über die Gleichverteilung von Zahlen mod. Eins" [On the distribution of numbers modulo one] (PDF). Math. Ann. (in German). 77 (3): 313–352. doi:10.1007/BF01475864. S2CID 123470919.
^ van der Corput, J. (1931), "Diophantische Ungleichungen. I. Zur Gleichverteilung Modulo Eins", Acta Mathematica, 56, Springer Netherlands: 373–456, doi:10.1007/BF02545780, ISSN 0001-5962, JFM 57.0230.05, Zbl 0001.20102
^ Kuipers & Niederreiter (2006) p. 26
^ ^a ^b Montgomery (1994) p. 18
^ ^a ^b Montgomery, Hugh L. (2001). "Harmonic analysis as found in analytic number theory" (PDF). In Byrnes, James S. (ed.). Twentieth century harmonic analysis–a celebration. Proceedings of the NATO Advanced Study Institute, Il Ciocco, Italy, July 2–15, 2000. NATO Sci. Ser. II, Math. Phys. Chem. Vol. 33. Dordrecht: Kluwer Academic Publishers. pp. 271–293. doi:10.1007/978-94-010-0662-0_13. ISBN 978-0-7923-7169-4. Zbl 1001.11001.
^ See Bernstein, Felix (1911), "Über eine Anwendung der Mengenlehre auf ein aus der Theorie der säkularen Störungen herrührendes Problem", Mathematische Annalen, 71 (3): 417–439, doi:10.1007/BF01456856, S2CID 119558177.
^ Koksma, J. F. (1935), "Ein mengentheoretischer Satz über die Gleichverteilung modulo Eins", Compositio Mathematica, 2: 250–258, JFM 61.0205.01, Zbl 0012.01401
^ Kuipers & Niederreiter (2006) p. 171

References

Kuipers, L.; Niederreiter, H. (2006) [1974]. Uniform Distribution of Sequences. Dover Publications. ISBN 0-486-45019-8.
Kuipers, L.; Niederreiter, H. (1974). Uniform Distribution of Sequences. John Wiley & Sons Inc. ISBN 0-471-51045-9. Zbl 0281.10001.
Montgomery, Hugh L. (1994). Ten lectures on the interface between analytic number theory and harmonic analysis. Regional Conference Series in Mathematics. Vol. 84. Providence, RI: American Mathematical Society. ISBN 0-8218-0737-4. Zbl 0814.11001.