The Many Worlds of Probability, Reality and Cognition: Appendix E. On the normal distribution

Statement:
The set of finite samples of a population is normally distributed.

Proof:
A finite population is described by a set of n elements. If we regard a sample as a subset of the population, then the set of all samples is the power set of the population set. The power set always has kⁿ elements, where k is the base of the number system used. Any integer-base system can be represented in base 2. So then the population and its power set can be represented in base 2. Whatever the distribution of the population, the power set always has a symmetric binomial distribution of member bit strings of lengths between 0 and n. (We can see this symmetry by the fact that every bit string has a mirror image, as in 01 and 10.) But, it has been proved that the binomial distribution converges to the normal distribution as n goes to infinity.

Hence, for n sufficiently large (generally put at 30 or greater), the power set of the population set is approximately normally distributed, which is to say that the finite set of samples of an arbitrary population is approximately normally distributed.

Note that if the population is infinite, the set of all samples corresponds to the power set of all positive integers, which is given as the Cantorian cardinal number 2^N,
which is equal to the cardinal, R, for the set of real numbers. That is, members of an infinite set of samples are one-to-one and onto with the set of real numbers. However, most reals are non-computable, and so most samples are inaccessible.

By the Church-Turing thesis, it is believed that any computation can, in theory, be done by a Turing machine [very basic computer program]. Of course, any output can be represented as a bit string, which represents a unique number. Hence, there is at least one Turing machine for every possible bit string output (inclusive of all the computable irrationals). Each TM is a program the elements of which are assigned integers. As the design elements are in a specific order, these integers are strung together to form the unique description number (DN) of an arbitrary TM. (The uniqueness of the description number can be assured, for example, by using a base 2 representation with two symbols, say r,s, reserving t for the space between base 2 integers and then regarding the string (as in, rrstrssts) as a unique base 3 number; it is also possible in theory to use order-preserving Goedel numbers.)

We see that the set of computables, each computable being tagged with a unique finite integer, is bijective with an infinite subset of the integers. It is known that such a subset is bijective with the entire set N. (However, a computable may correspond to more than one DN, especially with respect to irrationals, but this does not affect our result.) Now it is not clear that the description bit string for TM_X has a legal mirror image. A design that works in the "forward" direction may not work in the "backward" direction. That is, the mirror image bit string might not represent a TM that is able to begin issuing a printout tape in finite time.

However, it is clear that if TM_A prints out a bit string, then TM_A can be modified into TM_B to, at every step, swap a u for a v and a v for a u. A single sub-program -- call it tm_x -- should work for all effective TMs, as it simply modifies the output of each effective TM. That is, tm_x is simply fed the output for an effective TM, meaning DN_x is simply appended to DN_X. We will say that TM_B has DN_X,x. (To assure that a number exists, we place a binary point at the head of any string.)

(We needn't concern ourselves with the issue that TM_A's output can be recovered via a TM with DN_X,x,x.)

The ratio of the DN_X,x (a constant) to the set of all DNs of effective TMs is 0. And so as n gets larger, the difference between the DN _X,x and DN_X becomes vanishingly small. So that for "effective" DN_X of length m, there is an "effective" DN_X,x of length m such that n and m are approximately equal. This means that we have a nearly symmetrical binomial graph, with the set of DN_X's on one side of the median in descending order by height and the set of DN_X,x's on the other side of the median in descending order.

Hence, the set of DNs for computables has close to a binomial distribution when n is finite and a normal distribution when n is (enumerably) infinite. This shows that the set of descriptions of accessible samples of an enumerably infinite set, whatever its density function, is normally distributed.

Caveat i: This result requires acceptance of DN_X,x, while ignoring DN_X,x...x.

Caveat ii: If we ignore the infinitude of zeros that can be placed to the right of a base k rational found between 0 and 1, then each rational has a finite bit representation (assuming there is a number corresponding to an instruction that says "this cycle repeats indefinitely so the machine halts here"). This subset is bijective with the integers. We also have the subset of computable irrationals. A TM prints out a base k number at any step n, such that by the nth step the "halt" number is never computed. It is also bijective with the integers. So in this sense, we cannot say that the set of computables, which is the set of samples, is normally distributed (we cannot define a symmetric binomial curve with this data).

However, if we permit only samples of finite size, then the set of samples is normally distributed.

The Many Worlds of Probability, Reality and Cognition

Appendix E. On the normal distribution

No comments:

Post a Comment

Chapter 10

Report Abuse