Chapter 1

Where we are going
Types of probability
Types of randomness
Types of ignorance
Footnotes
https://manyworlds784.blogspot.com/p/footnotes.html

Where we are going
Controversy is a common thread running through the history of studies of probability assessments and statistical inference. It is my contention that so many opinions and characterizations exist because the concept roughly known as "probability" touches on the enigma of existence, with its attendant ambiguities (1).

A large and interesting literature has arisen concerning these controversies. I have sampled some of it but an exhaustive survey this is not. Nor is any formal program presented here. Rather, the idea is to try to come to grips with the assumptions on which rest the various forms of probabilistic thinking (2).

The extent of the disagreement and the many attempts to plug the holes make it obvious that there is no consensus on the meaning of "probability," though students are generally taught some sort of synthesis, which may give an impression of consensus. And it is true that, in general, working scientists and statisticians adopt a pragmatic attitude, satisfying themselves with the vague thought that the axioms of Richard von Mises (discussed below) and Andrey Kolmogorov and the work of Bruno De Finetti or of Emile Borel cover any bothersome meta-questions, which are seen as essentially trivial and irrelevant to the work at hand. That is to say, they tend to foster a shared assumption that science is based solely on testable ideas. However, a major tool of their work, probabilistic statistics, rests upon untestable assumptions.

Kolmogorov's axioms
http://mathworld.wolfram.com/KolmogorovsAxioms.html

De Finetti's views on probability
http://www.kent.ac.uk/secl/philosophy/jw/2009/deFinetti.pdf

On Borel's contributions
https://files.nyu.edu/eo1/public/Book-PDF/pChapterBBB.pdf

Not only do I intend to talk about these assumptions, but also to enter the no-go zone of "metaphysics." Though practicing scientists may prefer to avoid the "forbidden fruit" of ontology and epistemology found in this zone, they will certainly lack an important understanding of what they are doing if they decline to enter. The existence of these raging controversies underscores the point that there is more that needs understanding than is found in typical probability and statistics books.

Further, I intend to argue that the statistical conception of the world of appearances is only valid under certain conditions and that an unseen "noumenal" world is of great significance and implies a nonlinearity, in the asymmetric n-body sense, that current probability models cannot account for. My notion of a noumenal world is closer to Kant's view, or possibly Spencer's, than to that of the ancient Greeks -- with the proviso that modern logic and physics provide ways to discern by inference aspects of this hidden world.

In addition, I suggest that though a number of exhaustive formalizations of "probability theory" have been proffered, people tend to pilfer a few attractive concepts but otherwise don't take such formalizations very seriously -- though perhaps that assessment does not apply to pure logicians. Similarly, I wonder whether talk of such things as "the topology of a field" adds much to an understanding of probability and its role in science (3). Certainly, few scientists bother with such background considerations.

In the end, we find that the value of a probabilistic method is itself probabilistic. If one is satisfied that the success rate accords with experience, one tends to accept the method. The more so if a group corroborates that assessment.

The usual axioms of probability found in standard statistics textbooks are axioms for a reason: There is no assurance that reality will in fact operate "probabilistically," which is to say we cannot be sure that the definition of randomness we use won't somehow be undermined.

Standard axioms
http://mathworld.wolfram.com/ProbabilityAxioms.html

This is not a trivial matter. How, for example, do we propose to use probability to cope with "backward running time" scenarios that occur in modern physics? Yes, we may have at hand a means of assigning, say, probability amplitudes, but if the cosmos doesn't always work according to our standard assumptions, then we have to question whether what some call a "universe of chance" is sufficient as a model not only of the cosmos at large, but of the "near reality" of our everyday existence (4).

And, as is so often the case in such discussions, a number of definitions are entangled, and hence sometimes we simply have to get the gist (5) of a discussion until certain terms are clarified, assuming they are.

Though we will discuss the normal curve and touch lightly on other distributions, the reader needn't worry that he or she will be subjected to much in the way of intricacies of mathematical statistics. All methods of inferential statistics rest on assumptions concerning probability and randomness, and they will be our main areas of concern.

Types of probability
Rudolf Carnap (6), in an attempt to resolve the controversy between Keynesian subjectivists and Neyman-Pearson frequentists, offered two types of probability: probability1, giving degrees of confidence or "weight of evidence," and probability2, giving "relative frequency in the long run." In my view, Carnap's two forms are insufficient.

  In my classification, we have:
Probability1: Classical, as in proportion of black to white balls in an urn.

Probability2: Frequentist, as in trials of coin flips.

Probability3: Bayesian (sometimes called the probability of causes), as in determining the probability that an event happened, given an initial probability of some other event. (A small worked sketch follows this list.)

Probability4: Degree of confidence, as in expert opinion. This category is often subsumed under Probability3.

Probability5: "Objective" Bayesian degree of confidence, in which an expert opinion goes hand in hand with relevant frequency ratios -- whether the relative frequency forms part of the initial estimate or whether it arrives in the form of new information.

Probability6: "Subjective" Bayesian degree of confidence, as espoused by De Finetti later in life, whereby not only does probability, in some physical sense, not exist, but degree of belief is essentially a matter of individual perception.

Probability7: Ordinary a priori probability, often termed propensity, associated, for example, with gambling systems. The biases are built into the game based on ordinary frequency logic and, possibly, based on the advance testing of equipment.

Probability8: The propensity of Karl Popper, which he saw as a fundamental physical property that has as much right to exist as, but is distinct from, a force.

Probability9: Standard quantum propensity, in which experimentally determined a priori frequencies for particle detection have been bolstered by the highly accurate quantum formalism.

Probability10: Information theory probability, which is "ordinary" probability; however, subtleties enter into the elucidation of information entropy as distinguished from physical entropy. In terms of ordinary propensity, information theory accounts for the structural constraints, which might be termed advance information. These constraints reduce the new information, sometimes called surprisal value: I' = I - Ic, where I' is the new information, I the total information and Ic the structural information (see the sketch just below).
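As a toy illustration of that bookkeeping, here is a minimal Python sketch. The "u nearly always follows q" constraint and the 0.99 probability are illustrative assumptions of mine, and the quantity computed is ordinary Shannon surprisal in bits:

```python
from math import log2

def surprisal(p):
    """Shannon surprisal in bits: the information gained on seeing an event of probability p."""
    return -log2(p)

# With no advance information, a letter drawn from a 26-symbol alphabet carries:
I_total = surprisal(1 / 26)      # ~4.70 bits

# Structural constraint (advance information): after a 'q' in English text,
# 'u' is nearly certain -- say p = 0.99 under the constrained model (my assumption).
I_new = surprisal(0.99)          # ~0.014 bits of genuinely new information (surprisal value)

# The structural information is whatever the constraint already supplied,
# so that I_new = I_total - I_c, as in the formula above.
I_c = I_total - I_new

print(f"I  (total)      = {I_total:.3f} bits")
print(f"I' (new)        = {I_new:.3f} bits")
print(f"Ic (structural) = {I_c:.3f} bits")
```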

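Probability3, the "probability of causes," is easiest to see with the standard test-and-disease schoolbook setup. The numbers below (1% prevalence, 95% sensitivity, 5% false-positive rate) are illustrative choices of mine, not figures from any study; the point is only how Bayes' rule moves from an initial probability to the probability that the suspected cause is present, given the evidence:

```python
# Bayes' rule: P(cause | evidence) = P(evidence | cause) * P(cause) / P(evidence)

p_disease = 0.01             # initial ("prior") probability of the cause
p_pos_given_disease = 0.95   # test sensitivity (assumed)
p_pos_given_healthy = 0.05   # false-positive rate (assumed)

# Total probability of the evidence (a positive test result):
p_pos = p_pos_given_disease * p_disease + p_pos_given_healthy * (1 - p_disease)

# Probability the cause is present, given the evidence:
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos

print(f"P(disease | positive test) = {p_disease_given_pos:.3f}")   # about 0.16
```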
We will review these concepts to a greater or lesser degree as we proceed. Others have come up with different categorizations. From Alan Hajek

Hajek on interpretations of probability
http://plato.stanford.edu/entries/probability-interpret/

we have three main concepts in probability:

1. Quasi-logical: "meant to measure evidential support relations." As in: "In light of the relevant seismological and geological data, it is probable that California will experience a major earthquake this decade."

2. Degree of confidence: As in: "It is probable that it will rain in Canberra."

3. An objective concept: As in: "A particular radium atom will probably decay within 10,000 years."

Ian Hacking wrote that chance began to be tamed in the 19th century when a lot of empirical data were published, primarily by government agencies.

"The published facts about deviancies [variation], and the consequent development of the social sciences, led to the erosion of determinism, so that by the end of the century C.S. Peirce could say we live in a universe of chance."

Hacking saw probability as having two aspects. "It is connected with the degrees of belief warranted by the evidence, and it is connected with the tendencies, displayed by some chance devices, to produce stable relative frequencies" (7).

Another type of probability might be called "nonlinear probability," but I hesitate to include this as a specific type because the concept essentially falls under the rubric of conditional probability.

By "nonlinear" probability I mean a chain of conditional probabilities that includes a feedback process. If we look at any feedback control system, we see that the output is partly dependent on itself. Many such systems, though not all, are expressed by nonlinear differential equations.

So the probability of a molecule being at point X is influenced by the probabilities of all other molecules. Now, by assumption of randomness of many force vectors, the probabilities in a flowing stream tend to cancel, leaving a constant. However, in a negative feedback system the constants must be different for the main output and the backward-flowing control stream. So we see that in some sense probabilities are "influencing themselves." In a positive feedback loop, the "self-referencing" spirals toward some physical upper bound and again we see that probabilities, in a manner of speaking, are self-conditional.
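A stock toy model of probabilities that "influence themselves" is the Pólya urn, in which every draw changes the makeup of the urn and hence the probabilities governing later draws. This is only an illustration of self-conditioning in general, not of the mind/noumenon feedback discussed next; the parameters, the seeds and the use of Python's random module are my own choices:

```python
import random

def polya_urn(draws, black=1, white=1, reinforcement=1, seed=0):
    """Pólya urn: each drawn ball is returned along with `reinforcement` extra
    balls of the same color, so the draw probability feeds back on itself."""
    rng = random.Random(seed)
    for _ in range(draws):
        p_black = black / (black + white)   # current, self-conditioned probability
        if rng.random() < p_black:
            black += reinforcement
        else:
            white += reinforcement
    return black / (black + white)

# Different seeds settle toward very different limiting fractions --
# a signature of positive feedback rather than of independent trials.
for seed in range(5):
    print(f"seed {seed}: final P(black) = {polya_urn(10_000, seed=seed):.3f}")
```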

The feedback system under consideration is the human mind's interaction with the noumenal world, this interaction producing the phenomenal world. For more detail, see sections on the noumenal world (Part VI, see sidebar) and my paper Toward (link in sidebar).

Types of randomness
When discussing probability, we need to think about the complementary concept of randomness, an assumption necessary for the independence of events.

My categories of randomness:
Randomness1: Insufficient computing power leaves much unpredictable. This is seen in nonlinear differential equations, chaos theory and in cases where small truncation differences yield widely diverging trajectories (Lorenz's butterfly effect). In computer lingo, such calculations are known as "hard" computations whose computational work increases exponentially.
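The butterfly effect is easy to exhibit numerically. The sketch below iterates the logistic map x → rx(1 − x) at r = 4 from two starting values that differ only in the twelfth decimal place; the choice of map, parameter and starting points is mine, purely for illustration:

```python
def logistic_orbit(x, r=4.0, steps=50):
    """Iterate the logistic map x -> r*x*(1-x) and return the whole trajectory."""
    orbit = [x]
    for _ in range(steps):
        x = r * x * (1 - x)
        orbit.append(x)
    return orbit

a = logistic_orbit(0.300000000000)
b = logistic_orbit(0.300000000001)   # a tiny "truncation difference"

for n in (0, 10, 20, 30, 40, 50):
    print(f"step {n:2d}: |difference| = {abs(a[n] - b[n]):.2e}")
# The gap grows roughly exponentially until the two trajectories bear no relation to each other.
```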

Randomness2: Kolmogorov/Chaitin randomness, which is closely related to randomness1. The measure of complexity is the ratio of the information in the algorithm plus its input to the information in the output: the closer that ratio is to 1, the more random the output. (If we reduce algorithms to Turing machines, then the algorithmic and the input information are strung together in a single binary string.) A rough computational illustration follows the link below.

Chaitin-Kolmogorov complexity
https://www.princeton.edu/~achaney/tmve/wiki100k/docs/Kolmogorov_complexity.html
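Kolmogorov-Chaitin complexity is not itself computable, but a general-purpose compressor gives a crude, computable stand-in for the ratio just described: a string is "random" to the extent that its shortest available description is not much shorter than the string itself. The sketch below uses zlib as that stand-in; the ratio it reports is only a proxy, not the quantity Chaitin defines:

```python
import random
import zlib

def compression_ratio(data: bytes) -> float:
    """Length of a compressed description relative to the data itself --
    a rough, computable proxy for the algorithmic-information ratio."""
    return len(zlib.compress(data, 9)) / len(data)

patterned = b"01" * 50_000                                    # highly structured
rng = random.Random(0)
noisy = bytes(rng.getrandbits(8) for _ in range(100_000))     # incompressible in practice

print("patterned string :", round(compression_ratio(patterned), 4))   # far below 1
print("noisy string     :", round(compression_ratio(noisy), 4))       # at or slightly above 1
```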

Randomness3: Randomness associated with probability9, seen in quantum effects. Randomness3 is what most would regard as intrinsic randomness. Within the constraints of the Heisenberg uncertainty principle one cannot, even in principle, exactly predict both of two conjugate properties of a quantum detection event.

Randomness4: Randomness associated with Probability8, the propensity of Popper. It appears to be a mix of randomness3 and randomness1.

Randomness5: The imposition of "willful ignorance" in order to guard against observer bias in a frequency-based experiment.

Randomness6: Most real numbers are not computable. They are inferred in ZFC set theory, inhabiting their own noumenal world. They are so wild that one must consider them to be utterly random. One way to notionally write such a string would be to tie the selection of each subsequent digit to a quantum detector (a finite-prefix sketch follows the link below). One could never be sure that such a string would not randomly find itself among the computables, though it can be claimed that there is a probability of 1 (virtual certainty) that it would in fact be among the non-computables. Such a binary string would also, with probability 1, contain infinitely many copies of every possible finite substring. These sorts of probability claims are open to question as to whether they represent true knowledge, though for Platonically inclined mathematicians they are quite acceptable. See my post
A fractal that orders the reals
http://uklidd.blogspot.com/2013/11/a-fractal-for-reals-pinpoints-axiom-of.html
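Only a finite prefix of such a string can ever be exhibited, but the digit-by-digit idea can be sketched. Below, the operating system's entropy pool stands in for the quantum detector; that substitution is purely illustrative, since os.urandom is not a quantum device:

```python
import os

def detector_bits(n):
    """Yield n binary digits one at a time from the OS entropy pool --
    a stand-in for the quantum detector described in the text."""
    for _ in range(n):
        yield os.urandom(1)[0] & 1

prefix = "".join(str(bit) for bit in detector_bits(64))
print("0." + prefix + "...")   # a 64-digit prefix of a notional random real, in binary
```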

Benoit Mandelbrot advocated three states of randomness: mild, wild and slow.

By mild, he meant randomness that accords easily with the bell-curved normal distribution. By wild, he meant curves that fluctuate sharply at aperiodic intervals. By slow, he meant a curve that looks relatively smooth (and accords well with the normal distribution) but steadily progresses toward a crisis point that he describes as equivalent to a physical phase shift, which then goes into the wild state.

One can see, for example, that in the iterative logistic equation, as the coefficient increases toward 4, we go from simple periodicity to intermittent finite intervals of chaos alternating with periodicity; at 4, the crisis point, chaos fills the whole interval, and for all higher values of the coefficient orbits escape the logistic graph altogether. The Feigenbaum constant measures the rate at which the period doublings pile up on the way to chaos (trapped aperiodic orbits), and the parameter value at which they accumulate might itself be viewed as a crisis point.

Another way to think of this is in terms of shot noise. Shot noise may increase as we change variables. The graph of the information stream will then show disjoint spikes with amplitudes indicating that the spikes can't be part of the intended message; the spikes may gradually increase in number until we reach a crisis point, beyond which there is more noise than message. We also have the transition from laminar flow to turbulence under various constraints. The transition can be of "short" or "long" duration, in which we have a mixture of turbulent vortices with essentially laminar flow.

Mandelbrot wished to express his concepts in terms of fractals, which is another way of saying power laws. Logarithmic and exponential curves generally have points near the origin which, when subjectively considered, seem to mark a distinction between routine change and bizarre change. Depending on what is being measured, that distinction might occur at 210 or at 210.871, or in other words arbitrarily. Or some objective measure can mark the crisis point, such as when noise equals message in terms of bits.

Wikipedia refines Mandelbrot's grading and gives seven states of randomness:

Proper mild randomness: short-run portioning is even for N=2, e.g. the normal distribution
Borderline mild randomness: short-run portioning is concentrated for N=2, but eventually becomes even as N grows, e.g. the exponential distribution with λ=1
Slow randomness with finite delocalized moments: scale factor increases faster than q but no faster than , w<1
Slow randomness with finite and localized moments: scale factor increases faster than any power of q, but remains finite, e.g. the lognormal distribution
Pre-wild randomness: scale factor becomes infinite for q>2, e.g. the Pareto distribution with α=2.5
Wild randomness: infinite second moment, but finite moment of some positive order, e.g. the Pareto distribution with α=1.5
Extreme randomness: all moments are infinite, e.g. the Pareto distribution with α=1
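A quick way to feel the difference between mild and wild randomness is to ask what fraction of a large sum is contributed by its single largest term. The simulation sketch below compares a (folded) normal sample with a Pareto sample at α=1.5; the sample size, seed and use of Python's random module are my own choices:

```python
import random

def max_share(samples):
    """Fraction of the total contributed by the single largest value."""
    return max(samples) / sum(samples)

rng = random.Random(1)
n = 100_000

mild = [abs(rng.gauss(0, 1)) for _ in range(n)]       # normal-like tails: "mild"
wild = [rng.paretovariate(1.5) for _ in range(n)]     # infinite second moment: "wild"

print("normal, largest/sum :", f"{max_share(mild):.6f}")   # negligible
print("pareto, largest/sum :", f"{max_share(wild):.6f}")   # orders of magnitude larger
```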

I have made no attempt to make the numbering of the categories for probability correspond with that for randomness. The types of probability I present do not carry one and only one type of randomness. How they relate to randomness is a subtle issue to be discussed as we go along.

In this respect, we shall also examine the issues of ignorance of a deterministic output versus ignorance of an indeterministic (quantum) output.

Types of ignorance
Is your ignorance of what outcome will occur in an experiment utterly subjective or are there physical causes for the ignorance, as in the propensity notion? Each part of that question assumes a strict demarcation between mind and external environment in an experiment, a simplifying assumption in which feedback is neglected (but can it be, really?).

Much of the difficulty in discerning the "meaning of probability" arose with the development of quantum mechanics, which, as Jan von Plato notes, "posed an old question anew, but made it more difficult than ever before to dismiss it: is probability not also to be viewed as ontic, i.e., as a feature of reality, rather than exclusively as epistemic, i.e., as a feature characterizing our state of knowledge?" (8)

The scenario that leads off the next chapter gives a glimpse of some of the issues that will be followed up further along in this book.


Go to Chapter 2 HERE.
