The most amazing sporting event of all time?

Today’s news sources are talking about Leicester City’s winning the Premier League as a sort of miracle. The bookies’ initially-offered odds of “5000 to 1” have morphed into a supposedly scientific/mathematical measure of probability: we are being told that Leicester City had “a slim chance of only 1 in 5000” of winning the Premier League. Yet amazingly, they did win it! We are given to believe that “1 in 5000” is a numerical measure of how surprised we should be at the fact that they did in fact win.

That is ridiculous. The Premier League consists of 20 teams, chosen specifically for their ability to beat other teams. Suppose instead of the Premier League on its own, we imagine a much larger competitive free-for-all containing the Premier League, plus the First Division below them, plus the Second Division below them, and so on, till we have 5000 teams altogether playing against each other.

If we knew nothing whatsoever about any given team, in that situation we might assign a “probability of only 1 in 5000” that it would win. In other words, if we picked a team randomly from the 5000, and did so repeatedly, then in the long run we would pick the winning team about once in every 5000 attempts to do so.

But now suppose we are told something about a given team: that it is in the top 20. That should make us raise our numerical assessment of its chances of winning the free-for-all. If we were further told that a team in the top 20 never loses to a team in the bottom 4980, we would very significantly raise our estimate of its chances of winning the free-for-all. It would be something similar to playing the Monty Hall game, except that instead of one out of three available doors being ruled out, 4980 out of 5000 available doors are ruled out.

But that, in effect, is what limiting the free-for-all to only the Premier League does. It means that if we know nothing at all about a team, the repeated act of picking one out randomly in the hope of choosing the winner would be successful much more often than 1 in 5000 times.
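
The arithmetic behind that comparison can be sketched in a few lines of Python. This is only an illustration under the assumption already made above, namely that we know nothing about the teams, so every team is equally likely to be picked. The function name `blind_pick_chance` is made up for the example.

```python
from fractions import Fraction


def blind_pick_chance(num_teams):
    """Long-run success rate of picking the winner uniformly at random."""
    return Fraction(1, num_teams)


free_for_all = blind_pick_chance(5000)   # the imagined 5000-team competition
premier_league = blind_pick_chance(20)   # only the top 20 teams remain

# How many times more often a blind pick succeeds once 4980 teams are ruled out.
improvement = premier_league / free_for_all

print(free_for_all, premier_league, improvement)
```

On these assumptions a blind pick from the Premier League alone succeeds 250 times as often as a blind pick from the 5000-team free-for-all.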

To lower the “chances of winning” in the face of further knowledge about a given team is to introduce capricious, subjective factors that cannot be relied on to make statistical judgements of relative frequency. They involve unrepeatable events or events that are not statistically lawlike, and so cannot be reliably extrapolated from. All we can do is guess about credibility here.

Casinos make money reliably because the behaviour of dice, cards, rotating cylinders etc. is statistically lawlike. For example, we know that in the long run about one sixth of rolls of pairs of dice will be doubles. But the behaviour of football teams in the Premier League is not at all lawlike. Bookies have to use numbers in their line of work, but let no one think these numbers correspond to measures of anything real or significant.
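
The “one sixth” figure for doubles can be checked by simply listing the outcomes. The sketch below assumes two fair six-sided dice, so that all 36 ordered outcomes are equally likely; the variable names are mine.

```python
from fractions import Fraction
from itertools import product

# Every ordered outcome of rolling two six-sided dice.
rolls = list(product(range(1, 7), repeat=2))

# A double is a roll in which both dice show the same face.
doubles = [roll for roll in rolls if roll[0] == roll[1]]

# Share of outcomes that are doubles, assuming all outcomes are equally likely.
doubles_share = Fraction(len(doubles), len(rolls))

print(len(doubles), len(rolls), doubles_share)
```

Six of the 36 outcomes are doubles, which is where the long-run frequency of about one sixth comes from.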

I suggest that we should sharply distinguish statistical relative frequency and subjective judgements of credibility. Numbers measure the former, but their presence is a will o’ the wisp when we are dealing with the latter.

Why holists distrust expert opinion

The “default” way of thinking about evidence is often called foundationalism. Foundationalists think that most of our everyday beliefs about the world are justified by “resting on a foundation” of privileged or more certain beliefs—typically, beliefs about conscious experience, raw feels, or “sense data”. In science, foundationalists typically suppose that a theory in a specialised field is a sort of edifice that is justified by resting on the carefully collected observational “data” of that specific field. This idea is partly inspired by mathematics, in which theorems really do rest on (i.e. are derivable from and implied by) axioms. The question is, should we take mathematics as our model of empirical knowledge?

Opposed to foundationalism is holism. Holists think that everyday beliefs are justified by belonging to a larger belief system. Individual beliefs do not stand or fall on their own, but meet the evidence as a whole, and it’s the way that whole “hangs together” that justifies the entire system. In science, holists typically suppose that theories consist of hypotheses, which are justified by meshing smoothly with other hypotheses, often from disparate fields. This is a matter of how much a theory explains, how reliably it predicts unforeseen observable events, how “virtuous” it seems when we consider its conservatism, modesty, simplicity, generality, fecundity, and so on. This is nearly always an intuitive matter of giving conflicting virtues various weightings, guided by little better than “how it feels” pragmatically.

For example, a holist would judge Freud’s theory by asking how much it seems to explain—how well it meshes with evolutionary theory, with other philosophical ideas about agency, with what ordinary people can see for themselves of undercurrents and tensions in family life, with the various insights that art can give us about ourselves, and much else besides.

A telling difference between foundationalists and holists is in their respective attitudes to specialist or “expert opinion” (by which I don’t mean the pragmatic know-how of a mechanic, but rather narrow theoretical claims made in advanced disciplines). The foundationalist tends to trust expert opinions, because he sees them as the product of skilled minds’ rare ability to trace specialised claims back to their specialised foundations, rather as an actuary can draw specific conclusions about a company’s finances from its specific account books.

The holist tends to distrust expert opinions. He will remind us that we can more reliably form opinions about the simple, familiar, observable, concrete and everyday than we can about the complicated, unfamiliar, unobservable, abstract or unusual. Most importantly, the holist is aware that claims made in specialised disciplines are typically hypotheses rather than the conclusions of arguments. No “data” implies them. If anything, it’s the other way round: hypotheses imply as-yet unseen events that observation can later confirm or deny. To the holist, the broad experience of a reasonably well-educated layman is better than the specialised training of an expert.

Holism has been around for well over a century. It has some well-known academic proponents such as Quine and Davidson. Yet foundationalism remains the default position among academics. Most of them despise hypotheses—mere “guessing”, as many would put it—and encourage their students to “provide arguments” instead of explaining why this or that hypothesis explains or predicts things better than its rivals. I think this is a tragedy.

Why we can’t model climate

Hypotheses represent their subject matter by being true or false of that subject matter. Like most sorts of representation, this does not involve resemblance. But models are different: they do represent their subject matter by resembling it in some relevant way. For example, a model airplane might resemble a real airplane by having similar shape and colors, even though their sizes are different. Or it might mimic the real airplane’s flying behavior.

To keep things simple, I’ll talk about respective “behaviors” (of model and subject matter) over time, but bear it in mind that this mimicry can be along any dimension: for example, a Fourier series might model a function along the x-axis rather than over time. Here’s the important point: I think the behavior of both model and whatever it represents must be “lawlike” in roughly the same way. (I include statistical laws here, by the way.) In respect of the relevant resemblance between them, it’s essential that “nature continues uniformly the same”.

I’ve used words associated with “Hume’s problem of induction”. Popper famously rejected all (enumerative) induction as problematic. I think that went far too far. As far as I’m concerned, induction is often fine; we just need to reflect in a piecemeal way on circumstances in which induction is reliable, and circumstances in which it isn’t. It’s reliable when it traces law-like connections in the real world (such as “these emeralds are green, so all emeralds are green”). It isn’t reliable when it doesn’t.

It seems to me that we have good reasons for thinking the climate doesn’t behave in a lawlike way, or at least not in any way useful for modelling in climate science. It may be deterministic, but that’s not the same as being predictable or capable of being modelled. Over time, or in response to various changes in initial conditions, the climate is very complicated and multiply chaotic. It seems to me that additional computing power will bring diminishing returns, so that attempts to model the climate will meet a “ceiling” like that of weather forecasting. We may get a bit better, but we probably can’t get all that much better. To put it bluntly, I think it’s a waste of time, brains and money.

Formal versus informal implication

I want to compare and contrast two sorts of implication — and I want to suggest that our understanding of beliefs and logic is badly affected when we confuse them, as we often do. In the hope of making things a little clearer, I propose to use the following symbolism:  (written in Mistral font) stands for the belief that P, P (in italics) stands for the linguistic sentence expressing the same content P, and stands for the fact that P, which of course only exists if P is true. (I gave “earthy” colours because it’s “in the world”, geddit? Also the looks a bit like an octopus, i.e. a real thing in the world.)

For illustration, if P is the sentence ‘Snow is white’, is the belief that snow is white, and is the fact of snow’s being white — a very simple sort of fact that might be represented by a Venn diagram like this:

That silly diagram is intended as no more than a reminder that although we are using a letter for a mental state (belief ) which is true or false, and a letter for a linguistic utterance (sentence P) which is true or false in the same circumstances, in the third usage (fact ) a letter stands for those circumstances themselves — something that is neither true nor false. Now it may sound strange to say that a fact isn’t true — facts are “true by definition”, aren’t they? Well, a fact is what makes a true sentence or true belief true, so wherever there’s a fact there’s a truth. In a loose colloquial sense we might refer to truths as facts. But in the current philosophical sense, a fact is strictly a state of affairs corresponding to a truth.

So understood, facts cannot imply anything, being themselves neither true nor false. But their linguistic or mental counterparts can, and this is what I want to examine here. It seems to me that confusion between facts, sentences and beliefs has generated much misunderstanding about the nature of thought itself. I hope to disentangle a little of this confusion here, and in doing so I hope to persuade you that formal logic is much less useful than is widely supposed as a tool of critical thinking.

Although facts can’t imply one another, linguistic sentences often do. For example, what are we to make of the claim that P implies Q?

If it is true, it describes a fact of some sort of lawlike connection — formal, causal, categorical, or whatever — between two possible facts and . I say “possible” facts because the implication can hold at the same time as the individual sentences P and Q it connects are not true. What matters is the connection between the sentences rather than their truth-values. For that reason, material conditionals of elementary logic (whose truth-value depends simply on the truth-values of what they connect) don’t capture this sort of implication. The conditionals we use for that purpose have to be understood as counterfactual conditionals, or as having some sort of subjunctive mood, so that they can be true or false regardless of the truth or falsity of their component parts.
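
To see why the material conditional fails to capture a connection, it may help to lay out its truth table. The sketch below uses the standard textbook definition (“not P, or Q”); the function name `material_conditional` is just a label for the example.

```python
from itertools import product


def material_conditional(p, q):
    """Truth-functional 'if P then Q': false only when P is true and Q is false."""
    return (not p) or q


# The truth-value is fixed by the truth-values of P and Q alone.
truth_table = {
    (p, q): material_conditional(p, q)
    for p, q in product([True, False], repeat=2)
}

for (p, q), value in truth_table.items():
    print(f"P={p!s:5}  Q={q!s:5}  P->Q={value}")
```

Three of the four rows come out true, including both rows where P is false, whether or not P and Q have anything to do with each other. That is the sense in which the truth-value of a material conditional ignores the connection between the sentences.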

Just as the sentence P can both describe a purported fact and stand for the belief , the claim that P implies Q can both describe a purported fact and stand for a belief. The nature of this fact and of this belief have seemed a bit of a mystery, to me at any rate in the past. I now think that mystery is largely the product of confusion between formal and informal implication. Apologies if this is no mystery to you.

Formal implication

As a model of implication, most of us take the case we are most familiar with: implication in formal logic, where the premises of a valid deductive argument imply the conclusion. When I say the implication here is formal, I mean that the work is done by language, and thought follows. That is, relations between sentences guide the formation of beliefs.

When conditionals that express such implications are true, they are true by virtue of the fact that one sentence can indeed be derived from another sentence via rules of inference that enable the derivation.

Deriving one sentence from another is a bit like building a structure out of Lego bricks. In this analogy, our rule of inference might be “every new brick must engage at least half of the interlocking pins of the bricks underneath”. When we begin, we might have no clear idea whether a given point in space can be reached given our starting-point. But once we do reach it (if we do), we can believe that it is legitimately reachable, given that starting-point and the rules of inference. Or at least, we can “accept” it as true, because we “accept” the rules of inference simply by using them.

With formal implication, the fact that corresponds to a true claim that P implies Q is a “linguistic” fact, embodied by the actual derivability of Q from P. The belief that corresponds to a claim that P implies Q (or sort-of belief, if all we do is “accept” it as true) is about derivability in language.

Informal implication

With formal implication, the work is done by language and thought follows. But with informal implication it’s the other way around: the work is done by thought and language follows. Actually, if thought is working as it should, this one-thing-following-another goes deeper, all the way to facts. The world has some lawlike features, and the thoughts of animals reflect them — in other words, animals have true beliefs about lawlike facts. Later, we human animals try to express those thoughts using language. Here real-world relations guide the formation of beliefs, which in turn guide the formation of sentences.

These sentences can be misleadingly ambiguous. A sentence like ‘P implies Q’ can be read in three distinct ways. It can say something about the lawlike connections in the world, i.e. facts about how and are related; or it can say something about the way sentences P and Q are related; or it can say something about how beliefs and are related. This ambiguity is compounded by the fact that a sort of meta-level “conditional” corresponds to each of these types of relation, and the situation is made still worse by our inclination to take formal implication as our model of implication in general.

It seems to me that the way to avoid getting lost here is to constantly remind ourselves that the primary link is between things in the world where lawlike connections exist: “what goes up must come down”, “if it has feathers, it’s a bird”, etc. Thought captures these lawlike connections by forming beliefs that stand or fall together in a systematic way. If and are related in a lawlike way, a mind captures that meta-level fact by being disposed to adopt the belief whenever it adopts the belief , and to abandon the belief  whenever it abandons the belief . Given the larger belief system to which the pair may or may not belong, they’re “stuck together” like the ends of a Band-Aid:

The system as a whole has the property that whenever gets added to it, gets added too, and whenever gets stripped away from the system, gets stripped away too, like a Band-Aid whose adhesive parts are put on or taken off (in reverse order).

If we can be said to have a “conditional belief” corresponding to this sort of implication, it amounts to little more than belief that a lawlike connection exists between and . This meta-level “conditional belief” is embodied in the way and stand together or fall together in the system. Even if such a belief is false (as it would be if there were in fact no lawlike connection between and ), that distinctive linkage of beliefs and in the system is all it amounts to. When we come to capture it in language, we may use arrows or similar symbols to indicate a non-symmetrical linkage of P and Q, but let’s be careful not to think of such informal links as perfectly mirroring formal links.

I hope you agree that the Band-Aid analogy goes too far in that it contains one unnecessary detail that ought to be omitted from our understanding of informal implication. That detail is the “bridge” between the adhesive parts, with its supposed “hidden mechanism” enabling an inference from P to Q. I think we are inclined to imagine such a mechanism exists because we are so used to taking formal implication as our model, and we have a tendency to assume something akin to interlocking Lego bricks is needed to “bridge the gap” between and . A better analogy perhaps would be a Band-Aid with the non-adhesive part removed:

What does it all mean?

The assumption that formal and informal implication are closely parallel misleads us about the nature of thought. It promotes the idea that thinking is a matter of “cogwheels and logic” rather than many direct acts of recognition by a richly-interconnected belief system, often of quite abstract things and states of affairs.

People who praise or actively promote logic as an aid to critical thinking routinely assume that beliefs work like discrete sentences in formal implication. That is, they assume beliefs have clear contents with logical consequences which are waiting to be explored. Well, as I’ve said several times now, in formal implication, language does guide thought. Beliefs correspond to sentences which are discrete because of their distinct form. One sentence leads to another thanks to the rules of inference, and beliefs follow their linguistic counterparts. The beliefs that are so led are themselves discrete because they are so closely associated with discrete sentences. Their contents determine the inferential connections between them. But most beliefs aren’t like that at all. Their content isn’t determined by prior association with discrete sentences whose form precisely determines their content. Rather, their content is attributed via interpretation, which is an ongoing affair and, well, a matter of interpretation. That interpretation involves “working our way into the system as a whole”, taking account of the inferences an agent draws and attributing whichever mental content best reflects his inferential behaviour. If someone behaves as if he is committed to lawlike connections in the real world, we attribute beliefs whose contents are appropriate to commitment to those lawlike connections. Here, inferential connections between beliefs determine their content rather than vice versa.

As far as I can see, this limits the usefulness and scope of logic. It’s useful in the academic study of logic, obviously, but outside of that field, only the most elementary applications are of much use, even in formal disciplines like computer science and mathematics. I agree that it’s useful to be aware of informal fallacies and to try to avoid them. But beyond that, the power of logic has been over-inflated by the assumption that beliefs are like “slips of paper in the head with sentences written on them”, and the assumption that thinking proceeds by drawing out their consequences — by examining what they formally imply.

What is “denial”?

When we say someone is “in denial”, we mean that they reject something obvious — something so obvious that their rejection of it amounts to a sort of pathology. For example, in the movie Psycho, Norman Bates interacts with the skeletal remains of his obviously dead mother as if she were still alive. This is not a sign of good mental health.

Although “deniers” deny facts, they usually do so for “emotional” reasons. They want something to be true so much that they pretend that some other things are false. Dolly Parton uses this idea effectively in The Grass is Blue:

There’s snow in the tropics
There’s ice on the sun
It’s hot in the Arctic
And crying is fun
And I’m happy now
And I’m so glad we’re through
And the sky is all green
And the grass is all blue

It’s vital to see that denial is not the mere rejection of facts — it’s the rejection of obvious facts, things that almost everyone can see easily with their own eyes.

We might say that denial is rejection of “observational” facts rather than “theoretical” facts. Fine, but all observation is “theory laden” — in other words, observations have to be interpreted, there’s no such thing as “raw data”, there’s no sharp distinction between observation and theory, and so on.

There is a gradient here, between facts that can be directly checked by simply opening our eyes and looking, and facts that are more abstract — facts that leave more room for doubt, that can be interpreted in several different ways, that depend on theoretical commitments that are not universally shared.

Some facts lie near enough to the observational end of the gradient to be counted as “almost observational” themselves. For example, we can’t quite see directly that the Earth is round. But nowadays we’re familiar with photographs taken from space of a round Earth, and most of us have watched ships slowly disappearing over the horizon, and so on. When we fly long distances, we adjust our watches in the perfectly reliable expectation that we will land in a different sector on the surface of a rotating sphere. Nowadays, a person who insists the Earth is not round is denying something very close to “obvious”.

Words like ‘denial’ can serve a useful purpose. But they are abused when applied to the rejection of claims that are not obvious. In that situation, their use amounts to an appeal to authority rather than an appeal to observation. The theoreticians whose opinions are rejected are supposedly so authoritative that it takes a sort of mental pathology to disagree with them.

I can’t think of a less sceptical or less scientific attitude than one that demands obedience to the authorities by “taking their word for it”. Heretics were tortured and killed by people who justified their sadism by saying their victims were suffering from a sort of pathology — one whose “cure” need not involve the giving of reasons.

Sometimes I have to restrain myself from using words like ‘denial’ for Creationists who reject the theory of evolution. But then I remind myself that the theory of evolution isn’t obvious — if it were, it wouldn’t have taken someone of Darwin’s stature to provide a satisfactory account of it. People who reject evolutionary theory are sceptical about something I believe in, but they can’t reasonably be called “deniers”. This also applies to other types of scepticism.

Demarcation and the magic of science

Is psychology a science? Are the methods used by climate scientists more pseudo-scientific than genuinely scientific? — When we ask questions like these, the issue is “demarcation”: How can we distinguish or demarcate genuine science from non-science?

The word ‘science’ has become such a warm — almost religious — term of approval that practically everyone nowadays is eager to call whatever they do “science”. So hearing the word ‘science’ in a description  — especially self-description — of what people do isn’t a reliable indicator that what they do actually is science. When you hear the word ‘science’, beware: there are impostors about.

Followers of Popper say that the mark of science is “falsifiability”: if there is no way a hypothesis can be shown to be false, then it’s not scientific. I hope it’s clear why. We all accept that hypotheses can’t be conclusively verified: a hypothesis is just a guess, and as a guess it can never be proved true the way theorems in mathematics can. But of course a good scientific hypothesis has more going for it than mere guesswork. It “sticks its neck out” by saying something about the real world. If it gets it wrong, the hope is that it will be exposed as false; and if it hasn’t been so exposed, at least not so far, that counts in its favour. But in order to have that count in its favour, it has to be able to stick its neck out. If it can’t do that, followers of Popper think, it can’t be regarded as a scientific hypothesis.

I think these followers of Popper are on to something important, but they’re not quite there yet. Why not? — Because of “holism”, hypotheses can never be decisively falsified either. Hypotheses are held by individual people, along with all the other stuff they believe, which are also hypotheses. Any particular hypothesis which is subject to testing — and therefore potential falsification — is tested alongside innumerable other hypotheses. Together, they imply something that can be observed, and that something either is observed or else it isn’t observed — it’s a “1 or 0 outcome”, if you like.

Followers of Popper want to say that if that something isn’t observed as predicted, then the hypothesis that yielded the prediction is falsified. So even though we have to reject it, it was a scientific hypothesis to begin with; if we had been lucky enough not to have to reject it, so much the better. But things are not as simple as this. Any of the hypotheses, plural, that went into generating the prediction might be singled out as the culprit. With a bit of judicious weeding and planting, we can tend our “garden” so as to keep what we want. For example, my belief in the steady-state theory of the universe yields the prediction that no red shift should be observed in distant astronomical objects. When it is in fact observed, contrary to my prediction, am I obliged to declare my hypothesis falsified? No, I am not. I can simply make up a new hypothesis, to the effect that light gets “tired” when it travels for very long periods of time and loses energy. This loss of energy manifests itself as decreasing frequency.

In fact, any favoured hypothesis can be protected from the threat of unfavourable observations like that. So any hypothesis can be held in such a way as to make it practically unfalsifiable. So we can’t appeal to the falsifiability or otherwise of a hypothesis as the mark or standard of its being genuinely “scientific”.
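
The logical point can be put in a deliberately crude sketch. It lumps everything other than the favoured hypothesis into a single “auxiliary” hypothesis and treats “together they imply the observation” as a simple truth-functional conditional. Both are simplifying assumptions of mine, as are the names `favoured` and `auxiliary`.

```python
from itertools import product


def consistent_with_failed_prediction(favoured, auxiliary, observed):
    """Can these truth-values stand, given that the prediction failed?"""
    # Jointly, the two hypotheses imply that the observation will be made...
    joint_implies_observation = (not (favoured and auxiliary)) or observed
    # ...but the predicted observation was not in fact made.
    return joint_implies_observation and not observed


# Which combinations of (favoured, auxiliary) survive the failed prediction?
survivors = [
    (favoured, auxiliary)
    for favoured, auxiliary in product([True, False], repeat=2)
    if consistent_with_failed_prediction(favoured, auxiliary, observed=False)
]

print(survivors)
```

The only combination ruled out is the one in which both hypotheses are true. The favoured hypothesis survives so long as the auxiliary one is given up, which is just what the “tired light” manoeuvre does.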

It doesn’t follow that no such standard is possible. But instead of focusing on the falsifiability of hypotheses, we should consider instead the way hypotheses are held. If peripheral “excuses” are habitually made so that a hypothesis is held with such tenacity that it is in effect unfalsifiable, what is held is no longer a scientific hypothesis but an ideology. Hypotheses that are made up for the sole purpose of protecting another, favoured hypothesis are called ad hoc hypotheses. Anyone who cares about science should be constantly on the look-out for too great a willingness to make up ad hoc hypotheses. Eagerness of that sort is the mark of ideology rather than science. People in the grip of an ideology can believe almost anything they like, as long as they are prepared to bend over backwards far enough to accommodate their central hypothesis — the one they like.

Followers of Popper who emphasise falsifiability aren’t far wrong, though. What they get right, I think, is the importance of passing tests — by which is meant the honest prediction of otherwise unexpected results, followed by actual observation of the unexpected results. It emphatically does not mean the mere fitting of hypotheses or models to data that have already been gathered and don’t give rise to any sense of surprise.

For a hypothesis to be tested, it must have predictive power. The hypothesis purports to describe things that can’t be observed directly, but it implies things that can be observed, which wouldn’t have been anticipated but for the hypothesis yielding its prediction, and so which without it would seem surprising.

Perhaps even more important than predictive power is explanatory power. A hypothesis that has great explanatory power enjoys a logically similar position to one that has great predictive power, with “bafflement” taking the place of “surprise”. In both cases, something that baffles us (in the case of explanation) or would otherwise surprise us (in the case of prediction) is “newly encompassed”. In both cases, what is newly encompassed is implied by a description of some hidden aspect of reality. Such “drawing back of the curtain on reality” is the magic of science, really, and its ability to do that rightly commands respect and often deserves belief. Anything that doesn’t succeed at that feat — whatever masquerades as science but has no real predictive or explanatory power — doesn’t have a legitimate claim to be believed.

Knowledge and hope

The traditional understanding of knowledge as “justified true belief” is internalist. That is to say, for a true belief to count as an item of knowledge, it must satisfy a third condition of being “justified”, which is a state of mind. Justification is traditionally understood as being internal to the mind.

More recent “naturalized” epistemology is externalist. That is to say, for a true belief to count as an item of knowledge, it must satisfy a third condition of being connected in a special way to the real world “outside” the mind. There are various ways of characterizing this special connection: it must be reliable, it must “track” truth, it must be sustained by a law-like process, the belief in question must be non-accidentally true. I’ll use the word ‘reliable’. But whichever words we use for it, the connection reaches outside the mind, and so part of it is external to the mind.

I think these two ways of thinking about knowledge correspond to “is” and “ought” in an interesting way.

The internalist is looking for justification — and (in theory at least) he can check whether a belief is justified by examining the way it is linked to his other beliefs through relations of implication. Foundationalists think justified beliefs are implied by “basic” beliefs; coherentists think there is a network of mutual implication. Either way, these other beliefs are in the mind, and so they can potentially be put “before the mind” for inspection. According to this understanding of knowledge, we can have assurance that we know. In fact the main thrust of traditional epistemology is “doctrinal”: it’s aimed at assuring the radical sceptic that we do in fact have knowledge. We know something when a belief is justified, and it is justified or not as a matter of fact — an “is”.

Instead of seeking justification, the externalist wants reliability. And he isn’t “looking” for reliability so much as “hoping” for it. He can’t directly check whether the connection he hopes is reliable actually is reliable, because one end of it lies outside his mind. According to this understanding of knowledge, we can’t have an internal assurance that we know, because some aspects of knowledge are aspirational. We aspire to the goal of having reliably true beliefs. To the potential knower, such aspirations are better expressed by the word ‘ought’ than ‘is’. None of the beliefs he already has — as a matter of fact — can imply that these aspirations are met, because “oughts” cannot follow from “is”s alone.

This aspirational aspect of knowledge might be likened to “the object of the game” for a chess-player. The would-be knower and the chess player have goals: to have reliably true beliefs, and to get the opponent’s king into checkmate, respectively. These goals are the object of “oughts”: the would-be knower’s beliefs ought to be reliably true, and the player’s moves ought to bring the goal of checkmate closer. In both cases, the “ought” guides behavior in a nontrivial way.

Of course neither of these is a moral “ought”. Proper epistemic practice obliges us rationally rather than morally to aim for reliably true beliefs. Chess players’ implicit acceptance of the rules of chess — specifically, the rule that specifies the object of the game — obliges them to aim for checkmate. Someone who gets bored and plays “suicidal” chess to end a game quickly isn’t guilty of a moral failing, he’s just not playing chess properly.

The chess player has to interact with his opponent: he can’t focus on his own moves to the exclusion of his opponent’s moves. Analogously, the potential knower has to interact with the world: he can’t focus on his own beliefs to the exclusion of “answers” the world gives in response to the “questions” he “asks” it. In practice, this “questioning” of the world is the testing of hypotheses. To form new beliefs or new theories by simply building on what one already believes (including “data”) is like playing chess without paying attention to your opponent’s moves. In effect, this is what Bayesians do when they make epistemic decisions on the basis of “degrees of belief”. (I shall have more to say about Bayes’ Theorem in a forthcoming blog post.)

The “object of the game” of empirical knowledge is to transcend internal assurances and aim for reliably true beliefs — an external matter, which usually involves the testing of hypotheses.

I mentioned above that traditionally, epistemology was internalist. The tradition continues to this day, and it affects the way epistemology is taught in university philosophy courses: they tend to begin with Descartes’ Meditations, and typically don’t move very far beyond that. They tend to treat Gettier problems as a mere curiosity. Internalism can also affect the way scientists do science. Some sciences — especially those that appeal to “overwhelming evidence” to counter scepticism — use “internalist” methods of shaping models to fit “data” that may as well have been gathered beforehand. In effect, this is to eschew testing in favor of an internalist sense of assurance.

Proper science and knowledge are aimed at truth, not at assurance. Their aspirational aspects entail that testing is essential. To use another analogy: a miser might get assurance from counting coins he already owns, but he can’t make money unless he runs the risk of losing money by investing it in outside projects. In pursuit of truth rather than profit, science too must “cast its net” beyond “data” already gathered.

What is induction?

I use the word ‘induction’ a lot. But the word can be a bit slippery. Hume is celebrated for his “problem of induction”, and he was indeed concerned with what we nowadays call “induction”. But Hume himself didn’t use the word ‘induction’ for what he had his problem with.

What do I mean when I use the word ‘induction’?

A classic example of induction is the inference from “the swans I’ve seen so far have been white” to “all swans are white”. This inference assumes that “nature continues uniformly the same” (as Hume put it), so that as-yet unseen swans are similar in the relevant way to the swans I’ve seen already. Nowadays, we would put it in terms of scientific laws, which describe the sort of reliably universal regularities Hume had in mind.

Putting it in terms of scientific laws conveniently illustrates why induction is often unreliable. Swans are not universally white — some are black. Because swans’ color is not regular in the lawlike way required — in other words because nature does not continue uniformly the same from one swan to the next as far as color is concerned — the inference above is unreliable.

Because inferences like that involve generalization, induction is sometimes characterized as inference “from the particular to the general”, but that is actually a rather poor way of characterizing it. Words like ‘all’ can appear in the “premises” of inductive inferences as well as being absent from their “conclusions”. For example, consider the inference from “all of the electrons observed so far have had a charge of minus one” to “the next electron we observe will have a charge of minus one”, or to “any electron has a charge of minus one”, or to “the electron is a subatomic particle with charge minus one”. Superficially, these look like inferences from greater to lesser generality.

The assumption of lawlike regularity

At the risk of belaboring the point, it isn’t always obvious when induction is involved in an inference. For an induction to be reliable, it has to be underwritten by a lawlike regularity in the real world, whether or not we are aware of it — it’s often a matter of sheer luck. But even when an induction is unreliable because there is no real lawlike regularity, it still counts as a case of induction if it assumes that there is.

So, if we’re wondering whether induction is involved in an inference, it’s probably safer to look for an assumption of lawlike regularity than to look for words that typically signal generalization or extrapolation.

Sometimes the required assumption of lawlike regularity isn’t all that obvious. Suppose we take a sample of people and find that 10% of them have red hair. We then use statistical extrapolation — an application of induction — to claim that 10% of the entire population of the world has red hair. For this to be any better than a shot in the dark, the sample must be representative of the world’s population, at least in respect of the proportion who have red hair. It isn’t enough for the sample to reflect this feature of the larger population by accident — it must do so systematically, so that the generalization from sample to entire population is non-accidental. (Laws are sometimes characterized as “non-accidental generalizations”.)
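The point about representative samples can be sketched in a few lines of Python. Everything here is invented for illustration: the two “regions”, their sizes, and their red-hair rates are assumptions, not real figures. The sketch only shows that a sample drawn from one unrepresentative corner of a population can mislead, while a sample drawn from the whole population tends to land near the true proportion.

```python
import random

def sample_proportion(people, k, rng):
    """Fraction of red-haired people in a simple random sample of size k."""
    chosen = rng.sample(people, k)
    return sum(chosen) / k

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 1 = red hair, 0 = not. The two regions and
# their rates are made up, chosen so that where we sample matters.
region_a = [1] * 3000 + [0] * 7000      # 30% red-haired
region_b = [1] * 2000 + [0] * 88000     # roughly 2% red-haired
population = region_a + region_b        # 5% red-haired overall

true_rate = sum(population) / len(population)
convenience = sample_proportion(region_a, 1000, rng)    # one region only
random_draw = sample_proportion(population, 1000, rng)  # whole population

print(f"true rate          {true_rate:.3f}")
print(f"region A only      {convenience:.3f}")
print(f"whole population   {random_draw:.3f}")
```

The random draw here is only a stand-in for whatever makes a sample systematically representative. The sketch says nothing about how to achieve that in practice.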

Statistical extrapolation

There is more than one way an induction can go wrong. In the current example, if the proportion attributed to the entire population is too precise — for example, if we claim that exactly 10.0001% have red hair because that is the exact proportion in the sample — the detail is overly-fine-grained. Detail of that sort — not underwritten by lawlike regularity — is merely artifactual. That is, it is a misleading by-product of our own methodology rather than a feature of the real world. It is analogous to seeing one of your own eyelashes reflected in the lens of a microscope.
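A rough sketch of why the fine digits are artifactual, assuming a made-up “true” rate of 10% and a made-up sample size: if we draw several samples from the same simulated population, the leading digits of the sample proportion agree while the trailing digits change from one draw to the next. The trailing digits therefore reflect the particular draw rather than the population.

```python
import random

rng = random.Random(1)   # fixed seed so the sketch is repeatable
TRUE_RATE = 0.10         # assumed population rate (invented for the sketch)
SAMPLE_SIZE = 10_000     # assumed sample size (invented for the sketch)

def red_hair_share(rng, n, rate):
    """Share of red-haired people in one simulated sample of size n."""
    return sum(rng.random() < rate for _ in range(n)) / n

# Five independent samples from the same simulated population.
shares = [red_hair_share(rng, SAMPLE_SIZE, TRUE_RATE) for _ in range(5)]
for share in shares:
    print(f"{share:.4%}")

# How far apart the five sample proportions are.
spread = max(shares) - min(shares)
print(f"spread between samples: {spread:.4%}")
```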

Skillful sampling is vital for reliable statistical extrapolation. A sample should be representative of the population as a whole, and that takes skill. Some rigorous-looking statistical methods are meant to estimate how representative samples are, but too often, these methods themselves rely on induction, by extrapolating the variability of samples to the entire population. To my mind, these statistical methods are the products of a quest for assurance rather than a quest for truth. A better idea is to test sampling techniques. For example, the sampling techniques of voter popularity polls before elections are tested by actual election results. Nate Silver accepted credit for predicting the most recent US election results, but more credit is due to the people who were able to get such representative samples of voters.

Non-numerical examples of induction are tricky enough. Things get worse when numbers are involved. Perhaps worst of all is the completely spurious idea that we can have a numerical measure of “how much the conclusion of an induction deserves to be believed”, usually assumed to be some arithmetical function of “the number of instances an induction is based on”.

Induction versus guessing and testing

I hope it is clear that there are several “problems of induction”. It is a distinctly problematic form of reasoning, mostly because it apes deduction. It’s what people come up with when they try to imagine what an argument would look like if it could deliver “empirical” conclusions that do more than just re-arrange ideas expressed in premises. Behind it lies the malign assumption that evidence consists of being shown to be the conclusion of an argument. (I beseech you, dear reader, to reject this assumption!) When combined with popular ideas about science being “based on observation”, induction can acquire a hallowed status — a status it doesn’t deserve. It’s not the “complementary, alternative form of reasoning to deduction”, and it doesn’t appear much in any of the respectable sciences.

Rather than relying on induction, science is mostly a matter of guessing followed by testing. Rather than starting off with observations and proceeding by extrapolating from them in a mechanical way, science starts off with explanatory hypotheses and proceeds by devising tests for them — feats that call for imagination, creativity, and cunning. Rather than seeking assurances in the form of inductive arguments, science seeks truth by casting a wider net to check for falsity.

Karl Popper recognized the centrality to science of making “bold conjectures” that stand despite the possibility of their refutation. He rejected induction altogether as unscientific. But I think he went way too far here. I also think it was ridiculous to claim as he did that a theory’s passing a test doesn’t give us any reason to think the theory is true.

I would argue that passing tests usually gives us good reason for thinking theories are true. I agree with Quine that induction is a special case of the hypothetico-deductive method (of guessing and testing just mentioned) and that a broader understanding of the latter helps to explain why induction is sometimes reliable.

One of the main attractions of induction is that it removes the sense of guesswork from empirical reasoning. Instead of “having a stab” at things, induction “frog-marches” us from observations to conclusion in what might seem a reassuringly “inescapable” way. It has a mechanical feel, like deduction. Let us not be too easily seduced by these attractive features!

Example: modeling a compound pendulum

I’ll try to illustrate why with an example, an artificial one concocted specially to show how induction can fail to deliver the goods. Imagine a compound pendulum of two rigid parts that behaves chaotically: that is, its configuration depends so sensitively on initial conditions that over time its movements are practically unpredictable. And because they are practically unpredictable, they can’t be modeled in a computer. (Not in practice, anyway.) I can make computer versions of compound pendulums that behave just as real compound pendulums do, but I can’t make a single computer model that mimics the behavior of a given actual compound pendulum.

But suppose I don’t know that. Suppose I set out to create a computer model of this very apparatus, encouraged by the thought that each of its two moving parts behaves in a well-understood, completely lawlike way. As “inputs” I obviously use my knowledge of the simple, elegant laws that describe their movements. But I also know that more than that’s involved: I need to experiment a bit with different lengths of the rigid bars that make such a pendulum, different masses, different centers of gravity, different moments of inertia, and so on.

Every time I run my model, adjusting one or other of these input variables, I compare the progress of my computer model and the configuration over time of the actual compound pendulum, to see where they begin to diverge. Although I will make progress at first, there will come a point in every single run at which the divergence becomes significant — too large for me to count my model as a model of that actual, given compound pendulum.
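The kind of divergence described above can be sketched with a toy simulation. This is only an illustration under stated assumptions: it uses the textbook idealization of a double pendulum (two point masses on massless rods) rather than a true rigid-bar compound pendulum, and the masses, lengths, starting angles, and step size are all invented. Two runs of the same model are started one billionth of a radian apart, which stands in for the unavoidable mismatch between a model and the actual apparatus.

```python
import math

G = 9.81           # gravitational acceleration, m/s^2
M1 = M2 = 1.0      # bob masses in kg (invented values)
L1 = L2 = 1.0      # rod lengths in m (invented values)

def derivs(state):
    """Time derivatives of (theta1, omega1, theta2, omega2)."""
    t1, w1, t2, w2 = state
    d = t1 - t2
    den = 2 * M1 + M2 - M2 * math.cos(2 * d)
    a1 = (-G * (2 * M1 + M2) * math.sin(t1)
          - M2 * G * math.sin(t1 - 2 * t2)
          - 2 * math.sin(d) * M2 * (w2 * w2 * L2 + w1 * w1 * L1 * math.cos(d))
          ) / (L1 * den)
    a2 = (2 * math.sin(d) * (w1 * w1 * L1 * (M1 + M2)
                             + G * (M1 + M2) * math.cos(t1)
                             + w2 * w2 * L2 * M2 * math.cos(d))
          ) / (L2 * den)
    return (w1, a1, w2, a2)

def rk4_step(state, dt):
    """Advance the state by one fourth-order Runge-Kutta step."""
    def nudge(s, k, h):
        return tuple(x + h * dx for x, dx in zip(s, k))
    k1 = derivs(state)
    k2 = derivs(nudge(state, k1, dt / 2))
    k3 = derivs(nudge(state, k2, dt / 2))
    k4 = derivs(nudge(state, k3, dt))
    return tuple(x + dt / 6 * (p + 2 * q + 2 * r + s)
                 for x, p, q, r, s in zip(state, k1, k2, k3, k4))

def separation_by_second(offset, seconds=30, dt=0.001):
    """Angle gap between two runs, recorded once per simulated second."""
    a = (2.0, 0.0, 2.0, 0.0)            # both arms raised, released from rest
    b = (2.0 + offset, 0.0, 2.0, 0.0)   # same, but nudged by `offset` radians
    steps_per_second = round(1 / dt)
    gaps = []
    for _ in range(seconds):
        for _ in range(steps_per_second):
            a = rk4_step(a, dt)
            b = rk4_step(b, dt)
        gaps.append(abs(a[0] - b[0]) + abs(a[2] - b[2]))
    return gaps

gaps = separation_by_second(offset=1e-9)
for second, gap in enumerate(gaps, start=1):
    if second % 5 == 0:
        print(f"t = {second:2d} s   angle gap = {gap:.3e} rad")
```

Nothing here is a model of any particular apparatus. The sketch only shows how quickly a tiny initial discrepancy can grow in a system of this kind.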

Now, if I were asked to defend my computer modeling, I might say that I have been working with numerous models, and that they all have been rigorously “tested”. The models that have failed such a “test” have been diligently thrown out, I might claim, and I have learned from my errors by making new models with better initial values for the relevant variables.

But I would just be kidding myself. None of these models is a “bold conjecture”, nor is any of the “tests” anything like the real test of a hypothesis. A real test involves cunning on the part of the experimenter — and sweating palms on the part of the theorist whose reputation is on the line. There is the possibility of failure rather than the expectation of some further tinkering. What we have here instead is the mechanical adjustment of numerical values to fit some “data” that may as well have been gathered beforehand. Rather than individual models being tested, the entire process of model-generation is being adjusted to fit a series of data-points. In effect, it is the fitting of a curve through them. This fitting is guided by the assumption that future motion of the pendulum will “continue uniformly the same” in matching the progress of the computer model. It assumes there is a lawlike connection between them, or at least that one can be found. (It can’t.)

This is induction. It has many of the problems that attend induction. There is no way around the fatal mismatch between model and reality that I purposely built into the example.

A compound pendulum is a very simple apparatus, yet its behavior can’t be captured by induction. I leave it as an exercise for the reader to judge how much hope there is that anyone could model more complicated items, such as the climate of an entire planet subject to many, many more variables.

Proof and science

In many human endeavors, evidence takes the form of proof, or something structurally equivalent to proof. From pure mathematics to everyday discourse, something presently open to question is shown to be derivable via rules of inference from some other things that are presently not open to question. In pure mathematics, theorems are derived from axioms. In everyday discourse, we try to persuade one another of claims by showing that they follow from what we already believe or agree on.

In these endeavors, we start off with stuff we already consider to be “in the bag” (axioms, shared beliefs, whatever) and we go through a rule-guided process that ends with our adding something to our “bag” of claims that we consider “virtually assured or secured, as good as in one’s possession”. (This quotation is taken from the OED’s entry on ‘in the bag’.)

Our most familiar concepts of knowledge and evidence have been shaped by the idea that some beliefs can be decisively considered beyond question, at least for the time being. Knowledge is supposed to consist of truths that are assured, and evidence is supposed to consist of what can give us the assurance.

There is one human endeavor that isn’t like that at all: science. The object of the game of science is not assurance but truth — usually truths about aspects of reality that cannot be observed directly.

I think scientific knowledge exists, but we have to adjust our concept of knowledge to accommodate it. Scientific knowledge is far from certain, because scientific theories are about unfamiliar and often quite strange things that cannot be seen directly. Our best theories about them might be only approximately true. These theories are representations of reality, but we are often unclear how to interpret them. Our understanding of how they represent reality is not always complete.

More to the point, evidence in science cannot be understood as proof, or as having anything like the structure of proof. It does not consist of deriving something via rules of inference from something else that we are already assured of.

Rather, evidence in science consists of piecemeal peripheral indications that we have got it right. The most important such indications are observational tests. Because most scientific theories describe things that can’t be observed directly, we can’t check directly to see whether they are true. Instead, a theory implies things that can be observed directly, and if direct observation confirms what the theory predicts, the theory is corroborated. This is nothing like following a rule of inference, and it delivers nothing like assurance of the sort that prompts us to say “it’s in the bag”.

Other peripheral indications of a theory’s truth are more “aesthetic”. It explains a lot. It seems simple, modest, general, etc. It has the ring of truth. And so on.

Good science is conducted with a clear awareness that scientific evidence has nothing like the structure of proof. Alas — bad science isn’t conducted with that awareness.

Why be sceptical about climate science?

Those who believe climate change is a real problem should present their case with more epistemic responsibility, by which I mean fewer appeals to authority. “Appeal to authority” is an informal fallacy, which I’m sorry to say is exemplified by this article, both in its appeal to the expertise of climate scientists, and to the non-expertise of their critics.

To understand why many genuine sceptics are doubtful about climate science, we have to ask some simple but deep questions about what empirical evidence is, what makes a good scientific theory worth believing, and so on. To talk about “overwhelming evidence” in an apparent vacuum of answers to those questions is not good enough.

A quick answer to the question of what makes a scientific theory worth believing is that it has great explanatory or predictive power, or best of all, both. Examples: evolutionary theory has remarkable explanatory power (which is why Daniel Dennett called it “universal acid”) but rather limited ability to predict the future. By contrast, quantum theory throws up more puzzles than it solves, so its explanatory power is ambiguous at best, but it more than makes up for that shortcoming with its extraordinary predictive power. With both evolutionary theory and quantum theory, we can be very confident that we are “on to something”, even though we may be wrong about the finer details, or have real conceptual difficulties with the currently-available interpretations of the formalisms.

A not-so-quick answer to the same question adds why predictive power is important. Only by yielding checkable predictions can a scientific theory be tested. In testing, the theory (in conjunction with numerous other theories and assumptions) implies something that can be observed. If that something actually is observed, the theory is “corroborated”, which is good news for the theory. And if it isn’t, that’s not such good news for the theory.

Note that this corroboration by observation is nothing like its “being implied by” observation. Observations can’t imply theories that describe unobservable things — in other words, scientific theories aren’t “based on” observations. Rather, our reason for thinking a scientific theory is true is: it would be an odd “coincidence” for it to be able to pass tests yet still be substantially false. The more varied and cunningly devised such tests can get, and the more of an “amazing coincidence” the theory’s falsity would become if it still passes them, the more confident we can be that it actually is true, or at least approximately true.

Evidence for a scientific theory is a special case of evidence for any belief — it consists of other beliefs. As the believer engages with the real world, together these beliefs in effect “test” the belief in question, and if it “passes” the “test” it probably actually corresponds to (i.e. its content accurately describes) the real world. It “works” because it’s probably true. In slogan form, evidence consists of “everything seeming to hang together well”. A good scientific theory is consistent with observations that would expose it as false if it were false — in other words it “stuck its neck out and survived” a trial by observation. More generally, a good belief meshes smoothly with our other beliefs. So we have good evidence for any scientific theory which has passed tests, and which exhibits various “theoretical virtues” such as simplicity, modesty, etc. (i.e. the tell-tale marks of its meshing smoothly with our other beliefs).

Unfortunately, there is a long philosophical tradition of supposing that evidence consists of something else, namely, a theory or belief’s “being based on” something more certain, usually what is fancifully thought of as “raw data” of sense experience. The ubiquity of that supposition reveals itself in everyday language, in which words like ‘grounds’, ‘foundation’ and ‘basis’ are practically synonymous with ‘evidence’. Such words are indeed appropriate in mathematics, where theorems are genuinely based on (i.e. implied by) axioms, but in everyday discourse they are deeply misleading.

That tradition was effectively discredited in the twentieth century by mainstream philosophers of science and a few exceptional figures such as Wittgenstein and WVO Quine. But even in the absence of sophisticated philosophical analysis, it should be obvious from animal behaviour that many animals have knowledge — and their knowledge has nothing to do with their beliefs being “based on data”. Instead it depends on their beliefs being sustained by reliable processes connecting brain and physical surroundings in the real world.

Despite all that, the assumption that evidence in general and scientific evidence in particular is a matter of being “based on data” lives on at the periphery of science — in disciplines like psychology, where routine appeals are made to “studies” which work as starting-points for the genesis of theory rather than tests for the justification of theory. The guiding assumption of these questionable disciplines is that empirical evidence typically takes the form of extrapolation from a sample rather than passing a test. We don’t find that assumption in mainstream sciences like physics, chemistry, biology or in any branch of engineering. But we do find it in climate science.

Although some aspects of climate science (such as the greenhouse effect) belong to mainstream physics and borrow the legitimacy of its methodology, a large part of climate science involves the construction of computer models that are intended to mimic the earth’s climate. They are shaped to fit the “data” of past climate, “data” consisting of both actual measurements and — alarm bells should ring here — “proxy measurements” derived (with a high dependence on theory) from tree-rings, ice-cores, lake-beds, etc.

What normally gives us reason to believe that good scientific theories are true does not give us reason to believe that these computer models are accurate. Why? — We noted above that corroboration by observation was not at all the same as being “based on data”. The role of observation is completely different between the two. In the former, a theory “sticks its neck out and survives”, but in the latter, it is deliberately shaped “after the fact” so that it cannot fail to fit with observation. It is often claimed that these models are “tested”, but it is not testing of the sort that gives us reason to believe a scientific theory, i.e. a test which if passed suggests truth and which if failed suggests falsity. Instead, a model is compared with data which were gathered beforehand, and which the model was later adjusted to fit. In the former case, we have a reason to think we are “on to something”; in the latter, we have nothing better than data-fitting. The reasoning involved is not that of the hypothetico-deductive method of guessing and testing, but induction: the most problematic form of reasoning.

The standard scientific method of guessing followed by testing is called “hypothetico-deductive” because observational predictions are drawn from hypotheses using deductive logic (often mathematics). If actual observation confirms the predictions of a hypothesis, then we have a reason to think the hypothesis is true. This pattern is reversed in induction. It starts off with observations and then extrapolates to arrive at a hypothesis or model. Because this is essentially generalisation, the hypotheses or models so arrived at can only describe or mimic the sort of things that were observed directly in the first place. In other words, induction cannot take us beyond the merely observable. Contrast that with mainstream science, whose hypotheses typically describe unobservable and often very exotic things such as subatomic particles and force fields.

It must be admitted that induction does sometimes give us reason to believe its end-product generalisations. It can be a reliable way of forming beliefs, as long as there is a lawlike connection between the property observed in a sample, and membership of the larger class from which the sample was taken. For example, being green is an essential property of emeralds. We can reliably use induction to infer that all emeralds are green from examining a few of them and observing that they are green. But being white is not an essential property of swans, so we can’t reliably use induction to infer that all swans are white from examining a few of them and observing that they are white.

Simple induction can be reliable when applied to some simple lawlike phenomena. But it won’t be reliable when applied to complicated, patternless, un-lawlike phenomena with non-essential properties. No examination of Beethoven’s nine symphonies, however rigorous, will enable us to extrapolate a tenth symphony. The Earth’s climate is huge, very complicated, and chaotic. The human eye can discern few patterns in its routine variability over the decades. This is not the sort of thing induction can reliably extrapolate from. But suppose we give climate science the benefit of the doubt and grant that it can. Then we would have a reason to think the models are accurate as long as they yield some observable predictions that turn out to be true. But apparently they can’t. Inasmuch as these models have been genuinely “tested” at all, they get it wrong: they routinely fail the “tests” because their checkable predictions have been checked and found wanting. And as long as the climate continues not to warm, as it has not been doing now for about fifteen years, they’re getting wronger by the month.

So much for the empirical evidence, which so far seems to be almost entirely negative. What about the so-called “theoretical virtues” mentioned in passing above? Are climate science’s models simple, modest, general, falsifiable? Do they suggest fruitful new lines of inquiry?

To these questions, I think we must answer no, no, no, no, and no, respectively. First, climate science’s models are not simple but spectacularly complicated. Second, what they claim and aim to achieve is strikingly immodest: to mimic the behaviour of something that looks completely random to that master pattern-recognizer, the human eye coupled with the human brain. Third, quite unlike Newton’s laws or the theory of natural selection, say, the models apply to a very narrow corner of nature, the climate of a single unique planet. Fourth, because the models are adjusted rather than rejected when found not to fit “data” they were designed to fit, and because they are more or less accurate rather than literally true or false, nothing seems to count as a clear “fail”. And if no such “fail” would lead climate scientists to decisively reject a model, that model is in effect unfalsifiable. Fifth, nothing in climate science’s models suggests any new ideas or avenues of exploration in any other science. It’s a rigorous road, but a barren one that doesn’t lead to any new or unexpected places.

Defenders of these models tend to be very dismissive of non-experts, as if the general public don’t have the finesse to judge what goes on in the rarefied atmosphere of climate science. Hence the endless, shameless and fallacious references to themselves as “experts”, to the riffraff as thuggish “deniers”, and the ever more authoritarian assertions that the latter are just going to have to take their word for it. I think this is a terrible mistake, as it overlooks the subjective and aesthetic aspects of judgements about truth. Just as the ordinary human mind is a brilliant recognizer of patterns, it’s also pretty good at hearing the “ring of truth”. Non-specialists can see that some scientific theories have an austere beauty, that some of them “work”, and so on. Climate science has been falling out of favour with the general public because they can see it is big, ugly, complicated, and it hasn’t been “working”. So there’s nothing much to persuade us that it’s true or accurate.