We know, with great certainty, that the overall average temperature of the Earth has risen by several degrees in the last 400 years, since the end of the Little Ice Age. Before that was a period called the Medieval Warm Period; before that was another cold period; and back at the time of the Romans there was a long period that was significantly warmer, when southern Britain was a wine-growing region. What we’re a lot less certain about is “why?”

Of course, the “why?” here has been, shall we say, pretty controversial. It’s worth wondering about the controversy and about the social mechanisms through which science is done — I wrote about them during the Climategate controversy as the “social contract of science” — but that’s not what I want to talk about today. Instead, let’s talk about how a scientist thinks about these sorts of questions and arrives at new answers. Back in grad school we called that “doing science,” and it was something everyone liked doing and wished they could be doing instead of whatever they actually were doing, like faculty meetings and refereeing papers.

The process of “doing science” is something you usually learn more or less by osmosis, but there are some good hints around. One of the best is a paper from the 16 October 1964 issue of Science, “Strong Inference” by John R. Platt. Let’s say we have some phenomenon of interest, like global warming, or high blood sugar, or that damned yellow patch in my lawn. We want to know why it happens. Platt’s strong inference describes the process we should use when “doing science” as:

  1. We generate a number of alternate explanations, hypotheses, that might explain the phenomenon.
  2. For each hypothesis, we come up with an experiment that could prove the hypothesis wrong. That is, not one that “proves the hypothesis,” but one with a possible outcome that would disprove or falsify it. (Sir Karl Popper argued in his book The Logic of Scientific Discovery that this falsification was the core of scientific knowledge.)
  3. We do the experiments. If an experiment falsifies a hypothesis, we discard it ruthlessly. Then we go back to (1) and try again.
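
As a rough sketch, the loop above can be written out in a few lines of Python. Everything in it is made up for illustration: the `Hypothesis` class, the lawn hypotheses, and the canned observations are my own assumptions, not anything from Platt's paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Hypothesis:
    name: str
    # Returns True if the observations contradict (falsify) this hypothesis.
    falsified_by: Callable[[Dict[str, bool]], bool]


def strong_inference(hypotheses: List[Hypothesis],
                     observations: Dict[str, bool]) -> List[Hypothesis]:
    """Run each falsification test and keep only the hypotheses that survive."""
    return [h for h in hypotheses if not h.falsified_by(observations)]


# Hypothetical example: that yellow patch in the lawn.
observations = {
    "patch_is_watered": True,      # the sprinkler does reach the patch
    "grubs_found_in_soil": False,  # dug up a sample, found no grubs
    "dog_uses_that_spot": True,    # the dog has been seen there
}

candidates = [
    # Each test names the observation that would rule the hypothesis out.
    Hypothesis("not enough water", lambda o: o["patch_is_watered"]),
    Hypothesis("grubs eating the roots", lambda o: not o["grubs_found_in_soil"]),
    Hypothesis("the dog", lambda o: not o["dog_uses_that_spot"]),
]

survivors = strong_inference(candidates, observations)
print([h.name for h in survivors])
```

Note that a surviving hypothesis has not been proved. It has only failed to be disproved so far, which is why the last step sends you back to the top with sharper hypotheses and better experiments.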

A lot of times the rub, and the really creative thinking, comes in finding the right experiment. Richard Feynman was known for an ability to see right through a problem to a simple and elegant experiment that would disprove a hypothesis. He demonstrated this during the review following the Challenger disaster. You may remember that the launch happened on a very cold morning in January 1986; less than two minutes after launch the Space Shuttle Challenger blew up, killing all seven astronauts.

The question, as always, was “why?”

From films and debris, it appeared that the solid rocket motors had failed first, sending a blowtorch of hot gas into the external tank, which then exploded. The solid rocket motors were built of a stack of components containing the solid fuel, which were then joined to make the whole rocket motor; it appeared, in fact, that one of the joints had failed.

One proposed explanation was that the cold had made the O-ring seals at the joints stiff. During a public, televised hearing, management people from the solid rocket manufacturer discounted this idea. Feynman, who was one of the members of the all-star panel doing the investigation, quietly got a small clamp and a glass of ice water. The panel had a sample of the O-ring material that had been provided as a prop for the hearing. Feynman, without making a fuss about it, squeezed his sample of O-ring in the clamp, dropped it in the ice water, and let it chill.

Here’s the strong-inference part of this. The Thiokol managers’ hypothesis was that the O-ring material remained “sufficiently” flexible at the temperature it would have reached on that unusually cold Florida morning. Feynman’s experiment simply said “okay, so let’s get a piece of this stuff cold and see what happens.”

The answer, which Feynman proceeded to demonstrate in a nationally televised hearing, was that the stuff lost its springiness. What had been a soft, rubbery material at room temperature came out of the cold stiff; once it had been squeezed, it stayed squashed instead of snapping back into shape.

So much for the managers’ hypothesis.


As we said, we’re pretty confident that there has been significant warming since the Little Ice Age. The controversy around “climate change” or “global warming” is all about why it’s happening; the UN-approved explanation is that humans are releasing gases into the atmosphere that cause less heat to be radiated out into space, thus causing the average temperature to rise, what’s called the “greenhouse effect.”

Aside: Now, just to try to forestall one of the usual threads of argument, there really is very little question the greenhouse effect actually exists. Without it, a planet at the Earth’s distance from the Sun, reflecting as much sunlight as the Earth does, would sit at an average of about -18 °C (roughly 0 °F), well below freezing. So let’s not have the “but there’s no such thing” argument, okay?
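
For anyone who wants to check that kind of figure, here is a back-of-the-envelope sketch of the usual textbook calculation: balance the sunlight a sphere absorbs against the heat it radiates, using the Stefan-Boltzmann law. The solar constant and the albedo values are standard round numbers that I am assuming here; the answer moves by tens of degrees depending on how reflective you assume the body is.

```python
SOLAR_CONSTANT = 1361.0            # W/m^2, sunlight arriving at Earth's distance (assumed)
STEFAN_BOLTZMANN = 5.670374419e-8  # W/(m^2 K^4)


def equilibrium_temperature_kelvin(albedo: float) -> float:
    """Temperature at which a sphere radiates away exactly what it absorbs."""
    # Divide by 4: the sphere intercepts sunlight on a disc but radiates
    # from its whole surface, which has four times the area.
    absorbed = SOLAR_CONSTANT * (1.0 - albedo) / 4.0
    return (absorbed / STEFAN_BOLTZMANN) ** 0.25


cases = [
    ("Earth-like albedo (0.30)", 0.30),   # assumed reflectivity of today's Earth
    ("perfectly black rock (0.00)", 0.0),
]
for label, albedo in cases:
    t_kelvin = equilibrium_temperature_kelvin(albedo)
    print(f"{label}: {t_kelvin:.0f} K = {t_kelvin - 273.15:.0f} °C")
```

With the Earth-like albedo this comes out near 255 K, some thirty-odd degrees colder than the roughly 288 K actually measured at the surface. That gap is the greenhouse effect.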

If we plan another game of strong inference, what we want is an explanation for the rise in temperature, and particularly for the amount the temperature has gone up. There are a whole lot of different things that might explain it:

  • Human-generated greenhouse gases might be doing it.
  • Human changes in land use — like lots of asphalt highways, which are pretty black — might be causing the Earth to absorb more heat.
  • There might be measurement error — we have to estimate the temperature based on thermometers around the world. Some of these thermometers themselves have had some pretty significant changes in their environments, like having a parking lot built around them, that would make the temperature higher locally. This would increase the measured average temperature, which would make it harder to find a natural explanation.
  • There might be variations in the Sun’s output that cause the changes.
  • There could be other factors than greenhouse gases that cause the Earth to retain more heat. (One interesting possibility that’s being explored is that more cosmic radiation might be causing cloud cover to change.)

Performing a good experiment to test each of these hypotheses is difficult: we can’t just make a spare Earth with no people, no roads, a different influx of cosmic rays, and a different Sun, while holding everything else at standard conditions. So we have to use other methods.


Into this comes modeling. One way of looking at any hypothesis is that it’s a model of the way you think the phenomenon is behaving: “If I’m right, then this will happen.” Medical experiments share a lot of characteristics with these climate experiments, and in particular, you can’t take a lot of human patients, use them as lab rats, and mess with them just to see what happens.

What’s more, both patients and planets have the troublesome characteristic that they tend not to be very predictable. No one really expected snow in Boulder in the middle of May, as happened this year, even though it certainly had happened before. If there is a lot of variation happening with a lot of different causes, we have to use statistics to try to tell the different causes apart.

So let’s imagine we’re testing a new hay fever drug, an early summer topic near and dear to my nose. I take the new drug, and my hay fever gets better. Cool.

But what if I would have gotten better anyway, just because the cottonwood trees have stopped emitting cottonwood fluff or something?

We want to know what would have happened if we did nothing. The assumption that the treatment made no difference, that things would have gone the same way without it, is called the null hypothesis. Statistically, we want to know whether we can tell that what we did had an effect.

Notice this isn’t the same as saying “tell if it had an effect or not.” The new hay fever drug might have worked, but I might have at the same time caught a cold — so the hay fever drug might have been working, but my nose could still be stopped up and my eyes itching.

(Or half the freaking state of Colorado could be on fire and it’s the smoke that’s bothering me. But the Colorado fires are another column.)
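
One simple way to put a number on “can we tell?” is a permutation test, sketched below. The symptom scores are invented for illustration, and the function name is my own; the idea is just to shuffle the “drug” and “placebo” labels many times and see how often chance alone produces a gap as big as the one observed.

```python
import random
from statistics import mean

# Hypothetical symptom scores (0 = fine, 10 = miserable) after one week.
drug_group = [3, 4, 2, 5, 3, 4, 2, 3]
placebo_group = [5, 6, 4, 7, 5, 4, 6, 5]


def permutation_p_value(treated, control, n_shuffles=10_000, seed=0):
    """Fraction of random relabelings whose gap is at least as large as the observed one."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed_gap = mean(control) - mean(treated)
    pooled = list(treated) + list(control)
    at_least_as_large = 0
    for _ in range(n_shuffles):
        # Under the null hypothesis the labels are meaningless, so shuffle them.
        rng.shuffle(pooled)
        fake_treated = pooled[:len(treated)]
        fake_control = pooled[len(treated):]
        if mean(fake_control) - mean(fake_treated) >= observed_gap:
            at_least_as_large += 1
    return at_least_as_large / n_shuffles


p = permutation_p_value(drug_group, placebo_group)
print(f"observed improvement: {mean(placebo_group) - mean(drug_group):.2f} points")
print(f"fraction of shuffles doing at least that well by chance: {p:.4f}")
```

A small fraction says the null hypothesis is a poor explanation for these numbers. A large one says only that we can’t tell the drug from chance, which is not the same as showing the drug does nothing.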

So now let’s talk about climate. (You thought I’d never get there, didn’t you?) We know there’s warming; there is a hypothesis that this is caused by an increase in the amount of greenhouse gases, and particularly in carbon dioxide, in the air as a result of human action. The core of this hypothesis is a model, a picture of how we think things work, that computes what the average temperature should be based on the increase in CO2. These models all center around a value — either assumed in, or computed in some way — of how the temperature responds to changes in CO2 concentration. This is called sensitivity.
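
To make “sensitivity” concrete, here is a toy version of the relationship, assuming the common simplification that each doubling of CO2 adds a fixed amount of warming. The function name, the 280 ppm baseline, and the sensitivity values are illustrative assumptions of mine, not figures taken from any of the models discussed here.

```python
import math


def warming_from_co2(co2_ppm: float,
                     baseline_ppm: float = 280.0,
                     sensitivity_per_doubling: float = 3.0) -> float:
    """Temperature change (°C) predicted for a given CO2 concentration.

    Assumes warming grows with the logarithm of concentration, so every
    doubling of CO2 adds `sensitivity_per_doubling` degrees.
    """
    return sensitivity_per_doubling * math.log2(co2_ppm / baseline_ppm)


# The same CO2 level gives very different predictions depending on the
# sensitivity you assume, which is why that one number matters so much.
for sensitivity in (1.5, 3.0, 4.5):
    delta = warming_from_co2(400.0, sensitivity_per_doubling=sensitivity)
    print(f"sensitivity {sensitivity} °C per doubling -> {delta:.2f} °C at 400 ppm")
```

Real climate models are enormously more complicated than this, but the sketch shows why an argument about sensitivity is an argument about the whole prediction.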

The models that have been proposed by the “mainstream” climate science community all have substantial sensitivity to CO2 concentration — which is as it should be. That’s the hypothesis. But since this is a statistical property we’re measuring, they also have a range of values that are plausible predictions. This is stated as a confidence interval, which just means “we’re willing to bet you 20-1 that the temperature will stay in this range based on the assumption that human-released CO2 is causing the warming.”

These models are now running into a problem, though — the warming appears to have, well, stopped. For 17 years. Even though the amount of CO2 in the atmosphere has gone up.

In global warming, the null hypothesis would be that the “treatment” has no effect, or in other words that the human-caused increase in CO2 is overwhelmed by other effects. And again, note that this isn’t the same as saying “it has no effect,” just that we can’t tell if there has been an effect.

At which point, we can bring in this lovely little chart, stolen from The Economist magazine.


The dark line is the actual measured temperature; the light blue band is the “95 percent confidence interval,” which is to say, that band of 20 to 1 odds. The hypothesis from the models lays down a bet of about 20 to 1 that the temperature will stay inside that light blue band.

The real temperature, however, says otherwise. It’s either already out of that confidence interval or it’s very close to the edge.
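
The check being made here is simple enough to sketch. The numbers below are invented, not read off the chart, and treating the spread of model runs as a normal distribution is a simplifying assumption on my part; the point is only to show what “inside or outside the band” means.

```python
from statistics import mean, stdev

# Hypothetical projections (°C anomaly) from several model runs for one year.
model_runs = [0.62, 0.71, 0.58, 0.80, 0.66, 0.75, 0.69, 0.73]
observed = 0.48  # hypothetical measured anomaly for the same year

center = mean(model_runs)
spread = stdev(model_runs)

# About 95% of a normal distribution lies within 1.96 standard deviations
# of its mean; that is the "20 to 1" band.
lower = center - 1.96 * spread
upper = center + 1.96 * spread

inside = lower <= observed <= upper
print(f"95% band: {lower:.2f} to {upper:.2f} °C")
print(f"observed {observed:.2f} °C is {'inside' if inside else 'outside'} the band")
```

One year outside the band is the 20-to-1 long shot coming in, which does happen. Staying outside year after year is what strong inference would count as a failed test.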

Why? We don’t know. Time for some new hypotheses.


Images courtesy Shutterstock / Samot / Kingan / sextoacto