Is the Science Ever Settled? Theories, Hypotheses, and What Science Really Does


Any time you write an article talking about the limits of science, as I did in my recent “The Importance of Not Being Certain,” someone will invariably show up to say “evolution is just a theory.” This pretty much always means the person saying it doesn’t have a very good grasp of either “evolution” or what a “theory” is, and so I thought, “okay, that’s a column.”

Let’s look into Darwin’s theory of the origin of species, and in the process let’s look at how science works to build confidence in something that is “just a theory.”

Here’s what Darwin actually said: species originate through natural selection among different characteristics.

He didn’t have access to the nearly 160 years of research that followed. He just made an argument from his observations at the time. Ernst Mayr (this is cribbed from Wikipedia) summarized Darwin’s argument as follows:

  • Every species is fertile enough that if all offspring survived to reproduce the population would grow (fact).
  • Despite periodic fluctuations, populations remain roughly the same size (fact).
  • Resources such as food are limited and are relatively stable over time (fact).
  • A struggle for survival ensues (inference).
  • Individuals in a population vary significantly from one another (fact).
  • Much of this variation is heritable (fact).
  • Individuals less suited to the environment are less likely to survive and less likely to reproduce; individuals more suited to the environment are more likely to survive and more likely to reproduce and leave their heritable traits to future generations, which produces the process of natural selection (inference).
  • This slowly effected process results in populations changing to adapt to their environments, and ultimately, these variations accumulate over time to form new species (inference).

Two things to notice here: first of all, this is a theory: “a supposition or a system of ideas intended to explain something.”

This theory takes the form of an argument: “a sequence of logically connected assertions that lead to a conclusion.”

Some of these assertions are well established, so Mayr called them “facts,” and others he called “inferences,” which is to say they’re conclusions drawn from the observations before them.

We’ll come back to that in a second, but I want to point out something else: at no point does Darwin make an argument for where life “came from,” how what we call life originated. So everyone winding up to throw “but evolution doesn’t explain the origin of life” at me, just stop. Darwin and evolutionary theory don’t explain that, because it’s not their job.

Those inferences are the key. Every one of those steps should be examined closely, but the ones labeled as “fact” are pretty non-controversial.

Now we’re coming to the way science is actually done. There is a whole topic of the “philosophy of science” and lots of people making their livings writing about what science is and does, but for actual working scientists there are some pretty common guidelines.

A very influential paper on how to learn to do science is John Platt’s “Strong Inference.” While it’s controversial among philosophers of science, it’s not primarily a philosophy paper; it’s a “How To” paper.

Before we talk about strong inference, though, we need to take a side trip for some terminology. A hypothesis for which there exists a conceivable experiment that could prove it wrong is falsifiable. Performing that experiment and finding that the hypothesis really is wrong is a falsification. So, if my hypothesis is “Coffee is instantly fatal,” it’s falsifiable, because you can drink coffee, and if you don’t die, the hypothesis isn’t true. So I take a drink of coffee … yep, I’m still here, and so I’ve falsified the hypothesis.

So, Platt recommends a process that goes basically like this:

1. You observe something.

2. You make several hypotheses. In this case, a hypothesis means a statement about the observation that proposes an experiment that would show the hypothesis is wrong -- in other words, it’s falsifiable.

3. You perform the experiments. Every hypothesis that is actually falsified, you discard.

4. With what you’ve learned, you go back to 1 and repeat.

When you’re down to just one hypothesis, or at least when you’ve excluded some hypotheses, you’ve got a result, which you publish in a form that lets other people replicate your experiment and verify your results. Then what do you do? Remembering that science is never settled, you or someone else goes back and does it again.
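For readers who think in code, here is a minimal sketch of the elimination step in that loop, using the coffee example from above. The function name strong_inference and the way experiments are represented (as small functions that return True when the observation contradicts the hypothesis) are my own illustrative assumptions, not anything taken from Platt’s paper.

```python
from typing import Callable, Dict, List


def strong_inference(hypotheses: Dict[str, Callable[[], bool]]) -> List[str]:
    """Return the names of hypotheses that survive their experiments.

    Each value is a hypothetical "experiment" function that returns True
    when the observed result contradicts (falsifies) the hypothesis.
    """
    survivors = []
    for name, experiment_falsifies in hypotheses.items():
        if experiment_falsifies():
            continue  # falsified, so we discard it
        survivors.append(name)
    return survivors


# Toy observation from the coffee example: I drank coffee and I'm still here.
still_alive_after_coffee = True

hypotheses = {
    # Falsified if the drinker is still alive afterwards.
    "Coffee is instantly fatal": lambda: still_alive_after_coffee,
    # Falsified if the drinker is NOT alive afterwards.
    "Coffee is not instantly fatal": lambda: not still_alive_after_coffee,
}

print(strong_inference(hypotheses))  # ['Coffee is not instantly fatal']
```

Note that the surviving hypothesis hasn’t been proven; it just hasn’t been knocked down yet, which is why the real process loops back and tries again.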

Think about what happens over time. As this process goes on, you accumulate more and more explanations for some original set of observations that you’ve attempted to falsify, but haven’t been able to falsify. The more of those you have, the greater your confidence that you’ve got at least a good partial explanation.

Now, let’s look at those inferences that are the more controversial parts of the argument.

A struggle for survival: I think this one is pretty obvious, but let’s think about why it’s obvious.

Anyone with an outdoor cat will eventually see evidence that some birds or field mice or lizards lost their struggle for survival. But how could we falsify the assertion that there is a struggle for survival? Well, if there were no struggle, we’d expect to see a lot of animals who die peacefully of old age; instead, we see predation and disease. So, the more times we look for peaceful chipmunks dying in bed surrounded by loving family and friendly neighbors, and don’t find them, the more confident we are that “nature red in tooth and claw” is the way to bet.

That one is easy, but that’s why it makes a good first example. We’re already quite used to the notion that other things die in order for some to survive as we eat our corned-beef sandwich. But now let’s look at the other inferences.

Individuals less suited to the environment are less likely to survive and less likely to reproduce: This one is just on the edge of being circular, because the definition of being suited to the environment is that you survive and reproduce. So, let’s look at it with an analogy.

Let’s say we have a thousand coins. I’d use silver dollars, but they’re hard to find, and I’d use quarters, but this is a low-budget experiment: let’s assume a thousand pennies. We set up some kind of machine that can flip all the pennies simultaneously, and we declare that each penny is one individual of some species. If the coin comes up tails, it’s dead, and we remove it from the game.

Flip once, and there are roughly 500 left. (Usually. There’s a whole ’nother column needed to talk about the Gambler’s Ruin and why this is only an approximate statement.) Flip twice, and there are roughly 250 left, and so on.
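The arithmetic here is just repeated halving: on average, 1000 × (1/2)^n pennies are left after n flips. Here is a small sketch that compares that average with one simulated run. The function names are mine, and the random seed is only there so the run can be repeated; a real run will wobble around the expected value.

```python
import random


def expected_survivors(start: int, flips: int) -> float:
    """Average number of fair pennies still alive after the given flips."""
    return start * 0.5 ** flips


def simulate_fair(start: int, flips: int, rng: random.Random) -> int:
    """Flip every surviving fair penny each round; tails means removal."""
    alive = start
    for _ in range(flips):
        # Count the pennies that come up heads (probability 1/2 each).
        alive = sum(1 for _ in range(alive) if rng.random() < 0.5)
    return alive


print(expected_survivors(1000, 1))  # 500.0
print(expected_survivors(1000, 2))  # 250.0

rng = random.Random(0)  # fixed seed, an arbitrary choice for repeatability
print(simulate_fair(1000, 2, rng))  # somewhere near 250, rarely exactly 250
```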

But now, let’s slip in some ringers, unfair coins: some always come up heads, some always come up tails. Flip the pennies once, and all the all-tails ringers die, along with roughly half the fair coins. Eventually, if you flip the coins long enough, there will be nothing left but the all-heads ringers. That’s natural selection.
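Here is a rough sketch of the machine with ringers mixed in. The starting mix (900 fair pennies and 50 of each kind of ringer) and the names in the code are assumptions I made up for illustration; the column doesn’t specify how many ringers there are.

```python
import random
from collections import Counter

# Probability that each kind of penny comes up heads (heads = survives).
HEADS_PROB = {"fair": 0.5, "all_heads": 1.0, "all_tails": 0.0}


def flip_round(population, rng):
    """Flip every penny once and keep only the ones that come up heads."""
    return [kind for kind in population if rng.random() < HEADS_PROB[kind]]


def run(population, rounds, rng):
    """Apply the given number of flips and count the survivors by kind."""
    for _ in range(rounds):
        population = flip_round(population, rng)
    return Counter(population)


# Assumed starting mix, chosen only for illustration.
start = ["fair"] * 900 + ["all_heads"] * 50 + ["all_tails"] * 50
rng = random.Random(0)  # fixed seed so the run is repeatable

print(run(start, 1, rng))   # all_tails is gone; about half the fair pennies remain
print(run(start, 30, rng))  # almost certainly only the all_heads ringers remain
```

Nothing in the code “prefers” the all-heads pennies. The survival rule does all the work.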

Now, let’s add a bit to our analogy. After each flip we make exact duplicates of the pennies that came up heads -- the ones that didn’t die -- enough to fill in the board so we always come back to 1000 pennies.

Flip again, and half the fair pennies survive, but all the duplicate all-heads ringer pennies survive. The longer this goes on, the more ringers there are, until after some length of time you have all pennies that always come up heads. (Strictly mathematically, you can’t be positive the fair pennies will ever be eliminated; the probability is always greater than zero that there’s at least one survivor. But that’s certainly the way to bet.)
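The refill step can be sketched the same way. In this version I assume the duplicates are made by copying randomly chosen survivors until the board is back to its original size, and I assume a starting mix of 990 fair pennies with 5 of each kind of ringer. Both are my own choices for illustration, not part of the original thought experiment.

```python
import random
from collections import Counter

# Probability that each kind of penny comes up heads (heads = survives).
HEADS_PROB = {"fair": 0.5, "all_heads": 1.0, "all_tails": 0.0}


def next_generation(population, board_size, rng):
    """Flip every penny, then copy random survivors to refill the board."""
    survivors = [kind for kind in population if rng.random() < HEADS_PROB[kind]]
    if not survivors:
        return []  # everything died, so there is nothing left to copy
    copies = rng.choices(survivors, k=board_size - len(survivors))
    return survivors + copies


def all_heads_history(population, generations, rng):
    """Count the all-heads ringers at the start and after each generation."""
    board_size = len(population)
    history = [Counter(population)["all_heads"]]
    for _ in range(generations):
        population = next_generation(population, board_size, rng)
        history.append(Counter(population)["all_heads"])
    return history


# Assumed starting mix, chosen only for illustration.
start = ["fair"] * 990 + ["all_heads"] * 5 + ["all_tails"] * 5
history = all_heads_history(start, 60, random.Random(0))
print(history[::10])  # the ringer count only ever goes up
```

Because an all-heads penny can never die, its count never drops from one generation to the next, while the fair pennies keep losing about half their number before each refill.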

However, notice something: this happened entirely because we arbitrarily chose that a coin that comes up heads survives, and the others die. If we’d made the opposite decision, the all-tails pennies would have been better suited to their environment, and eventually all the pennies would be all-tails pennies.

The final inference:

This slowly effected process results in populations changing to adapt to their environments, and ultimately, these variations accumulate over time to form new species.

Our same little penny-flipping machine shows us exactly that: eventually the population of mixed all-heads, all-tails, and fair pennies becomes a population of all-heads. The fair pennies become extinct. (I’ll leave it as an exercise to figure out where the original all-heads pennies came from.) The population has “evolved” until it’s more fit for an environment in which coming up tails means death.

We shouldn’t reason too much from analogy, though; we ought to see if we can bring this back to real life, which gets complicated, because the idea of a “species” is more slippery than you’d think.

The traditional definition is “a group of living organisms consisting of similar individuals capable of exchanging genes or interbreeding.” But this definition has some surprises: a jackass and a mare can interbreed and produce a mule. Mules are usually infertile, but sometimes a female, a “molly mule,” can become pregnant. The famous ligers and tigons -- lion-tiger crosses -- also show that sometimes apparently pretty distant species are inter-fertile. Going the other direction, populations that are separated may develop different characteristics but still interbreed quite happily.

So, let’s think of a different hypothesis. We now have much more advanced methods of exploring genetics: we can sequence whole genomes of organisms, reduce these to sequences of symbols, and then compare them using computer methods. When you do this, you get trees of more and more distant common ancestors.
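To give a feel for what “compare them using computer methods” means, here is a toy sketch that counts the positions at which short symbol sequences differ (the Hamming distance) and finds the closest pair. The four sequences and their labels are invented for illustration and are not real genomic data, and real phylogenetic software uses far more sophisticated alignment and tree-building methods than this.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return sum(x != y for x, y in zip(a, b))


# Made-up sequences for four hypothetical organisms.
toy_sequences = {
    "organism_a": "ACGTACGTAC",
    "organism_b": "ACGTACGTAT",
    "organism_c": "ACGTTCGGAT",
    "organism_d": "TCGATCGGAT",
}

# Distance between every pair of organisms.
distances = {
    (p, q): hamming(toy_sequences[p], toy_sequences[q])
    for p, q in combinations(toy_sequences, 2)
}

for pair, dist in sorted(distances.items(), key=lambda item: item[1]):
    print(pair, dist)

closest = min(distances, key=distances.get)
print("closest pair:", closest)  # ('organism_a', 'organism_b'), distance 1
```

Grouping the closest pair first, then the next closest, and so on, is the basic intuition behind building a tree of more and more distant common ancestors.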

So, my siblings and I have one closest common male ancestor, my father. We also turn out to be the only direct descendants of my grandfather Hobson, but then we share our great-grandfather Jefferson Davis Martin (born in Georgia in 1861 and guess which side his family was on) with a whole mess of other people -- Jeff had 13 or 14 kids. This process of associating organisms by the common ancestors is called phylogenetics, and the descendants of a common ancestor form a group called a clade.

If species arise through natural selection, you would expect all the examined species to form a single phylogenetic tree, or at most a very few trees. So, to falsify the notion of natural selection leading to new species, you would examine the genomes of a lot of organisms. If they don’t form a clean tree, that would seem to falsify this notion of natural selection.

Sure enough, the phylogenetic trees of big groups of organisms have been constructed, and they do form simple “clean” trees. What’s more, organisms that are actually very far apart -- like humans and fruit flies, or humans and broccoli -- still share a significant number of genes and very similar chemical mechanisms that support life, which suggest we have common ancestors, too. So, this attempt to falsify natural selection fails.

Now, let’s go back through this and summarize: Darwin’s idea of evolution, that species arise through natural selection over time, requires us to gain confidence in three inferences Darwin made:

First, that there is a struggle for survival, and some individuals don’t live long enough to reproduce. We can appeal to experience to tell us this is true.

Second, that some individuals are more likely to survive and reproduce in an environment, and so are better suited to that environment, which we call “natural selection.”

Third, over time, this natural selection can change the characteristics of a population. Our penny-flipping thought experiment shows how this might happen, but more importantly, we then looked back at the actual genes of various organisms and saw that they do appear to be related genetically as this would predict.

That means that we are reasonably confident in Darwin’s whole argument.

It also means that if we can perform an experiment that falsifies any of those steps, we lose confidence in Darwin’s argument.

As things stand today, Darwin’s notion of evolution, especially when we extend it with the things we’ve learned in the intervening 160 or so years, has stood up very well to attempts to falsify it.

That’s what science as a process really is: that process of observing, proposing explanations, and then trying to knock those explanations down. Eventually, you have only a few explanations left standing: our best explanations for what we observe in the real world. It’s that collection of best explanations that we call “science.”