Climategate Computer Codes Are the Real Story
The "Read Me" file of a harried programmer who couldn't replicate the scientists' warming results.
November 24, 2009 - 4:09 pm
So far, most of the Climategate attention has been on the emails in the data dump of November 19, but the emails are only about 5 percent of the total. What does examining the other 95 percent tell us?
Here’s the short answer: it tells us that something went very wrong in the data management at the Climatic Research Unit.
We start with a file called “HARRY_READ_ME.txt.” This file contains notes on someone’s three-year effort to turn a pile of existing code and data into something useful. Who is Harry, you ask? Clearly, a skilled programmer with some expertise in data reduction, statistics, and climate science. Beyond that I won’t go. I’ve seen sites attributing this file to an identifiable person, but I don’t have any corroboration, and frankly the person who wrote these years of notes has suffered enough.
The story the file tells is of a programmer who started off with a collection of code and data and needed to replicate some results. The first entry:
1. Two main filesystems relevant to the work:
Both systems copied in their entirety to /cru/cruts/
Nearly 11,000 files! And about a dozen assorted “read me” files addressing individual issues, the most useful being:
(yes, they all have different name formats, and yes, one does begin ‘_’!)
Believe it or not, this tells us quite a bit. “Harry” is starting off with two large collections of data on a UNIX or UNIX-like system (forward slashes, the word “filesystem”) and knows only very generally what the data might be. He has copied it from where it was to a new location and started to work on it. Almost immediately, he runs into problems:
6. Temporarily abandoned 5., getting closer but there’s always another problem to be evaded. Instead, will try using rawtogrim.f90 to convert straight to GRIM. This will include non-land cells but for comparison purposes that shouldn’t be a big problem …  noo, that’s not gonna work either, it asks for a “template grim filepath,” no idea what it wants (as usual) and a serach for files with “grim” or “template” in them does not bear useful fruit. As per usual. Giving up on this approach altogether.
Things aren’t going well. Harry is trying to reconstruct results that someone else obtained, using their files but without their help.
8. Had a hunt and found an identically-named temperature database file which did include normals lines at the start of every station. How handy — naming two different files with exactly the same name and relying on their location to differentiate! Aaarrgghh!! Re-ran anomdtb:
Okay, this isn’t so unusual, actually, but unless you document and describe your file structure, it’s pretty much opaque to a new reader. Still, Harry presses on:
11. Decided to concentrate on Norwich. Tim M uses Norwich as the example on the website, so we know it’s at (363,286). Wrote a prog to extract the relevant 1961-1970 series from the published output, the generated .glo files, and the published climatology. Prog is norwichtest.for. Prog also creates anomalies from the published data, and raw data from the generated .glo data. Then Matlab prog plotnorwich.m plots the data to allow comparisons. First result: works perfectly, except that the .glo data is all zeros. This means I still don’t understand the structure of the .glo files. Argh!
Poor Harry is in the first circle of programmer hell: the program runs fine; the output is wrong.
He presses on:
17. Inserted debug statements into anomdtb.f90, discovered that a sum-of-squared variable is becoming very, very negative! Key output from the debug statements:
some test output…
forrtl: error (75): floating point exception
IOT trap (core dumped)
..so the data value is unbfeasibly large, but why does the sum-of-squares parameter OpTotSq go negative?!!
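The excerpt doesn’t say why OpTotSq went negative. One common way a sum of squares ends up below zero is fixed-width integer overflow: square a large enough value in a signed 32-bit integer and the running total wraps around. Whether that is what happened inside anomdtb.f90 is an assumption, not something the notes quoted here confirm. A minimal Python sketch of the effect, with hypothetical readings and function names:

```python
import ctypes


def sum_of_squares_int32(values):
    """Accumulate squares in a simulated signed 32-bit integer.

    ctypes.c_int32 does no overflow checking, so each assignment wraps
    the running total the way a 32-bit accumulator would.
    """
    total = 0
    for v in values:
        total = ctypes.c_int32(total + v * v).value
    return total


def sum_of_squares_exact(values):
    """Same sum using Python's arbitrary-precision integers."""
    return sum(v * v for v in values)


# Hypothetical readings: one value is unfeasibly large, as in Harry's notes.
readings = [120, 135, 50000, 98]

print("32-bit accumulator:", sum_of_squares_int32(readings))
print("exact sum:         ", sum_of_squares_exact(readings))
```

With these made-up numbers the 32-bit total comes out negative while the exact sum stays positive, which is the same symptom Harry’s debug statements showed.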
This is not good: the existing program produces a serious error when it’s run on what is supposed to be the old, working data. Harry presses on, finding a solution to that bug and working through many more issues as he tries to recreate the results of these runs for the data from 1901 to 1995. Finally he gives up. He has spoken to someone about what should be done:
AGREED APPROACH for cloud (5 Oct 06).
For 1901 to 1995 – stay with published data. No clear way to replicate process as undocumented.
For 1996 to 2002:
1. convert sun database to pseudo-cloud using the f77 programs;
2. anomalise wrt 96-00 with anomdtb.f;
3. grid using quick_interp_tdm.pro (which will use 6190 norms);
4. calculate (mean9600 – mean6190) for monthly grids, using the published cru_ts_2.0 cloud data;
5. add to gridded data from step 3.
This should approximate the correction needed.
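The arithmetic in steps 2 through 5 is a baseline shift: take anomalies relative to the 1996-2000 mean, then add (mean9600 – mean6190) so they are expressed relative to the 1961-1990 normal. A minimal Python sketch for a single grid cell and calendar month, assuming made-up cloud values and hypothetical function names (the real work is done by the Fortran and IDL programs named in the notes):

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def anomalise(values, baseline):
    """Step 2: express each value as a departure from a baseline mean."""
    return [v - baseline for v in values]


def rebaseline(anomalies, mean_recent, mean_normal):
    """Steps 4 and 5: add (mean_recent - mean_normal) to every anomaly."""
    offset = mean_recent - mean_normal
    return [a + offset for a in anomalies]


# Hypothetical values for one cell, one month, 1996 through 2002.
raw_1996_2002 = [62.0, 66.0, 63.0, 65.0, 64.0, 70.0, 58.0]
mean6190 = 60.0                     # assumed 1961-1990 normal
mean9600 = mean(raw_1996_2002[:5])  # 1996-2000 mean of the same series

shifted = rebaseline(anomalise(raw_1996_2002, mean9600), mean9600, mean6190)
direct = anomalise(raw_1996_2002, mean6190)

print(shifted)
print(direct)
```

In this toy case the two lists match exactly, because both means come from the same series. In the plan quoted above, the means come from the published cru_ts_2.0 grids rather than from the newly converted data, which is presumably why the notes only claim the result “should approximate” the correction.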
Catch that? They couldn’t recreate the results, so they’re going back to their published data for the first 95 years of the 20th century. Only …
Next problem — which database to use? The one with the normals included is not appropriate (the conversion progs do not look for that line so obviously are not intended to be used on +norm databases).
They still don’t know which database to use for 1996 to 2002. Harry gives up; it’s easier to write new code.
22. Right, time to stop pussyfooting around the niceties of Tim’s labyrinthine software suites – let’s have a go at producing CRU TS 3.0! since failing to do that will be the definitive failure of the entire project.
This kind of thing is as fascinating as a soap opera, but I want to know how it comes out. Near the bottom of the file, I find:
I am seriously close to giving up, again. The history of this is so complex that I can’t get far enough into it before by head hurts and I have to stop. Each parameter has a tortuous history of manual and semi-automated interventions that I simply cannot just go back to early versions and run the update prog. I could be throwing away all kinds of corrections – to lat/lons, to WMOs (yes!), and more.
The file peters out with no conclusions. I hope they find this poor guy, and he didn’t hang himself in his rooms or something, because this file is a summary of three years of trying to get this data working. Unsuccessfully.
I think there’s a good reason the CRU didn’t want to give their data to people trying to replicate their work.
It’s in such a mess that they can’t replicate their own results.
This is not, sadly, all that unusual. Simply put, scientists aren’t software engineers. They don’t keep their code in nice packages and they tend to use whatever language they’re comfortable with. Even if they were taught to keep good research notes in the past, it’s not unusual for things to get sloppy later. But put this in the context of what else we know from the CRU data dump:
1. They didn’t want to release their data or code, and they particularly weren’t interested in releasing any intermediate steps that would help someone else replicate their results.
2. They clearly have some history of massaging the data — hell, practically water-boarding the data — to get it to fit their other results. Results they can no longer even replicate on their own systems.
3. They had successfully managed to restrict peer review to what we might call the “RealClimate clique” — the small group of true believers they knew could be trusted to say the right things.
As a result, it looks like they found themselves trapped. They had the big research organizations, the big grants — and when they found themselves challenged, they discovered they’d built their conclusions on fine beach sand.
But the tide was coming in.