Why Government Computers Are Such a Mess

Every so often, I get reminded that I'm old. I've been programming for almost 60 years, which is a long time, and 60 years in the business means I've seen a lot of things that young, naïve programmers have never seen.

This comes up often when people talk about DOGE and the Wizards Academy Musk has put together to help investigate fraud, abuse, and, probably most of all, bureaucratic stupidity.

One of the things I see people (technical people, but young) saying about systems like Social Security and the IRS is "just dump the whole database into Hadoop."

The problem with that starts with the fact that it's not in one database. It's a wildly heterogeneous collection of different databases, ISAM files, and card images, and I would bet money that a lot of it is on old 7-track tapes. Some of these are probably stored at Iron Mountain or a similar installation. Also, some of the data may still exist only on paper, as government retirement records apparently do.

So what Big Balls and the other wizards are going to need to do to start with is find the data.

I'm willing to bet there's no single catalog of all the data sets. And once the data has been found, much of it will turn out to be card images that are almost certainly documented only by COBOL copybooks. (Back in 2020, I wrote about COBOL for the Stack Overflow Blog when it suddenly became trendy because some of these systems desperately needed to be maintained.)

A COBOL copybook is exactly like what in other programming languages is called an "include file": code that's stored in a separate place and literally copied into the program as it's compiled. In COBOL, a copybook might look like:

       01  RECORD.
           05  ID                      PIC S9(4) COMP.
           05  COMPANY.
               10  SHORT-NAME          PIC X(10).
               10  COMPANY-ID-NUM      PIC 9(5) COMP-3.
               10  COMPANY-ID-STR
           05  METADATA.
               10  CLIENTID            PIC X(15).
               10  REGISTRATION-NUM    PIC X(10).
               10  NUMBER-OF-ACCTS     PIC 9(03) COMP-3.

(This is cribbed from a GitHub repository and under an Apache open-source license.)
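To make that layout a little more concrete, here is a minimal Python sketch that works out how many bytes an elementary field occupies, which is the first thing you need before you can slice up a record. It assumes the usual IBM mainframe storage rules (COMP stored as a 2-, 4-, or 8-byte binary integer; COMP-3 packed two digits per byte plus a sign nibble) and handles only the simple PIC X(n) and PIC 9(n) forms used in the copybook above. The name field_bytes is my own illustrative helper, not part of any library.

```python
import re

# Matches only simple clauses such as "PIC X(10)" or "PIC S9(4) COMP".
PIC_PATTERN = re.compile(r"PIC\s+(S?)([X9])\((\d+)\)(?:\s+(COMP-3|COMP))?")


def field_bytes(clause: str) -> int:
    """Return the storage size in bytes for one simple PIC clause."""
    match = PIC_PATTERN.fullmatch(clause.strip().rstrip("."))
    if match is None:
        raise ValueError(f"unsupported clause: {clause!r}")
    _sign, kind, count, usage = match.groups()
    digits = int(count)
    if kind == "X":
        if usage is not None:
            raise ValueError("alphanumeric fields cannot be COMP or COMP-3")
        return digits  # one byte per character
    if usage == "COMP-3":
        return digits // 2 + 1  # two digits per byte, plus the sign nibble
    if usage == "COMP":
        # Binary halfword, fullword, or doubleword depending on digit count.
        if digits <= 4:
            return 2
        if digits <= 9:
            return 4
        return 8
    return digits  # zoned decimal: one byte per digit


sizes = [
    field_bytes(clause)
    for clause in ("PIC S9(4) COMP.", "PIC X(10).", "PIC 9(5) COMP-3.")
]
```

Summing the sizes in order gives each field's offset within the record. Real copybooks also use REDEFINES, OCCURS, and edited pictures, none of which this sketch attempts.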

There's an additional complication: most mainframe data is encoded in EBCDIC, a different character encoding from ASCII, which is what's commonly used in the rest of the world.

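Now, translating EBCDIC text to ASCII is really straightforward, and there's no reason you can't use a COBOL copybook to build code in some more hacker-friendly language. The catch is that binary and packed-decimal fields, like the COMP and COMP-3 items in the copybook above, can't be run through a character translation at all; they have to be decoded field by field, which is exactly why you need the copybook. Thinking about it, that might be a job that could be partly handed off to some AI, but then you have to validate the code, which will take time on its own.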
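As a rough illustration of what that hand-built code looks like, here is a Python sketch that decodes just the first three elementary fields of the copybook above (ID, SHORT-NAME, and COMPANY-ID-NUM). It makes several assumptions: the text is in EBCDIC code page 037 (Python's cp037 codec; real files may use a different code page), binary fields are big-endian, and the offsets are hard-coded by hand rather than derived from the copybook. The function names are mine.

```python
import struct


def unpack_comp3(raw: bytes) -> int:
    """Decode a packed-decimal (COMP-3) field: two digits per byte, sign last."""
    digits = []
    for byte in raw[:-1]:
        digits.append(byte >> 4)
        digits.append(byte & 0x0F)
    digits.append(raw[-1] >> 4)  # last byte holds one digit and the sign
    sign_nibble = raw[-1] & 0x0F
    if any(d > 9 for d in digits):
        raise ValueError(f"not a valid packed decimal: {raw.hex()}")
    value = int("".join(str(d) for d in digits))
    # 0xD and 0xB mean negative; 0xC, 0xA, 0xE, and 0xF mean positive/unsigned.
    return -value if sign_nibble in (0x0D, 0x0B) else value


def decode_record(raw: bytes) -> dict:
    """Decode a 15-byte record prefix: ID, SHORT-NAME, COMPANY-ID-NUM."""
    if len(raw) != 15:
        raise ValueError(f"expected 15 bytes, got {len(raw)}")
    (record_id,) = struct.unpack(">h", raw[0:2])       # PIC S9(4) COMP
    short_name = raw[2:12].decode("cp037").rstrip()    # PIC X(10), EBCDIC
    company_id = unpack_comp3(raw[12:15])              # PIC 9(5) COMP-3
    return {"ID": record_id, "SHORT-NAME": short_name, "COMPANY-ID-NUM": company_id}


# A made-up sample record, built by hand for the demonstration.
sample = struct.pack(">h", 42) + "ACME".ljust(10).encode("cp037") + bytes([0x12, 0x34, 0x5F])
decoded = decode_record(sample)
```

Even this toy version shows why validation takes time: one wrong offset or one wrong guess about the code page and every field after it decodes to plausible-looking garbage.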

The point is that not even Big Balls is going to be able to do it in just a few days.

The reason — or one of the reasons — DOGE needs to do this is to carry out an actual audit. Something we already know is that this data is — as we say in the business — a mess. (Actually, we would be more vehement than that, but Google doesn't like it when we swear in an article.)

Actually, it occurs to me that some of the data may not be described in a copybook, but only in the COBOL or PL/I or HLASM source code. So someone would actually have to track down the code and read it.

Anyway, as I say, we already know that the data is a mess. A Social Security Account Number (SSAN) is what we usually call a "Social Security number." It should, by definition, uniquely identify one person. The Social Security Administration says it never reuses an SSAN, and in 2011 it started assigning new numbers randomly, within limits, in part to stretch out the supply of unused numbers.

There is a whole history of this that is interesting only to nerds, and especially security nerds, but the gist of it is that when you get a Social Security number, it has the familiar format 123-45-6789 with a few additional rules, and it should be uniquely yours.
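For the curious, the format and the "few additional rules" can be sketched in a few lines of Python. The rules encoded here are the ones I understand the SSA to publish: the first group is never 000, 666, or anything from 900 to 999, the middle group is never 00, and the last group is never 0000. Treat that list as my summary rather than an authoritative specification, and note that passing this check says nothing about whether a number was ever actually issued. The function name is_plausible_ssn is mine.

```python
import re

SSN_PATTERN = re.compile(r"(\d{3})-(\d{2})-(\d{4})")


def is_plausible_ssn(text: str) -> bool:
    """Check the 3-2-4 digit format plus the basic structural rules."""
    match = SSN_PATTERN.fullmatch(text)
    if match is None:
        return False
    area, group, serial = match.groups()
    if area in ("000", "666") or area >= "900":
        return False  # areas that are never assigned
    if group == "00" or serial == "0000":
        return False  # all-zero group or serial is never assigned
    return True


checks = {
    ssn: is_plausible_ssn(ssn)
    for ssn in ("123-45-6789", "000-12-3456", "900-12-3456", "123-00-4567")
}
```

A check like this catches typos and obvious junk, but it cannot catch the real problem, which is a perfectly well-formed number attached to more than one person.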

And yet there are Social Security numbers that are assigned to as many as eleven different people.

How can that happen? Simple: they f***ing fouled up. (Darned Google.)

When this was first reported, a lot of people said, "That can't happen!"

Except, apparently, it can.

Now, if the data were stored in a unified database, records for a new person with an already-used SSAN would immediately lead to an error message, and the new record would be dumped on the floor to be manually fixed.
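Here is a minimal sketch of that behavior, using Python's built-in sqlite3 module as a stand-in for a real relational database. The table and column names are invented for the example. Declaring the SSAN as the primary key is what makes the database itself refuse the second record.

```python
import sqlite3


def try_insert(conn: sqlite3.Connection, ssan: str, name: str) -> bool:
    """Insert one person; return False if the SSAN is already taken."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO person (ssan, name) VALUES (?, ?)", (ssan, name)
            )
        return True
    except sqlite3.IntegrityError:
        # The uniqueness constraint rejected the row; a real system
        # would route it to a queue for manual review.
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (ssan TEXT PRIMARY KEY, name TEXT NOT NULL)")

first = try_insert(conn, "123-45-6789", "Alice Example")
second = try_insert(conn, "123-45-6789", "Bob Example")
row_count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
```

The second insert fails and only one row ends up in the table. The check costs nothing extra to write, but it only works if every record passes through the same database.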

Clearly, that isn't happening, and at least part of the reason is that we don't actually have a unified database.
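Without a single database to enforce uniqueness, duplicates have to be hunted down after the fact, once the records from each source have been decoded. A rough Python sketch of that audit step follows. The two "extracts" are made-up sample data, find_shared_ssans is my own name, and comparing names by trimming and upper-casing them is a deliberately naive assumption; real record matching is much harder.

```python
from collections import defaultdict


def find_shared_ssans(*sources):
    """Return SSANs that appear with more than one distinct name."""
    holders = defaultdict(set)
    for source in sources:
        for ssan, name in source:
            # Naive normalization so trivial spelling variants collapse.
            holders[ssan].add(name.strip().upper())
    return {
        ssan: sorted(names)
        for ssan, names in holders.items()
        if len(names) > 1
    }


# Hypothetical records pulled from two different legacy sources.
tape_extract = [("123-45-6789", "Alice Example"), ("234-56-7890", "Carol Example")]
isam_extract = [("123-45-6789", "Bob Example"), ("234-56-7890", "carol example ")]

shared = find_shared_ssans(tape_extract, isam_extract)
```

Here only 123-45-6789 is reported, since the two Carol records normalize to the same name. Every hit still needs a human to decide whether it is fraud, a clerical error, or a maiden name.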

To start with, what we now think of as a "proper" relational database was only invented in 1970 and didn't reach commercial use until around 1980, with Oracle Version 2 in 1979 and IBM SQL/DS in 1981.

Social Security numbers, however, started being issued in November 1936, and the SSA only started computerizing in 1956, with major computerization starting in 1961.

According to Musk, there are people in their 150s shown as active on the Social Security rolls, that is, people born around 1875.
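The arithmetic behind that is simple, and so is the sanity check that would flag such records. The Python sketch below assumes an "as of" date in 2025 and a cutoff of 115 years for a plausible age. Both are my assumptions for illustration, as are the sample records and the function names.

```python
from datetime import date

MAX_PLAUSIBLE_AGE = 115  # assumed cutoff, chosen for illustration only


def age_on(birth: date, as_of: date) -> int:
    """Whole years between birth and as_of."""
    had_birthday = (as_of.month, as_of.day) >= (birth.month, birth.day)
    return as_of.year - birth.year - (0 if had_birthday else 1)


def implausibly_old(records, as_of: date):
    """Return the (label, birth_date) records older than the cutoff."""
    return [r for r in records if age_on(r[1], as_of) > MAX_PLAUSIBLE_AGE]


# Made-up records: (label, date of birth).
records = [("record A", date(1875, 1, 15)), ("record B", date(1950, 6, 1))]
as_of = date(2025, 3, 1)  # assumed audit date

flagged = implausibly_old(records, as_of)
oldest_age = age_on(records[0][1], as_of)
```

A flagged record is not proof of fraud. It may just be a missing or defaulted date of death, which is one more reason the cleanup needs people who can read the old code.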

Now, one might be suspicious that this was actually done on purpose. I'm willing to bet it wasn't. But it was done by people who didn't have a complete picture of the whole system (I bet there's nobody who does), making changes to handle something odd that happened and getting it done by Tuesday because their annual review was coming up.

One truism in computer security, though, is that if there's a flaw that can be exploited, it will be exploited. From what the Wizards Academy has already found out, it's pretty clear this one has been, by everyone from the guy who keeps cashing great-granddad's Social Security checks because, hell, the government sent them, to actual active fraud rings.

Cleaning it up would make Hercules think cleaning the Augean stables was a cinch.

Actually fixing it will be even worse.