
PJM Lifestyle

by
Charlie Martin


June 27, 2013 - 8:30 pm


Editor’s Note: If you have not yet made Charlie Martin one of your regular, Read-Everything-They-Write authors, then I submit this article from him for your consideration. Over the past 6 months Charlie has emerged as one of PJ Lifestyle’s most engaging, intelligent contributors. His 13 Weeks Self-Improvement Experiment is giving birth to a movement. What began as Charlie trying various methods to improve his health so he could live longer has now set the theme for each Saturday with Sarah Hoyt, me, and, beginning this Saturday, Rhonda Robinson also following his lead. So I’ve asked Charlie to start writing more on other subjects too. In addition to his 13 Weeks reports on Saturdays, also tune in each Sunday for his Buddhism reflections and Thursdays for his science geek articles. This is the first in his science series and I can’t wait to see what he comes up with next.

David Swindle

*****

So, there’s this NSA thing. Since the stories about the NSA, Edward Snowden, PRISM, and so on broke, there has been more misinformation, disinformation, bad information, speculation, ignorant commentary, and flat-out nonsense going around than on any topic in recent memory. And to tell you the truth, I’ve been working on this article for two weeks without finishing it, because there is always one more howler. Let’s see if we can clear some of this up.

One of the things I’m going to complain about, by the way, is the number of authoritative opinions being offered by people who clearly don’t know much more about it than what they’ve read in other people’s poorly informed speculation. Someone might then reasonably ask why they should believe me, especially since recently I seem to have been mainly a diet and health blogger. So let’s just summarize.

I started working on defense systems in the late ’70s, when I got a polygraph clearance (an “EBI,” or “extended background check” clearance) and went to work on some very sensitive stuff, and no, I can’t tell you what, even today. But I’ve spent a fair bit of time overseas, “covert” under the law that Valerie Plame certainly wasn’t covert under, and I’ve worked directly with both the CIA and the NSA on many occasions.

Then when I went to graduate school, I got involved in DARPA-funded security research, where I came up with the original architecture for a highly secure version of the X windowing system, and helped write the Navy’s handbook for evaluating secure and trusted systems under the old DoD TCSEC — the “Orange Book.” I’ve been a security subject matter expert on projects for Sun Microsystems, StorageTek, the Navy, and a half-dozen major banks and Wall Street firms, and I’ve got about a dozen patents either issued or in process, many of them having to do with security “in the cloud,” cryptography, and Big Data.

Basically, secure systems, cryptography, and Big Data have been my day job for most of the time since about 1979.


Top Rated Comments   
And I think if people are going to complain about the contents of an article, they should read it.
41 weeks ago
Try harder, FC. It's worth reading.
41 weeks ago
A good description of the overall process and good definition of metadata. Here are a couple of other thoughts from someone that builds statistical models for speech processing:

1) I am almost sure that the NSA is also building models to detect "unusual" call patterns from the metadata, perhaps run only on calls that have one terminus out of the U.S., as is authorized by FISA. The definition of unusual would be those patterns that tend to predict forthcoming terror attacks. I'm sure you could build such a model that would have a tolerable ROC curve (balance of false alarms with missed detections).

2) It would be valuable in building such a model to include _all_ calls' metadata, not just calls with a foreign terminus. You would presumably use at least some of the US-only metadata records as part of the "baseline" model - i.e. calls in that category would be deemed "usual". In general, the modern pattern recognition algorithms can always benefit from having more relevant data, as long as they have good heuristics for cleaning it up.

3) I can think of no reason why these detection algorithms would need to be built and run on unencrypted metadata. As long as the encryption is consistent, the algorithms could be built and run on encrypted data (area codes and country codes might need to be readable, but not the 7-digits, times, specific-locations, etc.) They would work just as well, certainly within the range of the inherent uncertainty.

4) This suggests that a less invasive approach to this program would be to have the network providers encrypt the metadata at the source prior to sending to the government databases. Then, when an "unusual" pattern fires, the NSA needs to go to the FISA court to get a "decryption" warrant which they present to the phone/internet company, who then decrypts the records specified in the warrant.
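The idea in points 3 and 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the key, the phone numbers, and the `pseudonymize` helper are all made up, and a keyed hash (HMAC) stands in for the "consistent encryption" described above. A keyed hash is not reversible, so in this sketch the carrier would have to keep its own lookup table to answer a warrant.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical secret held only by the carrier, never shared with the analyst.
CARRIER_KEY = b"example-key-held-by-the-carrier"


def pseudonymize(number: str, key: bytes = CARRIER_KEY) -> str:
    """Return a consistent token for a 10-digit number, leaving the area code readable."""
    area_code = number[:3]
    token = hmac.new(key, number.encode(), hashlib.sha256).hexdigest()[:12]
    return f"{area_code}-{token}"


# Made-up call records: (caller, callee) as 10-digit strings.
calls = [
    ("3035550101", "2125550199"),
    ("3035550101", "2125550199"),
    ("3035550101", "7205550123"),
    ("7205550123", "2125550199"),
]

clear_counts = Counter(calls)
masked_calls = [(pseudonymize(a), pseudonymize(b)) for a, b in calls]
masked_counts = Counter(masked_calls)

# Because the same number always maps to the same token, the pattern of
# who calls whom, and how often, has the same shape in both views.
same_shape = sorted(clear_counts.values()) == sorted(masked_counts.values())
print(same_shape)
```

Changing the key changes every token, so rotating it would force any accumulated token-to-number knowledge to be rebuilt from scratch.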

Now, I know that, given enough encrypted/decrypted pairs, the smart guys at the NSA would eventually be able to break the code, and decrypt everything themselves. Or, if they really wanted to, given area/country codes and known phone-number and calling patterns, they could eventually figure out the encryption without decrypted records. You could work around that by forcing the entire database to be updated with a new encryption key after each warrant is served.

But, you could never absolutely prevent this. At some point, you need to have confidence that your government is not dedicated to violating the 4th amendment (and FISA). And, even if you believe that NSA scientists would not spend valuable time doing this, you would need to have faith that their politician bosses were not going to give them a "do it or leave" ultimatum.

I don't have such faith, and the behavior of the past two administrations has done nothing to change my mind.

But, more importantly, if we were willing to clearly name the enemy in this war, and use all necessary force to decisively annihilate it, then none of these programs would be necessary.
41 weeks ago
All Comments   (137)
Yottabytes of metadata. Greek to me.
41 weeks ago
It's simple question time.
Today's simple question is directed at those who have issues with the NSA in general and these programs in particular.

What Do You Want To Do?
Substantive real-world answers only, please. (Well-deserved) rants against Obama and "follow the Constitution" don't count.

I've been asking this question since this story broke, and so far, I've not gotten a real answer.
41 weeks ago
Me? I want to have bacon for breakfast.

As far as the NSA thing, it's not what I want to do, it's what I want them not to do.
41 weeks ago
Suppose that Americans make 100 billion phone calls a day, and each call requires 1000 bytes of metadata (that’s probably a lot more than needed). That’s roughly 100 terabytes of metadata per day, or 365 x 100 (36,500) terabytes per year. That amount of data can be stored in a space about as big as my front closet.

The news is filled with articles about the enormous data storage facility that NSA is constructing in Utah. So far, everything we know that they are grabbing, if stored efficiently, would fit in a one-car garage. Is this another colossal government waste of space, or could NSA maybe not be telling us the whole story?
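The back-of-envelope figures are easy to check. The sketch below uses only the commenter's own assumptions (100 billion calls a day, 1,000 bytes per call) and decimal units; under those assumptions the yearly total comes to 36,500 terabytes, or 36.5 petabytes.

```python
CALLS_PER_DAY = 100_000_000_000  # 100 billion, the commenter's assumption
BYTES_PER_CALL = 1_000           # also the commenter's assumption
TERABYTE = 10**12                # decimal (SI) terabyte
PETABYTE = 10**15                # decimal (SI) petabyte

bytes_per_day = CALLS_PER_DAY * BYTES_PER_CALL
tb_per_day = bytes_per_day // TERABYTE
tb_per_year = tb_per_day * 365
pb_per_year = bytes_per_day * 365 / PETABYTE

print(tb_per_day, "TB per day")
print(tb_per_year, "TB per year")
print(pb_per_year, "PB per year")
```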

41 weeks ago
Heh, in the day job we build a box that will store 4 times that in a 4U rack-mount box. We like Big Data.
41 weeks ago
OK, so that means those huge buildings have even more space. What else is NSA packing in there, in addition to "metadata"? I'd postulate that they are capturing and storing digital content of all kinds, in truly massive amounts, including the actual content of cellphone communications (which is digitized by your phone, ready for storage), emails, text messages, Facebook, Twitter, YouTube videos, etc. When the FBI, CIA, or NSA gets curious about an individual, they can obtain a secret warrant to search this huge database for every event you generate, including the content, not just the metadata, for several years. Pretty soon, you'll have a DNA profile as well, if you ever get your blood drawn (or even your stools checked) at your doctor's office.
41 weeks ago
Something in between, I suspect. Capturing *everything* would be exabytes (10^18 bytes) a year; I worked it out in a comment on the PRISM piece, and that would mean consuming the entire production of disk drives, and building a whole new major power plant just for the NSA every two or so years. http://pjmedia.com/lifestyle/2013/07/04/what-is-prism/
40 weeks ago
Dropped a decimal point. You'd need eight of our boxes. Still less than one rack.
41 weeks ago
If you actually go there you will see that 'peta' is a Greek word like 'meta'.
41 weeks ago
I ordinarily would not take on someone like Charlie Martin, but these are trying times. And I am trying. And, yes, 'found wanting' is an easy answer. The ongoing discussion of metadata is much more important than most people understand.
41 weeks ago
WOW, so you're THAT Charlie Martin. Should have recognized your picture.

I have been thinking about TRANSEC in the modern Internet age for a few years now. This whole metadata thing has gotten me thinking about it again. E.g., how much intelligence can be deduced from the metadata of an XML document?
41 weeks ago
Well, it -- he said like a good consultant -- depends. But think about an OPML document -- the metadata, the structure of the XML, gives a pretty clear picture of the document, enough so that in something like an outliner there may be little there *but* metadata.
41 weeks ago
Mr. Charlie Martin, I implore you to explain why the NSA program to collect specific instances of telephone numbers, times and locations is not a data (versus metadata) collection program? I have heretofore respected you much as an excellent scientist and writer!
41 weeks ago
And I implore you to figure out that you're arguing with a generally-accepted technical definition. By definition metadata is indeed data: it's data about data. You're arguing that I'm saying ground beef isn't beef because it's ground.
41 weeks ago
How many billion dollars is our government going to spend collecting yottabytes of our metadata?
41 weeks ago
It'd take a yotta them.

A yottabyte is a billion petabytes, so that's still probably beyond even the government.
41 weeks ago
Looks like it's time for another Professor PJ Dumb Post Award.

Look, poopsie. It's called metadata because it's data about the data. That doesn't make it any less "data". I assume you can manage a computer, since you're commenting here, so go look at any folder you like in your file browser, Windows Explorer, Finder, or at the command line get a directory listing.

You get a listing of file names. Say one of those names is nsa.txt, and the contents of it are the words "Come get me NSA, this is my data." The words in the file are the data. The name of the file, which is kept in a special kind of file called a "directory," is the metadata. Why is it "metadata"? Because it doesn't tell you what's in the file, it tells you the name of the file. If I change the file name to "myNameIsCharlie.txt", the content of the file doesn't change.
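For anyone who wants to try this, here is a minimal Python sketch of the same experiment. It works in a throwaway temporary folder, and the file names and text are the ones from the example above; nothing else is assumed.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    old_path = os.path.join(folder, "nsa.txt")
    new_path = os.path.join(folder, "myNameIsCharlie.txt")

    # The data: the words stored inside the file.
    with open(old_path, "w") as f:
        f.write("Come get me NSA, this is my data.")

    # Change only the metadata: the name recorded in the directory.
    os.rename(old_path, new_path)

    with open(new_path) as f:
        content_after = f.read()

    # The directory listing and the file size are metadata too.
    listing = os.listdir(folder)
    size_in_bytes = os.stat(new_path).st_size

print(listing)
print(content_after)
```

The listing shows only the new name, while the words read back from the file are exactly what was written.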

Now, is the metadata important too? You bet. Among other things, in six months when I'm looking for my NSA column, I'm going to have one hell of a time finding it named "myNameIsCharlie.txt". If you read the entire article, you'll see that I even cited a paper showing surprising results of what could be learned just from the metadata, and that I'm against collecting it.

Really.

41 weeks ago
Data is actual stuff. Metadata is theoretical stuff. Actual phone numbers are data, actual locations are data, actual transaction times are data.
41 weeks ago
Whoo, does this comment make me dumb? Hey Mr. Genius, perhaps you can figure it out all by yourself.
41 weeks ago
Yes. Congratulations.
41 weeks ago
Pretty easy dude. It is totally easy to be totally wrong.
41 weeks ago
First of all, thanks for the entertainment you two. With all due respect however, this is an argument primarily over semantics. When it comes to phone calls and emails, there is more "data" in the metadata than in the content.

As CM implies however, metadata holds the real value and for the government to dismiss it as unimportant to our privacy is disingenuous at best. By relating and triangulating metadata, their analysts learn more about us than our actual words would tell them.

So, "all other things being equal" I guess you both have a point.
41 weeks ago
"metadata holds the real value and for the government to dismiss it as unimportant to our privacy is disingenuous at best. "

Yes and no. Yes, metadata is tremendously important, but "the real value"? As compared to the content of the conversations? I don't think that's valid.

But are they being disingenuous? Well, no, they are lying, but I think we agree on that.

They want us to think it's no big deal, but it IS a very big deal. The metadata is capable of telling them a LOT about me and you and every other person they want to intimidate.

It's still metadata.
41 weeks ago
When you have one bad guy (as is pointed out in the article), the metadata is the starting point to finding the members of the organization. You don't (yet) care about call content. You care about things like how many times a guy called another guy, how regularly it happened, where he called from, how long he was on the phone, etc. You can quickly learn without analyzing call content who this guy has close relationships with, who are mere acquaintances, and who this guy has regular contact with.
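As a rough illustration of that first pass, the Python sketch below tallies calls and talk time per contact from nothing but metadata. The records, the names, and the `contact_summary` helper are all invented for the example; real analysis would be far more involved.

```python
from collections import defaultdict

# Made-up metadata records: (caller, callee, duration in seconds). No content.
records = [
    ("suspect", "A", 600),
    ("suspect", "A", 540),
    ("A", "suspect", 720),
    ("suspect", "B", 30),
    ("suspect", "C", 45),
    ("C", "suspect", 60),
    ("B", "C", 300),  # does not involve the suspect, so it is ignored
]


def contact_summary(records, person):
    """Count calls and total talk time between `person` and each contact."""
    summary = defaultdict(lambda: {"calls": 0, "seconds": 0})
    for caller, callee, seconds in records:
        if caller == person:
            other = callee
        elif callee == person:
            other = caller
        else:
            continue
        summary[other]["calls"] += 1
        summary[other]["seconds"] += seconds
    return dict(summary)


summary = contact_summary(records, "suspect")

# Rank contacts by total talk time: close relationships float to the top.
ranked = sorted(summary, key=lambda c: summary[c]["seconds"], reverse=True)
print(ranked)
```

Even in this toy data, contact A stands out as a close relationship and B as a passing acquaintance, without a single word of any call being examined.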

FROM THERE you proceed to call content.
The dangerous part is (again, as is pointed out in the article) when the government suddenly puts members of the "Insert Your Organization Here" organization on their hit list. In other words, when YOU are on the political enemies' list. That's why the Constitution has the Fourth Amendment to begin with. Because next thing you know, you're in a prayer group, and you get rounded up along with your wife, kids, parents, in-laws and your insurance agent.
40 weeks ago
Probably the first order of business is 'everybody brush up on their Greek.' There are a number of Greek prefixes or words whose meanings I assume I understand. I'm wrong at least half the time.
41 weeks ago
One of my grandmothers studied Greek. She told me that walking 'on' the water could also be translated as walking 'in' the water. Not being familiar with Gretian, I demur.
41 weeks ago
Thank you MRG01. I agree with you that Charlie Martin has a point. I also agree with you that I too have a point.
41 weeks ago
I'm glad I am old. You who are younger and buy into this stuff will get the life you deserve. So-called metadata collected by your government will eventually enslave you in the name of keeping you secure.
41 weeks ago
I take it all back. If you think I'm young, you're okay.
41 weeks ago
Charlie,
VERY insightful and educational article. I do have a few questions, however.
1) Considering that even the companies subject to it can't have a copy, would Snowden, even as a system administrator, have the proper clearance to access a FISA court order, even if it was stored in an NSA database?
2) Are you familiar with Verizon's Business Solutions clients being subjected to Chinese hacking? It was a huge story in tech and small business circles since the hacking involved all levels of businesses, from small to large and from online storefronts to companies with government contracts on intelligence programming. I found it highly interesting that the FISA court order was dated 4/25/2013 and was on Verizon Business Solutions accounts (copy here: http://www.guardian.co.uk/world/interactive/2013/jun/06/verizon-telephone-data-court-order). My interest was piqued when, while doing research on the topic, I realized that coverage of the Chinese hacking started on 4/23/2013, although it had been quite an active topic in Verizon and small business forums before that.
Do you, in your expertise, see any potential connection? If not, I'll back off my own suppositions and guesswork on that angle.
41 weeks ago
By definition, if you get cleared for SYSTEM HIGH, you're cleared to read anything in the system. Theoretically, that's undesirable, because of compartmentalization, but as I say, it's also a pain to have to find the particular admin cleared for a particular compartment when someone has lost their password or something. Plus, sysadmins are relatively expensive; what happens if one quits and you can't get permission to hire another?

I hadn't thought about the Verizon/China connection; I'd be interested in your speculations myself.
41 weeks ago
My speculations are almost completely laid out above.
The FISA court order is for Verizon Business Solutions accounts, not all Verizon accounts. It is one of the three subsidiaries of Verizon. The other two are Verizon Wireless and Verizon Telecomm (I think. I get confused when it gets too far up the food chain.)
At any rate, the accounts to be pulled are only business accounts from a separate division of Verizon. Everything that I've read on it where the author of the piece understands the difference goes into speculation on whether all Verizon accounts were under different court orders, but that is all it is: speculation.
The fact that China has been hacking US government (local, state and federal) institutions as well as private business accounts is pretty well known. However, Verizon apparently was targeted specifically earlier this year, with massive hacking attempts (and successes) coming out in the news around the end of April. This is a blog, but it links to multiple articles, which saves me having to link them here. http://readwrite.com/2013/02/19/is-there-nothing-we-can-do-to-stop-chinese-hackers#awesm=~oaqRY2lOdmN9lo
So, the fact is that Verizon Business Solutions accounts were being deliberately targeted by Chinese hackers, and the FISA court order clearly names that subsidiary of Verizon as its subject in the same time frame.
There is also nothing that I have seen that would contradict a possible conclusion that Verizon went to the government for assistance and the court order is just legal cover for Verizon (most businesses require subpoenas when LEO needs to access client information even when they are the victims) when the government agreed that the hacking is/was an issue it needed to address. By law neither Verizon nor the NSA could publicly confirm such a scenario if it were true.

I just had a thought: What if the goal isn't to either reveal that the NSA is spying on US citizens (which has not been proven as yet) or even that the NSA is spying on foreign nationals (which any nation with a net GDP of over $100 does) but to bring down the huge service and software industries here in the US? They ARE our economy in large part now.

Not that I approve of them collecting all of my data. IMO it's the equivalent of a store clerk constantly following me around the grocery store, picking items up out of my shopping cart and telling me they have a coupon for it or a different brand. Creepy.
41 weeks ago
Point of order here: Verizon Wireless is a separate company, not a subsidiary. It's a joint venture between Vodafone and Verizon.
38 weeks ago
I guess this means there is actually a secret national gun registry in development (based on background checks and credit card transactions)? And since the NSA (under the executive branch of the government) is now the digital archive of everything digital, domestic and foreign, are those digital records now subject to discovery motions in criminal court proceedings by the defense? So, with Fast and Furious in mind, do you think there might be a digital record of what the president knew or didn't know about Fast and Furious, or Benghazi? Do you really think Snowden will make it back alive for trial?
41 weeks ago
Except it would be ILLEGAL and prosecutable under NSA laws. If you find any documented confirmation, I suggest you hie immediately to your Congress man/woman and/or someone you trust on the Intel Committee(s) who have oversight over such potential abuses.
41 weeks ago
Those are very interesting questions, aren't they?
41 weeks ago
Mr. Martin, thank you for your insight on this topic. I had a rudimentary understanding of the structure and potential capabilities of our communication systems and this fleshed it out nicely.
41 weeks ago