Alan Solomon reminds NSA (and anyone else listening) that unbreakable codes do exist

Alan Solomon @ 9:46 pm, September 9, 2013

Computer security veteran Dr Alan Solomon shares his reflections on the NSA electronic snooping debacle that has been dominating the headlines.

Hey, hey, NSA, did you read my blog today?

I don’t think so. Not because it’s encrypted, it isn’t. Not because they can’t, because they could read it as easily as you can. But …

Well.

It’s like this.

During World War II, the British set up a huge organisation at Bletchley Park to read the German Enigma traffic. It was worth doing, because all of the communications were between military units, and many of them contained valuable intelligence. It was even more worthwhile to crack Tunny, because that carried the most secret communications between OKW (armed forces HQ) and the generals in the field. Hence Colossus; indeed, hence ten Colossuses.

You can see one of them in action at The National Museum of Computing at Bletchley Park – recommended.

The point is, a large percentage of what was intercepted was useful.

Now consider the internet. Quigglebytes of information every day, mostly pictures of kittens doing cute things and teenagers sending each other pictures of what they did at the party. Millions of bloggers blurting unconfirmed guesses to each other, endless Facebook posts about outings to Disneyworld and a flood of tweets about what I just had for breakfast.

Somewhere in that lot, there’s maybe a few people plotting to do something bad.

The problem is, there’s only going to be a few such things. And some of them will be in an unbreakable code.

Many people think that there’s no such thing as an unbreakable code. To them, I have the following message:

G

You can subject the “G” above to as powerful a computer as you like, and you won’t be able to decide whether the cleartext is “Buy another cabbage” or “Please send me two dollars” or any other of an unlimited number of possible messages. That’s just one example of an unbreakable code. There’s lots of others.
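
To see why, here is a minimal Python sketch. It uses a one-time pad, which is a different unbreakable scheme from the one-letter code above, and the helper names are invented for the illustration. For any ciphertext there is a key that turns it into any message of the same length, so the ciphertext alone tells you nothing about which message was meant.

```python
import secrets

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt(plaintext: bytes):
    """Encrypt with a fresh random key as long as the message (a one-time pad)."""
    key = secrets.token_bytes(len(plaintext))
    return key, xor_bytes(plaintext, key)

def key_that_yields(ciphertext: bytes, wanted: bytes) -> bytes:
    """Return the key under which `ciphertext` decrypts to `wanted`."""
    return xor_bytes(ciphertext, wanted)

real = b"Buy another cabbage"
decoy = b"Send me two dollars"  # deliberately the same length as `real`

key, ciphertext = encrypt(real)
fake_key = key_that_yields(ciphertext, decoy)

print(xor_bytes(ciphertext, key))       # the message that was actually sent
print(xor_bytes(ciphertext, fake_key))  # an equally valid "decryption"
```

An eavesdropper holding only the ciphertext cannot tell the real key from the fake one, which is the whole point. The catch is that the key must be truly random, as long as the message, and never reused.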

If you were, for example, wanting to discuss the planning of something very naughty, you’d talk about a “stag party”. Or a barmitzvah. Or lunch. And the recipient would know what you were actually meaning.

Bad guys probably know this already. And so that reduces even more the number of messages that you might intercept that lead to bad things for bad guys. Oh, and the other thing that most bad guys probably know is that if you use the internet, or the phone system, for plotting to do bad things, you’re barmy.

So, we’re looking for a needle in a very large haystack. That’s bad enough, but one of the big rules for searching for a needle in a haystack is, “don’t start off by making the haystack a lot bigger”.

So that’s why I don’t believe the stories that are going round about the NSA reading and analysing all internet communications. It fails a test that is commonly not applied – “does this actually make sense?”

If I were the NSA, which thank the lord I’m not, sir, then what I’d do is analyse email headers. Email headers tell you who the email came from, and who it’s destined for. And those cannot be encrypted, because email works by being stored and forwarded from server to server, and that can only work if each server in the chain knows where the email is trying to get to.
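
As a rough sketch of how little work that takes, the Python below pulls the sender, the recipient and the server chain out of a message using only its headers. The sample message, its addresses and the function name are all invented for the illustration, and real Received lines vary a lot in format (and can be forged), so treat the pattern matching as an assumption rather than a robust parser.

```python
import re
from email import message_from_string

# Invented sample message: every address and host here is a placeholder.
RAW = """\
Received: from relay.example.net (relay.example.net [192.0.2.20])
    by mx.example.org with ESMTP; Mon, 9 Sep 2013 21:40:03 +0100
Received: from mail.example.com (mail.example.com [192.0.2.10])
    by relay.example.net with ESMTP; Mon, 9 Sep 2013 21:40:01 +0100
From: Nick <nick@example.com>
To: Alan <alan@example.org>
Subject: lunch

See you at one.
"""

def who_and_where(raw: str):
    """Return (sender, recipient, hops) without looking at the body."""
    msg = message_from_string(raw)
    hops = []
    # Each server adds its Received line at the top, so reverse for travel order.
    for line in reversed(msg.get_all("Received", [])):
        match = re.search(r"from\s+(\S+).*?by\s+(\S+)", line, re.S)
        if match:
            hops.append((match.group(1), match.group(2)))
    return msg["From"], msg["To"], hops

sender, recipient, hops = who_and_where(RAW)
print(sender, "->", recipient)
for src, dst in hops:
    print(f"  {src} handed it to {dst}")
```

Nothing here touches the body of the message, so encrypting the body would make no difference to what this code can learn.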

Here’s a typical chain of servers that handled one of the emails I received recently:

virus-l.demon.co.uk
smtp.demon.co.uk
tch.inty.net
internal.ip.redacted (the IP is 121.74.243.168 which actually turns out to be telstraclear.net, which is Vodafone New Zealand, which fits in with what I already knew about where my correspondent lives)
drsolly.com

That’s a list of the servers that handled the email as an email. So from this, I know who sent the email (my pal Nick), and who it was for (me). And all the servers in between also know this. But there’s more servers in the chain, those that just store-and-forward packets, not caring whether it’s an email or a web access. So I did a traceroute to virus-l.demon.co.uk, and here’s a list of the servers that it passed through:

drsolly.com
se3-1-0-1-2-4-3-0.ar06.hx2.bb.gxn.net
te0-1-0-0.cr02.ts1.bb.daisyplc.net
ae0-1802-xcr1.lsw.cw.net
ae10-xcr1.lns.cw.net
xe-11-2-0-xur1.lns.uk.cw.net
warr-inside-1-g7-0-0.router.demon.net
gi6-1-0-dar3.lah.uk.cw.net
warr-inside-1-g7-0-0.router.demon.net
war1-access-1-175.router.demon.net

cw.net is Cable and Wireless, a very big noise in the internet packet transit business. So if you can persuade them to give you a copy of all their traffic, you have a copy of my emails to virus-l.demon.co.uk.

And you could do the same with the other big packet transiters, there’s not a great many that you’d have to talk to. And the info in that header isn’t encrypted (it can’t be if you want your email to arrive) and it’s public, in the sense that it’s read by every server in the chain.

So, given that information, what I’d do is make a map of who is communicating with who.

And if I had someone who I knew was a major bad person (because some reliable source gave me that info) I’d be able to easily see who he was communicating with, and who they were communicating with, and so on, and maybe match that up with other known-bad-people. So you could build a map of bad-guy clusters.
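
A minimal sketch of that map in Python, assuming the header data has already been boiled down to (from, to) pairs. The addresses and function names are invented, and a breadth-first search is just one straightforward way to walk outwards from a known starting point; I am not claiming it is what any agency actually runs.

```python
from collections import defaultdict, deque

def build_graph(pairs):
    """Build an undirected contact graph from (sender, recipient) pairs."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def contacts_within(graph, start, max_hops):
    """Breadth-first search: map each reachable address to its hop count."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if seen[person] == max_hops:
            continue  # do not expand beyond the hop limit
        for other in graph.get(person, ()):
            if other not in seen:
                seen[other] = seen[person] + 1
                queue.append(other)
    return seen

# Invented header data: who emailed whom.
pairs = [
    ("bad@example.com", "a@example.com"),
    ("a@example.com", "b@example.com"),
    ("b@example.com", "c@example.com"),
    ("x@example.com", "y@example.com"),  # unrelated chatter, never reached
]
graph = build_graph(pairs)
cluster = contacts_within(graph, "bad@example.com", 2)
for who, hops in sorted(cluster.items(), key=lambda item: item[1]):
    print(hops, who)
```

The search only ever touches addresses connected to the starting point, which is why this needs far less storage and processing than hoovering up the whole haystack.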

And to do that wouldn’t be an awfully big job; it wouldn’t need the ridiculous amount of storage and processing power that you’d need if you tried to embrace the full haystack.

But, given the email address, how do you get the street address? The email is delivered to a particular IP address, and with a suitable court order, you can get an ISP to give you the real-world details of who was using that IP address at that time. Tough luck if that turns out to be an internet cafe, or a public Wi-Fi access point, but you could always do a stake-out and hope to scoop them up later.

So I don’t think that the NSA, or GCHQ are reading the unconfirmed guesses in this blog, even though I used the word “lunch”.

Alan Solomon

Alan Solomon used to run an anti-virus company. More recently, he's been crawling through tiny tunnels over inches of water to get a small plastic box. Geocaching takes him places he would never have visited in a life more ordinary. Follow his adventures at blog.drsolly.com.

6 comments on “Alan Solomon reminds NSA (and anyone else listening) that unbreakable codes do exist”

Gulraj

September 9, 2013 at 10:04 pm

'So I don’t think that the NSA, or GCHQ are reading the unconfirmed guesses in this blog, even though I used the word “lunch”.'

Yeah, but if the NSA had got you in their camp, you would say that. ;)

(Actually, I agree with the assessment. I have always said that the – almost – freely available metadata of the communications and the behaviour patterns are as valuable as, if not more valuable than, the content.)

Richard Steven Hack

September 10, 2013 at 5:09 am

Unfortunately the unstated assumption here is that the NSA is ONLY tracking people's data in order to catch "terrorists".

What if they're not?

What if they're grabbing all this crap because they want the ability now or at some future point to decide whether they're going to crack down on people who "aren't with the program" (whatever program they decide to use as the excuse)?

What if they just like the power they have by being able to grab all this crap even if they can't actually use it effectively?

If you believe the line "I'm from the government, I'm here to help you" at this point in history – even US history – you have to be staggeringly naive or on crack.

Guest

September 10, 2013 at 5:25 am

I disagree in part.

I agree that it would be quite a bit more difficult to sift through the content of emails and blogs and such than it is to keep track of headers and source/destination pairs and whatnot.

Where I disagree is that if you think they're discarding the content because it would be difficult to sift through it all, you're a fool. Storage prices are very low, and are likely to continue falling. Sifting through the data at random would be like finding a needle in a haystack, but if you already know who you're looking for, then the process would be rather more like finding a book in the library.

So while it would be difficult to infer intent from the content, it is trivial to simply store everything "just in case" and then go through it retroactively to search for all content between Known Terrorist A and Stranger B.

On a related note, did you know the NSA is building a massive new datacenter in Utah?

J Martin Ward

September 10, 2013 at 9:02 pm

I started to read this with high expectations, but unfortunately there's nothing new here. We know one of the NSA's main surveillance tools is traffic analysis. You can be quite sure that the NSA, and GCHQ here in the UK, have been recording your e-mail headers, amongst other data, for a long time. This in combination with message content analysis is what they say they must do to detect criminal and terrorist planning and activity. The fact that there are "quigglebytes of information" to filter and hunt through is what they use to justify a huge budget for supercomputers and storage. Now that it is technologically possible to winnow the haystack, and indeed to store it so that you can winnow it again and again to find more needles if you want, that's what they are doing.

The basic point here is expressed in the old adage: knowledge is power. The reason why people seek high office in government is not generally out of an altruistic desire to serve the people of their country. It is because they want power over that people. The ultimate high for Obama and Cameron and others like them is the knowledge that they are in control of everyone else; that to a greater or lesser degree they exercise authority over those around them. That is why, once in power, the motivation is to extend state surveillance, not limit it. Power is the driver, and detailed knowledge about the activities of the populace helps provide it. In the same way, the upper echelons of the NSA and GCHQ also rejoice in the influence with, and to some extent, the power over the government that they have. Any increase in power and budget that they can gain by invoking and exaggerating the risks of crime and terrorism, they will.

So it is naive to think that the NSA is not attempting to read and analyse all internet communications. The filtering, reading, and analysis are automated, and the data stored (in Utah?) against the day when its fuller analysis might be useful. Possibly you have said something in your e-mails or blogs in the past which may not be contentious now, but could be used against you (and Nick) in the future police state that we seem to be heading for. And I am quite sure that your current blog article is already filed in NSA storage.

Let's just hope that eventually the democratic process will prevail. In the meantime be thankful (a) for Edward Snowden, and (b) that strong encryption works, so we don't have to limit our messages to "G".

Sergio González

September 11, 2013 at 10:16 am

Although the NSA is supposed to be interested only in "bad people", the problem is that it pursues them by overriding the rights of ordinary people.
Like the NSA, you are justifying the end without taking the means into account.
Sorry, but I don't agree.

Cody

December 5, 2013 at 9:06 pm

Re: "What if they're grabbing all this crap because they want the ability now or at some future point to decide whether they're going to crack down on people who "aren't with the program" (whatever program they decide to use as the excuse)?"

What if? As I wrote elsewhere on Graham's blog, it is absolutely the case that they are doing more than what they claim. The NSA has a very long history (certainly before 9/11) with this type of thing (having fits about encryption being hard to crack, be it through brute force or otherwise). But what do you expect from governments (power corrupts, and the more power they have the more they will want and the more corrupt they will become). And yes, I realise your question may have been rhetorical but I'm answering it anyway to also point out (or further validate your suggestion) that it is definitely nothing new and it's definitely more than "tracking terror" (let's be realistic: spying is an age-old thing and there is a reason there are encryption export restrictions).

Re: "Where I disagree is that if you think they're discarding the content because it would be difficult to sift through it all, you're a fool. Storage prices are very low, and are likely to continue falling."

Heh. I remember when a 1GB hard drive cost something like 1000 USD. Good times, those… That aside, the point is what is most useful for tracking down others, and yes, email headers are one such way (although it is possible to spoof all headers, and that includes adding headers that aren't standard, but that's another story entirely; it isn't necessarily fool-proof either, and that's not even considering DKIM, SPF and similar standards). The thing is, SMTP is a TCP-based protocol and as such it is _not_ a blind transaction: if you spoof (which is only part of another attack, e.g. trust relationship exploitation) then you will need another way to send and receive data (I'll spare the details). In other words, when I see a mail server (let's say from Google) connecting to my mail server, my server sees the IP (it _has_ to, or otherwise the transaction won't function – TCP and its three-way handshake, after all) and it (Google's mail server) uses (as it should) the EHLO command (the older form is HELO; EHLO allows more features), thus identifying itself as whatever it claims to be. If, however, it were to send:
EHLO localhost
and the IP in question is 209.85.192.198 then my server knows they are full of it (DNS PTR versus DNS A record lookup*) and the mail will be rejected (as per my config). While Google doesn't send bad EHLO/HELO commands, many do (I see them on a daily basis and, you guessed it, spammers are the main culprit there). So if you want to track someone down via email, the headers are far more interesting, and that, I believe, is what Graham is getting at.

*
$ nslookup -type=ptr 209.85.192.198
Server: 10.0.0.1
Address: 10.0.0.1#53

Non-authoritative answer:
198.192.85.209.in-addr.arpa name = mail-pd0-f198.google.com.
$ nslookup mail-pd0-f198.google.com
Server: 10.0.0.1
Address: 10.0.0.1#53

Non-authoritative answer:
Name: mail-pd0-f198.google.com
Address: 209.85.192.198
