Alan Woodward is a visiting professor at the University of Surrey’s department of computing.
In this article he shows how to crack the code-cracking puzzle set by GCHQ, in its hunt for new recruits.
We decided at the time not to reveal the answers to the puzzle, but enough time has now passed – so we can explain (to those of you who weren’t able to crack the code) how to do it.
Don’t fool yourself into thinking that following these instructions will land you a job as a UK government cyber-expert. They’ll surely require you to complete a much more rigorous test than this before they give you a parking space at their headquarters in Cheltenham.
The GCHQ recruitment puzzle begins on the page that announces the competition: https://canyoufindit.co.uk.
“Our new challenge is to find and solve 5 codes we have hidden around the web. For anyone able to rise to the challenge and find all the codes, you’ll join an elite community of people with some of the specific skills we look for at GCHQ.”
The first puzzle is on the page you see in your browser. It contains a series of characters:
AWVLI QIQVT QOSQO ELGCV IIQWD LCUQE EOENN WWOAO LTDNU QTGAW TSMDO QTLAO QSDCH PQQIQ DQQTQ OOTUD BNIQH BHHTD UTEET FDUEA UMORE SQEQE MLTME TIREC LICAI QATUN QRALT ENEIN RKG
A code breaker will immediately notice a few features of this text:
- It is displayed in groups of five characters. This is a historic convention, used in part so that the format in which the message is transmitted gives away nothing about word lengths or letter frequencies. It is probably most famous from the many encrypted Enigma messages that one sees written about. In essence, you can ignore it as it is unlikely to provide you with anything useful for decrypting the message.
- There are a large number of “Q’s”. This is unusual as Q is an infrequently used letter in the English language, and assuming the message is in English, the Q’s probably serve some function. Such infrequently used characters are often used as spaces. So, it is likely that you can ignore the actual spaces used to create the five-letter groups and assume that the Q’s are the real spaces.
- For anyone who has dealt with ciphers, the number of characters is of interest. Here we have 143 characters, which just happens to be the product of two prime numbers: 11 and 13. This is a big clue. What you are supposed to do is rearrange the text into a grid of 11 rows by 13 columns:
A W V L I Q I Q V T Q O S
Q O E L G C V I I Q W D L
C U Q E E O E N N W W O A
O L T D N U Q T G A W T S
M D O Q T L A O Q S D C H
P Q Q I Q D Q Q T Q O O T
U D B N I Q H B H H T D U
T E E T F D U E A U M O R
E S Q E Q E M L T M E T I
R E C L I C A I Q A T U N
Q R A L T E N E I N R K G
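Both observations, the surplus of Q’s and the length of 143, are easy to check by machine. What follows is only a minimal sketch in Python; the helper name `prime_factors` is purely illustrative and not part of the puzzle.

```python
from collections import Counter

CIPHERTEXT = (
    "AWVLI QIQVT QOSQO ELGCV IIQWD LCUQE EOENN WWOAO LTDNU QTGAW "
    "TSMDO QTLAO QSDCH PQQIQ DQQTQ OOTUD BNIQH BHHTD UTEET FDUEA "
    "UMORE SQEQE MLTME TIREC LICAI QATUN QRALT ENEIN RKG"
)


def prime_factors(n):
    """Return the prime factors of n (with repetition) by trial division."""
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:
        factors.append(n)
    return factors


letters = CIPHERTEXT.replace(" ", "")  # drop the five-letter grouping
counts = Counter(letters)

print(counts.most_common(3))           # the most frequent letters first
print(len(letters), prime_factors(len(letters)))
```

A length with exactly two prime factors leaves only two sensible grid shapes to try, which is what makes it such a strong hint.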
Now if you read down the columns, using Q’s as spaces you see the following message emerge:
A COMPUTER WOULD DESERVE TO BE CALLED INTELLIGENT IF IT COULD DECEIVE A HUMAN INTO BELIEVING THAT IT WAS HUMAN WWWDOTMETRODOTCODOTUKSLASHTURING
This form of encryption is a transposition cipher. It has many forms but the one used here is one of the simplest. It has a long history and, before electronic encryption devices, it and its variants were the basis for many secret communications.
If you take the web address at the end of the message and write it in more familiar form:
www.metro.co.uk/turing you have the next stop on your journey, plus you have the answer to the first clue which is “Turing”.
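The grid step described above can be sketched in a few lines of Python. This is an illustration rather than the only way to do it: the function name `read_columns` is hypothetical, and the rewriting of DOT and SLASH simply mirrors what was done by hand.

```python
CIPHERTEXT = (
    "AWVLI QIQVT QOSQO ELGCV IIQWD LCUQE EOENN WWOAO LTDNU QTGAW "
    "TSMDO QTLAO QSDCH PQQIQ DQQTQ OOTUD BNIQH BHHTD UTEET FDUEA "
    "UMORE SQEQE MLTME TIREC LICAI QATUN QRALT ENEIN RKG"
)
ROWS, COLS = 11, 13


def read_columns(text, cols):
    """Lay text out in rows of `cols` characters, then read it column by column."""
    return "".join(text[c::cols] for c in range(cols))


letters = CIPHERTEXT.replace(" ", "")      # ignore the five-letter grouping
assert len(letters) == ROWS * COLS

plaintext = read_columns(letters, COLS).replace("Q", " ")
print(plaintext)

# The final word spells out a web address in words.
spelled = plaintext.split()[-1]
url = spelled.replace("DOT", ".").replace("SLASH", "/").lower()
print(url)
```

Slicing with a step of 13 picks out every thirteenth character, which is exactly one column of the grid, so no explicit two-dimensional structure is needed.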
When you arrive at the new webpage you see that there is a file available to download.
It helpfully has the extension “key” so even before opening it one can assume it is some form of encryption key. Download and open the file and you see the following:
-----BEGIN RSA PRIVATE KEY-----
-----END RSA PRIVATE KEY-----
If you take this at face value it is an RSA private key from an RSA public/private key pair.
What is a lot less clear is what it is supposed to be used to decrypt. The page contains no other text or files that would seem to be usable with this key. You have to assume the key itself has something more to tell you.
So, the starting point of most forensics is to open the file in a hex editor and see what it might reveal.
Even when you remove the header and footer (-----BEGIN RSA PRIVATE KEY----- and -----END RSA PRIVATE KEY-----) it doesn’t tell you much.
As is common practice for transmitting keys, the file is encoded using Base64. There are lots of online Base64 decoders into which you can place this key for decoding (remembering to remove the header and footer first). I used http://www.base64decode.org/ for this.
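If you would rather not paste a private key into a website, the same decoding can be done locally. The sketch below uses Python’s standard `base64` module; the function names and the file name `comp1.key` in the commented usage are assumptions made for illustration.

```python
import base64


def pem_body_to_bytes(pem_text):
    """Drop the BEGIN/END lines of a PEM file and Base64-decode what is left."""
    stripped = (line.strip() for line in pem_text.splitlines())
    body = [line for line in stripped if line and not line.startswith("-----")]
    return base64.b64decode("".join(body))


def printable_runs(data, min_length=8):
    """Yield runs of printable ASCII, much like the Unix `strings` tool."""
    run = bytearray()
    for byte in data:
        if 0x20 <= byte <= 0x7E:
            run.append(byte)
            continue
        if len(run) >= min_length:
            yield run.decode("ascii")
        run.clear()
    if len(run) >= min_length:
        yield run.decode("ascii")


# Hypothetical usage, assuming the download was saved as comp1.key:
# with open("comp1.key") as handle:
#     for text in printable_runs(pem_body_to_bytes(handle.read())):
#         print(text)
```

Most of a real key is random-looking binary, so any long run of readable characters is worth a second look.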
As you scan through the decoded string of characters you see a string embedded in it which starts to look familiar: ww.whtsisilguoectsrehsri.eocu./klbtehcel y
And if you swap each pair of adjacent characters you find you have another web address: www.thisisgloucestershire.co.uk/Bletchley
Sure enough, this is the next stop on the journey, and “Bletchley” is the next answer for the main page.
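The character swap is simple enough to do by eye, but it is also a two-line loop. Here is a minimal Python sketch; the function name `swap_pairs` is just an illustrative choice.

```python
def swap_pairs(text):
    """Swap each pair of adjacent characters: 'abcd' becomes 'badc'."""
    chars = list(text)
    for i in range(0, len(chars) - 1, 2):
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


scrambled = "ww.whtsisilguoectsrehsri.eocu./klbtehcel y"
print(swap_pairs(scrambled).strip())
```

The space before the final “y” is there so that the last letter has a partner to swap with, which is why the result is stripped afterwards.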
The newly-revealed webpage contains a new stream of characters:
It looks very much like a modern cipher stream, so one has to assume there is a key for decrypting it, which of course we were just given on the previous page. So, let’s revisit the key we were given.
Files that begin and end with the BEGIN RSA PRIVATE KEY and END RSA PRIVATE KEY lines have a very definite format. It is known as PKCS#1 and comprises the following elements:
1. ASN.1 Header
2. Algorithm Version
3. Modulus
4. Public Exponent
5. Private Exponent
6. Prime 1
7. Prime 2
8. Exponent 1
9. Exponent 2
10. Coefficient
Each of these can be extracted manually by partitioning up the hex format of the key. If you do that you see that the web address www.thisisgloucestershire.co.uk/Bletchley is in the component known as Prime 2.
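To make the manual partitioning concrete, here is a rough Python sketch of walking the decoded key. It assumes the Base64 body has already been decoded to bytes and that the file is a plain PKCS#1 key: an ASN.1 SEQUENCE (the header) followed by nine INTEGER fields. The field labels are chosen to match the list above and are otherwise arbitrary.

```python
FIELD_NAMES = [
    "version", "modulus", "public_exponent", "private_exponent",
    "prime1", "prime2", "exponent1", "exponent2", "coefficient",
]


def read_length(der, pos):
    """Decode a DER length starting at pos; return (length, next position)."""
    first = der[pos]
    pos += 1
    if first < 0x80:                 # short form: the byte is the length
        return first, pos
    size = first & 0x7F              # long form: length is in the next `size` bytes
    return int.from_bytes(der[pos:pos + size], "big"), pos + size


def split_rsa_private_key(der):
    """Return a dict mapping each PKCS#1 field name to its raw bytes."""
    if der[0] != 0x30:
        raise ValueError("expected an ASN.1 SEQUENCE")
    _, pos = read_length(der, 1)
    fields = {}
    for name in FIELD_NAMES:
        if der[pos] != 0x02:
            raise ValueError("expected an ASN.1 INTEGER for " + name)
        length, pos = read_length(der, pos + 1)
        fields[name] = der[pos:pos + length]
        pos += length
    return fields
```

Printing `fields["prime2"]` for the puzzle key should then show the readable text sitting among the bytes of that one component, rather than somewhere in an undifferentiated hex dump.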
A quicker route is to let OpenSSL do the partitioning. It will give you all of the components of the private key when you analyse the key file with the simple command:
openssl.exe rsa -in comp1.key -text
which outputs the following:
Now find yourself an RSA decryptor. I used one written by Nathan Michaels.
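For readers curious about what such a decryptor does underneath, the core operation is a single modular exponentiation. The sketch below is not the tool mentioned above and does not use the puzzle’s key: it assumes unpadded, “textbook” RSA and uses deliberately tiny toy numbers so that the arithmetic can be followed.

```python
# Toy parameters for illustration only; real keys use primes hundreds of digits long.
p, q = 61, 53
n = p * q                               # modulus
e = 17                                  # public exponent
d = pow(e, -1, (p - 1) * (q - 1))       # private exponent (needs Python 3.8+)


def rsa_decrypt(ciphertext, private_exponent, modulus):
    """Textbook RSA decryption: ciphertext ** d mod n, with no padding handled."""
    return pow(ciphertext, private_exponent, modulus)


message = int.from_bytes(b"w", "big")   # one byte, so it fits below the toy modulus
ciphertext = pow(message, e, n)         # what the sender would have computed
recovered = rsa_decrypt(ciphertext, d, n)

print(recovered.to_bytes(1, "big").hex())
```

With a real key the only differences are the size of the numbers and the number of bytes in the result; the modulus and private exponent are exactly the values listed in the key file.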
Running the cipher stream through the decryptor with this key gives the following decoded hex string:
20 20 20 20 20 20 20 20 77 77 2e 77 68 74 72 65 67 65 73 69 65 74 2e 72 6f 63 75 2e 2f 6b 6e 65 67 69 61 6d 30 32 33 31 20 20 20 20 20 20 20 20
If you put this back into your favourite hex editor you again see a web address in which each pair of characters has been swapped:
So, swapping back the characters in the string ww.whtregesiet.rocu./knegiam0231 gives you the URL: www.theregister.co.uk/enigma2013 Hence, you have the next stop on the journey and, following the pattern where the last part of the URL is the answer for the home page, your next answer is Enigma2013.
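The hex-editor step and the swap can be combined in a short Python sketch. The hex values are those shown above; the function name `swap_pairs` is an illustrative choice.

```python
HEX_DUMP = (
    "20 20 20 20 20 20 20 20 77 77 2e 77 68 74 72 65 67 65 73 69 65 74 2e 72 "
    "6f 63 75 2e 2f 6b 6e 65 67 69 61 6d 30 32 33 31 20 20 20 20 20 20 20 20"
)


def swap_pairs(text):
    """Swap each pair of adjacent characters: 'abcd' becomes 'badc'."""
    chars = list(text)
    for i in range(0, len(chars) - 1, 2):
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


# 0x20 is an ASCII space, so the padding on either side strips away cleanly.
scrambled = bytes.fromhex(HEX_DUMP).decode("ascii").strip()
url = swap_pairs(scrambled)

print(scrambled)
print(url)
```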
This next page presents something new:
The new element is a picture. Anyone who has visited Bletchley Park will recognise the machine shown as Colossus, the pioneering electronic computer that was used to break the German Lorenz cipher in the Second World War.
As before let’s take this image file and open it in our hex editor:
At first it appears to be a standard jpeg file with the usual header that you would expect. However, as you scan down the file you notice there is another jpeg file header.
Someone has added a second image to the end of the main image. Using your hex editor it’s a simple matter to delete everything before the second jpeg header, save the edited file and try to open this newly shortened file.
What you see is this:
As before, you have your next answer (Colossus) and your next port of call.
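The hex-editor surgery described above, deleting everything before the second jpeg header, can also be scripted. This is a minimal Python sketch under one assumption: that the first repeat of the JPEG start marker really is the appended image. Some cameras embed thumbnails that begin with the same bytes, so the result still needs checking by eye. The file names in the commented usage are placeholders.

```python
JPEG_START = b"\xff\xd8\xff"  # start-of-image marker plus the first byte of the next marker


def carve_second_jpeg(data):
    """Return the bytes from the second JPEG header onwards, or None if there is only one."""
    first = data.find(JPEG_START)
    if first == -1:
        return None
    second = data.find(JPEG_START, first + len(JPEG_START))
    if second == -1:
        return None
    return data[second:]


# Hypothetical usage; these file names are placeholders, not the puzzle's own.
# with open("colossus.jpg", "rb") as source:
#     hidden = carve_second_jpeg(source.read())
# if hidden is not None:
#     with open("hidden.jpg", "wb") as target:
#         target.write(hidden)
```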
This page presents you with a URL directly, and in solving puzzles sometimes the obvious answer is the right answer. If you use this web address it takes you back to the start page, and if the pattern is maintained your final answer should be “Secured”.
Returning to the start page and typing in your answers then reveals that you’ve followed the trail correctly, and you can provide GCHQ with your contact details if you wish to be considered for a job.
If you managed to follow the trail correctly then congratulations.
If not, then even following through with answer sheets like this one can help you understand the mind-set you need to work on the more complex area of communications security.
I’m sure there will be more opportunities to put what you have learned to use.