Update: The original article I wrote about this monumental cock-up had some inaccuracies, specifically about how the data was lost. Thankfully, the BBC’s Leo Kelion has got to the bottom of it.
None of this changes the fact that using Excel in any part of this process is absolutely barking.
Apologies to readers for my error. My original article is included below for transparency.
When Microsoft created its Excel spreadsheet software it created limits as to just how big a spreadsheet could be.
Specifically, according to Microsoft’s own documentation, they set the following limits:
1,048,576 rows by 16,384 columns
That doesn’t feel daft to me. Why would anyone ever want so many columns on a spreadsheet? The last column is labelled XFD if you’re curious.
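If you're wondering where those numbers and that XFD label come from, here is a minimal Python sketch. The limits are 2 to the power of 20 rows and 2 to the power of 14 columns, and column letters work like a base-26 count with no zero digit. The names `EXCEL_MAX_ROWS`, `EXCEL_MAX_COLUMNS` and `column_label` are made up for illustration and are not part of Excel or any library.

```python
import string

# Limits quoted above from Microsoft's documentation, written as powers of two.
EXCEL_MAX_ROWS = 2 ** 20     # 1,048,576
EXCEL_MAX_COLUMNS = 2 ** 14  # 16,384


def column_label(index: int) -> str:
    """Convert a 1-based column index into a spreadsheet-style letter label.

    Hypothetical helper for illustration: 1 -> "A", 26 -> "Z", 27 -> "AA".
    """
    if index < 1:
        raise ValueError("column index must be 1 or greater")
    label = ""
    while index > 0:
        # Subtract 1 first because the lettering has no "zero" digit.
        index, remainder = divmod(index - 1, 26)
        label = string.ascii_uppercase[remainder] + label
    return label


if __name__ == "__main__":
    print(EXCEL_MAX_ROWS, EXCEL_MAX_COLUMNS)
    print(column_label(EXCEL_MAX_COLUMNS))  # label of the last column
```

Running it for column 16,384 gives XFD, matching the label mentioned above.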
And yes, if you think of any spreadsheet you’ve ever used in your working life, you would certainly anticipate that you might need many more rows than you would need columns.
But those are clearly still huge numbers. If you wanted to collect data on anything that came anywhere close to those limits you would probably want to use proper database software, not an Excel spreadsheet, right?
Not so, it would seem, when Dido Harding is in charge of England’s Test & Trace operation.
Some 16,000 Coronavirus cases reportedly went missing after the Excel spreadsheet they were being recorded in reached its maximum limit, and did not allow the automated process to add any more names.
As a result, it’s possible that some people who might have been infected by COVID-19 may not have been properly traced in a timely fashion.
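The general failure mode here, an automated process hitting a hard size limit and carrying on regardless, is easy to guard against. What follows is only a sketch in Python: it assumes nothing about how the real Test & Trace pipeline worked, it models the sheet as a plain list, and `SheetFullError` and `append_rows` are hypothetical names invented for this example.

```python
XLSX_MAX_ROWS = 1_048_576  # row limit quoted earlier from Microsoft's documentation


class SheetFullError(Exception):
    """Raised when new rows would not fit, instead of dropping them quietly."""


def append_rows(sheet: list, new_rows: list, max_rows: int = XLSX_MAX_ROWS) -> None:
    """Append new_rows to sheet, or fail loudly if the limit would be exceeded.

    Hypothetical helper: the "sheet" is just a Python list standing in for
    whatever store the automated process writes to.
    """
    if len(sheet) + len(new_rows) > max_rows:
        overflow = len(sheet) + len(new_rows) - max_rows
        raise SheetFullError(
            f"{overflow} row(s) would not fit; nothing was written"
        )
    sheet.extend(new_rows)


if __name__ == "__main__":
    sheet = ["case 1", "case 2"]
    try:
        # A tiny limit stands in for the real one to keep the demo small.
        append_rows(sheet, ["case 3", "case 4"], max_rows=3)
    except SheetFullError as error:
        print("Refused:", error)
```

The design choice is all-or-nothing: either every new row is stored, or none is and somebody gets told about it.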
WTAF is any track and trace data doing anywhere near a spreadsheet? If you designed a process to misplace data it would start with “we use a spreadsheet”.
— Holographic nano-layer catalyst expert Stockley (@MarkStockley) October 5, 2020
British Prime Minister Boris Johnson told the BBC this weekend that the problem was:
“a failure in the counting system which has now been rectified.”
Sure enough, the missing figures are now being included in the UK’s daily Coronavirus statistics, which show a sizeable leap because those cases were left out of previous days’ totals.
There’s a lesson to be learnt here, and that’s to use the right tool for the job. Excel is for spreadsheets, not for running databases.
You’re running a Test & Trace operation, not putting together a wedding present list.
How anyone could imagine that using a spreadsheet with a patient per column was a wise idea is beyond me.
But don’t fear – apparently a solution to the problem was quickly found.
They’ve decided to split it up into multiple Excel spreadsheets…
13 comments on “UK loses 16,000 COVID-19 cases due to Excel spreadsheet snafu”
Dido Harding's learnt the best way to avoid a SQL Injection is to not use SQL!
… or use the right people and processes to develop systems and applications. The fact that a spreadsheet was being used suggests 'end-user computing', which generally bypasses IT project discipline and oversight as well as development processes and methodologies.
Has anyone seen this spreadsheet? How do you know it was 1 row per patient? If it was, what happened to the 42,000 dead people?
There is no logical reason why an Excel spreadsheet can’t be used to aggregate and analyse infection rates from hundreds of sources over hundreds of days. And with pivot tables that information can be represented in any number of ways.
This just sounds like someone put an artificially low limit on a “sum” function, which would lose lots of data entered on other worksheets.
Great solution though!
Let's hope the UK Treasury doesn't use Excel or they might lose £Billions – Oh, wait a minute…..
Dido Harding lost the data of 150,000 Talktalk customers, so losing only the data of 16,000 Covid-19 cases is a significant improvement.
The BBC is reporting that the Excel sheet was limited to 65k rows – which suggests it was Office 97, not the current version!
Where have all the billions spent on Test and Trace gone if they can't afford a new Office licence?
Who believes the BBC…
It's surprising to see that, in this day and age, Excel is still so predominant. For large data sets that need to be searched quickly, there are a lot of products that can easily be used.
It's simply taking a long time for users to switch to systems that are much more productive than Excel.
@Graham thank you for the article and as you rightly say in today's lesson. Use the right tools for the job.
My client forwarded this article with a note that they could have used our services (Trunao).
Collecting data on a population of 66 million (plus or minus) kinda needs a database back end. Then you can use Excel as a user tool to massage it into a pretty graph or whatever.
Whoever thought of putting all this application data into a spreadsheet obviously did not know what they were doing.
This department needs some serious programming discipline.
Did no one ever hear of ITIL? [EXPLETIVE] even I got through that.
I disagree. Excel can be a very good database IF you know how. We set up huge databases all the time for our clients.
Dido Harding is the antithesis of Midas – unfortunately the things she touches seem to turn to 💩
I hope you were kidding with that last line