Boom! When QA (could have) saved the day!
History is littered with catastrophic software failures, many of which happened because of poor testing and QA. Learn how bad it can get when your QA fails.
History is littered with cases of catastrophic software failure, and in most cases those failures could have been avoided with a proper QA process. Back when I was at college, my software engineering lecturer had a background in programming telephone exchanges. As he put it, a telephone exchange cannot afford any errors, so the software was tested incredibly rigorously. But here we look at some places where testing fell far short.
Space, the final frontier for testing?
The International Space Station relies on a huge amount of code—easily more than 20 miles of it, were you to print it all out. That code is all that keeps the astronauts safe: constantly adjusting the speed and path of the ISS to avoid space debris, maintaining their living conditions, monitoring their health, and so on. Any error could be catastrophic. So, clearly, space-bound software must be subject to the most exhaustive testing, right?
Gone in (37) seconds
Well, let’s turn the clock back to June 1996. Ariane 5 is the latest and greatest heavy-lift rocket developed by the European Space Agency, and the most powerful rocket Europe has ever built. But 37 seconds after lift-off, the rocket self-destructs, taking with it the four Cluster science satellites it was carrying. In the subsequent investigation, engineers find that the failure was caused by the control software, inherited from the earlier Ariane 4 rocket. The specific problem came from a software module written for Ariane 4’s inertial navigation system. The module was responsible for correcting any horizontal bias in the navigation. In short, it was meant to keep the rocket pointing in the correct direction.
Including this module in the Ariane 5 control software was entirely unnecessary. However, the decision was made to retain it and keep it active during launch and for 40 seconds after. Due to hardware constraints on Ariane 4, readings from the inertial navigation system were converted from 64-bit floating-point numbers into 16-bit integers. This was done for seven readings. Four of them were given overflow protection in software; for the other three, including horizontal bias, it was deemed unnecessary, since on Ariane 4 those values could never grow large enough to overflow. Unfortunately, Ariane 5 flew a different trajectory, and its horizontal bias readings grew far faster than anything Ariane 4 had ever produced. Thirty-seven seconds after take-off, the reading overflowed the 16-bit integer and caused a processor lock. The upshot was that the rocket could no longer be certain where it was pointing. At this point, a safety override cut in and caused the rocket to self-destruct. Boom! No more rocket, and well north of $100M going up in smoke!
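To make the failure mode concrete, here is a minimal C sketch of the difference between an unchecked conversion and one with the overflow protection the other four readings had. The names are invented for illustration (the real Ariane software was written in Ada), but the arithmetic is the same: a 16-bit signed integer simply cannot hold values above 32,767.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative only: these names are invented for this sketch. */
static int convert_bias(double horizontal_bias, int16_t *out)
{
    /* The overflow protection that four of the seven readings had
       and horizontal bias lacked: reject any value a 16-bit signed
       integer cannot hold, instead of converting blindly. */
    if (horizontal_bias > INT16_MAX || horizontal_bias < INT16_MIN) {
        return -1; /* signal the overflow to the caller */
    }
    *out = (int16_t)horizontal_bias;
    return 0;
}

int main(void)
{
    int16_t bias;
    /* Ariane 5's trajectory produced values far larger than anything
       Ariane 4 generated, blowing straight past INT16_MAX (32767). */
    if (convert_bias(64000.0, &bias) != 0) {
        puts("overflow detected: fall back to a safe value");
    }
    return 0;
}
```

A test suite that replayed Ariane 5’s actual flight profile through this conversion, rather than Ariane 4’s, would have triggered the overflow on the ground instead of 37 seconds into the flight.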
Jumping to the modern day
Most software nowadays isn’t controlling rockets costing hundreds of millions of dollars. However, software now controls almost every aspect of our lives. Power grids rely on software. Most planes are flown largely automatically, with serious consequences when the software behaves unexpectedly. You might think that by now, software testing would have become foolproof. Sadly, you would be mistaken.
How to lose half a billion dollars
Early this year, Citibank lost a key court case relating to a blunder that saw it mistakenly pay out $900M on a loan, instead of the $7.8M in interest that was actually due. The mistake happened because a subcontractor didn’t know how to use a highly confusing UI correctly. The intention was to pay out the interest owed on a loan to a group of lenders. However, Oracle Flexcube, the software being used, doesn’t allow partial payments like this. Instead, the entire loan had to be repaid on paper, then reconstituted minus the amount paid. The majority of the money was meant to be diverted to an internal “wash” account, ensuring that it never actually left Citibank. However, the contractor wasn’t aware that doing this correctly required the same wash account number to be entered in three separate fields. Moreover, the bank’s internal three-stage approvals process failed to spot the error. While some of the lenders agreed to return the money, since it was plainly a mistake, others held on to some $500M. And the judge ruled that they were within their rights to do so.
When random isn’t so random
Software problems aren’t limited to poor UI, nor to applications created by huge multinationals. Often, they creep in at the very bottom of the software stack, in libraries or hardware components that are almost universally used. Just recently, it was found that an error in the use of a crucial true random number generator has left billions of IoT devices vulnerable to attack. These devices relied on a hardware true random number generator (TRNG) to create the keys used to secure their communications. In this case, the TRNG wasn’t generating enough entropy, leading to keys that were far less secure than they should have been. The problem wasn’t the TRNG itself. Rather, it was how it was called, or more precisely, how often it was called. TRNGs rely on various sources of entropy to create random numbers, for instance, measuring background electromagnetic interference (the white noise you hear on an old-fashioned radio between stations). However, TRNGs can only generate a limited amount of entropy. If you call them too often, the entropy runs out, and in some cases this was leading to keys made up entirely of zeros. The upshot? Billions of devices in people’s homes are now vulnerable to attack.
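The bug pattern is easy to sketch in C. Everything below is illustrative: `hw_rng_read` is a hypothetical stand-in for whatever HAL call a given IoT SDK exposes, and the toy `entropy_pool` just simulates the hardware running dry. The point is the ignored return value.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-in for a hardware TRNG with a finite entropy pool. The
   name and behaviour are hypothetical, not from any real SDK. */
static int entropy_pool = 16; /* bytes of entropy left */

static int hw_rng_read(uint8_t *buf, int len)
{
    int n = len < entropy_pool ? len : entropy_pool;
    for (int i = 0; i < n; i++) buf[i] = 0xA5; /* pretend randomness */
    entropy_pool -= n;
    return n; /* returns 0 once the pool is exhausted */
}

int main(void)
{
    uint8_t key[32];

    /* The bug pattern: the return value is ignored. Once the pool
       runs dry, the tail of the key is whatever was in the buffer,
       which in freshly zeroed memory means all zeros. */
    memset(key, 0, sizeof key);
    hw_rng_read(key, (int)sizeof key);

    /* The fix: check how much entropy you actually received and
       wait for the pool to refill before using the key. */
    int have = 0;
    while (have < (int)sizeof key) {
        int n = hw_rng_read(key + have, (int)sizeof key - have);
        if (n == 0)
            entropy_pool = 16; /* simulate the pool refilling over time */
        have += n;
    }
    puts("key generated from checked entropy");
    return 0;
}
```

The researchers found firmware doing the equivalent of that first call: asking for a key’s worth of randomness, never checking the result, and shipping whatever was left in the buffer.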
A bleeding heart
One of the most famous errors in cryptography was the so-called Heartbleed bug. This affected OpenSSL, an open-source security library used by millions of systems globally. The bug allowed attackers to repeatedly read the memory of a vulnerable server, up to 64 kB at a time. The problem was a missing bounds check, which allowed an attacker to request blocks of memory well beyond the heartbeat message the server was meant to echo back. The bug was found in 2014 but had existed in the wild for some two years. A hotfix was issued on the same day the bug was announced, but large numbers of systems remained vulnerable for some time afterward. It isn’t certain whether any hacker exploited the bug before it was announced. But over the days and weeks afterward, they certainly did.
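In C, the core of the bug looks roughly like the sketch below. This is a heavily simplified version of the heartbeat handler with invented names, not the actual OpenSSL code, but the flaw is the same: the length field comes from the attacker’s message and was never compared to the number of bytes actually received.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Heavily simplified sketch of a heartbeat request: a 16-bit claimed
   payload length followed by the payload itself. All names invented. */
struct heartbeat {
    const unsigned char *data; /* raw bytes received from the peer */
    size_t received;           /* how many bytes actually arrived */
};

unsigned char *build_reply(const struct heartbeat *hb, size_t *out_len)
{
    /* Attacker-controlled: the length the peer *claims* it sent. */
    uint16_t payload_len = (uint16_t)((hb->data[0] << 8) | hb->data[1]);

    /* THE MISSING BOUNDS CHECK (absent until the 2014 fix): drop any
       request whose claimed length exceeds what actually arrived.
       Without it, the memcpy below reads up to 64 kB of adjacent
       heap memory and sends it back to the attacker. */
    if ((size_t)payload_len + 2 > hb->received) {
        return NULL; /* silently discard, as RFC 6520 requires */
    }

    unsigned char *reply = malloc(payload_len);
    if (reply == NULL) {
        return NULL;
    }
    memcpy(reply, hb->data + 2, payload_len); /* echo the payload back */
    *out_len = payload_len;
    return reply;
}
```

The real fix essentially amounted to that one comparison. A fuzzer feeding random length fields into the handler would likely have found the bug in minutes, which is a strong argument for fuzz-testing anything that parses attacker-supplied input.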
What lessons should we learn for software testing?
All four of the incidents above were caused by very different factors. In the case of Ariane 5, old code was reused without sufficient checking. In the Citibank case, a confusing UI and poor processes led to an expensive mistake. The Heartbleed bug shows the risks of relying on widely used open-source libraries without auditing them. And the TRNG problem combined a lack of understanding of the hardware with missing error and sanity checks. However, all four of these cases could and should have been avoided with better testing.
Testing isn’t just about functionality
All too often, people assume software testing just checks functionality. But acceptance testing should also examine usability and try to find out whether there are ways users can be creatively stupid! This is far easier if you can dynamically record how users interact with the UI in order to spot repeated patterns or issues. But even without that ability, you should never assume people will behave the way you expect! Or, as Douglas Adams once put it: “A common mistake that people make when trying to design something completely foolproof is to underestimate the ingenuity of complete fools.”
Expect the unexpected
When creating tests, it’s easy to check the most likely things. What happens when someone enters the wrong username? How does the system respond when there is a missing entry in a form? It is far harder to test the corner cases: the strange bugs that happen when a user clicks in a certain sequence, or the odd effects that can occur when a phone is caching API responses. This is where skilled testers really come into their own.
Invest as much in testing as you are prepared to lose
QA is often the poor relation of software development. Typically, QA is the smallest team and has to fight for the resources it needs. But if QA misses a bug, consider how expensive that could be for you. A good analogy is bike locks. If you left a $2,000 road bike locked up in LA with just a $10 lock, you would expect it to be stolen. If you really value your bike, you invest in a really good lock. Similarly, if a bug could cost your company millions, you should take that into account when assigning your QA budget.
How Functionize can help
Functionize is an AI-powered smart test automation platform. This means it can deliver three key benefits that ensure you test better:
Free up QA resources
One of the biggest problems for automated testing is the time spent creating and maintaining tests. Often, this dominates your QA team’s time. In the worst case, teams drown under uncontrollable test debt. Functionize slashes test debt, allowing resources to be diverted to higher-value tasks like exploratory testing.
Thorough UI testing
Traditional scripted UI tests can only verify what they are told to. By contrast, we take a visual testing approach. This means our system automatically checks what changes on your UI between test runs. If something changes unexpectedly, you will get a warning. This won’t catch bad UX design, but it will catch unexpected changes in the UI.
Test more features
Our AI-based system is able to test a lot of things that traditionally had to be checked manually. For instance, verifying file downloads (ideal for downloaded tickets, receipts, etc.). Or checking that two-factor authentication is working without having to hack your test scripts as you would in Selenium.
If you want to see the only truly modern test solution in action, book a demo today.