How to incorporate AI and Machine Learning into QA
AI and ML can make life easier for software testers, but they’re no quality-assurance panacea. Here’s what these technologies are best suited for — and what they’re not yet able to do well.
Artificial intelligence (AI) and machine learning (ML) have long since moved past the hype of being “The Next Big Thing.” While they aren’t the science fiction futures that novelists once imagined (that may be a good thing), AI and ML are now doing hard, hands-on work in countless industries, from healthcare to manufacturing, agriculture, finance, cybersecurity, and more.
Their use has become particularly widespread in software quality assurance (QA) and testing. In the 2018-2019 World Quality Report by Capgemini, 57% of respondents said they either had projects in place to use AI for QA and testing of software and systems, or planned to start them within the next year. Forty-five percent were using it for intelligent automated testing.
AI and ML are so useful for testing that some of the world’s biggest tech companies, including Facebook, Netflix, and eBay, have deployed them or launched pilots for that purpose. eBay, as one example, developed a proof of concept that uses ML to automatically detect flaws on eBay web pages.
There’s good reason AI and ML have become so popular for QA and testing: They can speed up test creation, reduce test maintenance, and automatically generate tests.
But using AI and ML for QA and testing isn’t as easy as creating models, pressing a few buttons, and then letting the technologies do their work. There are many things the technologies can’t do, and several ways they can help that you probably don’t know about. So in this blog post, I cover what AI and ML are best suited for, and what they’re not yet able to do well. Setting expectations about what’s feasible is the first and maybe most important step toward integrating them into your QA and testing.
Integrating AI and ML with human smarts
Don’t make the mistake of thinking that AI and ML can completely take over QA. Testing isn’t going to be completely autonomous. Rather, those technologies should augment human testers and intelligence.
“For now, AI and ML aren’t able to completely replace the work of people,” says Eric Sargent, vice president of sales for Functionize. “Fundamentally, there's still no substitute for human understanding of the underlying intent of a test and expected behavior of an application, but AI and ML can certainly be powerful tools to fill in some of the key gaps of the testing process that have presented challenges for some time now.”
The key to using AI and ML, Sargent says, is to recognize their strengths and the gaps they can fill in QA and testing, and then use them to do only those things.
Drawing conclusions
For example, one task they perform particularly well is visual testing, and that’s a good place to start using ML instead of human testers.
“I remember as a kid looking at the Sunday comics. There was usually a game that would show three pictures and ask which is different,” Sargent says. “You’d scratch your head. After a while, you’d see that one drawing had some minute difference, such as an extra freckle, and the other two didn’t. That's an oversimplification of the challenge, but generally humans are not all that adept at quickly working through complex comparative exercises. Give a machine the right kind of parameters, however, and it comes up with the answers much faster.”
eBay’s proof of concept using AI and ML backs that up. Using mockups of one of the site’s home page modules, the project first created 10,000 different images of the page that included different types of defects, including incorrect images, text, and layouts. It then used those images to train a model with ML to discover defects. Once the model was complete, eBay used it to check many different copies of the page for errors. The model had a 97% accuracy rate in finding defects.
Among many benefits, eBay says in its paper, is this: “A new eBay intern was able to ramp up in a matter of a day or two and start generating test data when training a ML model. Previously, some [quality engineering] teams would require a few weeks of daily work in order to become familiar with the domain’s specifics and the intricate knowledge of our webpages.”
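The training-data step in the eBay example, where known defects are injected into a mockup to produce labeled samples, can be sketched roughly as follows. This is only an illustration: the real project rendered images of the page, while the dictionary fields, defect mutations, and function name here are all assumptions made up for the example.

```python
import random

# Hypothetical mockup of one home page module; the fields are illustrative assumptions.
BASELINE = {"title": "Daily Deals", "image": "deals.png", "layout": "grid-3"}

# One mutation per defect type named above: incorrect images, text, and layouts.
DEFECTS = {
    "image": lambda page: {**page, "image": "missing.png"},
    "text": lambda page: {**page, "title": page["title"][::-1]},
    "layout": lambda page: {**page, "layout": "grid-1"},
}


def make_training_set(baseline, n, seed=0):
    """Return n (page, label) pairs; label is "ok" or the injected defect type."""
    rng = random.Random(seed)  # seeded so the data set is reproducible
    samples = []
    for _ in range(n):
        if rng.random() < 0.5:
            # Clean copy of the mockup.
            samples.append((dict(baseline), "ok"))
        else:
            # Copy with exactly one known defect, labeled by its type.
            kind = rng.choice(sorted(DEFECTS))
            samples.append((DEFECTS[kind](baseline), kind))
    return samples


samples = make_training_set(BASELINE, 10_000)
```

Because every defect is injected deliberately, each sample carries its label for free, which is what makes it practical to generate thousands of training examples without manual review.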
Say what you want the test to do
AI and ML can also make it much easier for humans to build tests. They can allow testers to use plain English to describe the test they want to create. Behind the scenes, AI and ML do the work of translating that request into a fully functioning test. So rather than write test code, a tester can write, “Verify that all currency amounts display with a currency symbol,” and a test is created to accomplish that. Even more powerful, AI and ML can combine multiple plain English statements to build lengthy, complicated tests.
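Whatever the translation layer looks like, the plain English statement ultimately has to resolve to a concrete check. Here is a hand-written sketch of what the currency example might boil down to. It is not how any particular tool generates tests; the symbol list, the two-decimal amount pattern, and the function name are assumptions for illustration.

```python
import re

# Assumed set of accepted symbols; a real test would take this from the product spec.
CURRENCY_SYMBOLS = "$€£¥"

# Assumed amount format: digits with optional thousands separators and two decimals.
AMOUNT = re.compile(r"\d[\d,]*\.\d{2}")


def amounts_missing_symbol(text):
    """Return every amount in text that is not directly preceded by a currency symbol."""
    missing = []
    for match in AMOUNT.finditer(text):
        prev = text[match.start() - 1] if match.start() > 0 else ""
        # An empty prev means the amount starts the text, so no symbol is present.
        if not prev or prev not in CURRENCY_SYMBOLS:
            missing.append(match.group())
    return missing


page_text = "Total: $1,299.00, shipping 4.99, tax €3.50"
print(amounts_missing_symbol(page_text))
```

On the sample text, only the shipping amount is reported, since it is the one value shown without a symbol.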
AI and ML can also outperform humans at doing root cause analysis. They can much more easily break down the sequences of events in an error in an application, and pinpoint exactly where there are coding issues. For example, they can recognize that whenever a specific data variable is inserted, a failure occurs five to ten steps later.
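To make the “failure five to ten steps later” pattern concrete, here is a deliberately simple sketch that counts how often each recorded event is followed by a failure inside that window. It uses plain co-occurrence counting rather than a trained model, and the event names and the `FAIL` marker are hypothetical.

```python
from collections import Counter


def suspect_events(runs, min_gap=5, max_gap=10):
    """Map each event to the share of its occurrences followed by FAIL min_gap..max_gap steps later."""
    seen, followed = Counter(), Counter()
    for steps in runs:
        failures = {i for i, step in enumerate(steps) if step == "FAIL"}
        for i, step in enumerate(steps):
            if step == "FAIL":
                continue
            seen[step] += 1
            if any(i + gap in failures for gap in range(min_gap, max_gap + 1)):
                followed[step] += 1
    return {event: followed[event] / seen[event] for event in seen}


# Hypothetical recorded runs; each list is the ordered steps of one test execution.
runs = [
    ["login", "set_coupon"] + ["browse"] * 5 + ["FAIL"],
    ["login"] + ["browse"] * 6 + ["checkout"],
    ["login", "browse", "set_coupon"] + ["browse"] * 5 + ["FAIL"],
]
scores = suspect_events(runs)
print(max(scores, key=scores.get))
```

In this toy data, `set_coupon` is the only event that precedes a failure every time it appears, so it ranks first even though the failure surfaces several steps later.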
Used properly, AI and ML can also do an excellent job at regression testing. “As applications become more complex and release cycles accelerate, it's simply not possible for humans to effectively keep up with the demands of running and maintaining their regression tests,” Sargent says. “Automation takes some of the burden off of execution. But by incorporating AI and ML, tests become more adaptable to minor changes. They self-correct or ‘self-heal’ as necessary. That task could take hours for a human to complete, as they work through triage of a failed test.”
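One way to picture “self-healing” is a locator that falls back to the closest matching element when the recorded identifier no longer exists. The sketch below is a simplified assumption of how that could work, using attribute overlap instead of a learned model; the fingerprint fields, the threshold, and the function name are hypothetical.

```python
def find_element(page, fingerprint, threshold=0.5):
    """Return the element best matching the recorded fingerprint, or None if nothing is close."""
    # Fast path: the recorded id still exists on the page.
    for element in page:
        if element.get("id") == fingerprint.get("id"):
            return element

    # Healing path: score each element by the share of recorded attributes that still match.
    def score(element):
        matches = sum(element.get(key) == value for key, value in fingerprint.items())
        return matches / len(fingerprint)

    best = max(page, key=score, default=None)
    if best is not None and score(best) >= threshold:
        return best
    return None


# Attributes recorded when the test was first created (hypothetical values).
recorded = {"id": "buy-btn", "tag": "button", "text": "Buy now", "class": "primary"}

# The page after a release that renamed the button's id.
new_page = [
    {"id": "purchase-btn", "tag": "button", "text": "Buy now", "class": "primary"},
    {"id": "cancel", "tag": "button", "text": "Cancel", "class": "secondary"},
]
print(find_element(new_page, recorded))
```

The threshold matters: set too low, the test silently “heals” onto the wrong element, which is why a human still needs to review what was matched.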
Facebook has used ML in a unique way for regression testing: ML determines which regression tests should be used for any particular code change. Doing so cuts down on the number of regression tests that need to be run. The company says with ML, it only needs to run “a small subset of tests in order to reliably detect faulty changes...enabling us to catch more than 99.9 percent of all regressions before they are visible to other engineers in the trunk code, while running just a third of all tests that transitively depend on modified code. This has allowed us to double the efficiency of our testing infrastructure.”
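The idea of running only the tests likely to catch a given change can be sketched with historical failure rates. Facebook describes a learned model; the version below substitutes a simple frequency table as a stand-in, and the file names, test names, and threshold are hypothetical.

```python
from collections import defaultdict


def failure_rates(history):
    """history: iterable of (changed_file, test_name, failed) tuples from past runs."""
    runs = defaultdict(int)
    fails = defaultdict(int)
    for changed_file, test, failed in history:
        runs[(changed_file, test)] += 1
        fails[(changed_file, test)] += int(failed)
    return {key: fails[key] / runs[key] for key in runs}


def select_tests(changed_files, candidates, rates, threshold=0.1):
    """Keep tests whose estimated failure risk for this change meets the threshold."""
    selected = []
    for test in candidates:
        # Unseen (file, test) pairs default to 1.0 so new tests are never skipped.
        risk = max((rates.get((f, test), 1.0) for f in changed_files), default=0.0)
        if risk >= threshold:
            selected.append(test)
    return selected


history = [
    ("cart.py", "test_cart", True),
    ("cart.py", "test_cart", False),
    ("cart.py", "test_search", False),
    ("cart.py", "test_search", False),
    ("cart.py", "test_login", False),
]
rates = failure_rates(history)
candidates = ["test_cart", "test_search", "test_login", "test_new"]
print(select_tests(["cart.py"], candidates, rates))
```

Here two of the four candidate tests are selected: the one that has failed before when `cart.py` changed, and the one with no history at all.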
Overall, Sargent says, AI and ML excel at the debugging process when you need to “parse out the code and figure out where the steps failed. Where AI and ML help most is in how much capacity they have for processing and calculating, and they take it much further than traditional automation.”
Just as important as knowing when to use AI and ML is knowing when not to use them. They don’t do well at exploratory testing scenarios and similar tasks that require unique human thinking, Sargent says.
“We, right now, are living in a world of narrow intelligence when it comes to AI and ML,” Sargent explains. “That means AI and ML are only as smart as the data they’ve been given and the parameters and the rules they’ve been provided with. Any sort of application or scenario where there's need for a dynamic outcome that isn't easily programmable is just not practical at this stage. Ultimately, the more dynamic and changeable the scenarios and standards you’re using, the more difficulty AI and ML will have making the best decisions.”
Still, Sargent says, when AI and ML are used for what they’re best at, they can evaluate failures quickly, fix them, and re-enter the code into the pipeline.
“That eliminates many of the bottlenecks many organizations are facing today when trying to shorten release cycles,” Sargent says. “So using AI and ML makes a lot of sense, as long as you use it for the tasks for which it’s best suited.”