Regression tests are a crucial part of the Continuous Integration pipeline. By building and executing tests, software developers get feedback that their code is correct and has not regressed the system.
Regression testing works on the presumption that test outcomes are deterministic; in other words, test results are consistent as long as the test and the Code Under Test (CUT) remain unmodified.
However, some tests have non-deterministic outcomes, i.e., they sometimes pass and sometimes fail on the same CUT. These are known as flaky tests. By yielding inconsistent results, flaky tests introduce uncertainty and undermine the reliability of regression testing outcomes and Continuous Integration systems.
Today, flaky tests are most often disregarded and silenced by developers, even though bugs could be lurking in the production code.
As Dijkstra put it: “Program testing can be used to show the presence of bugs, but never to show their absence.”
The idea would be to create a flaky test tracker, incorporated into the CI/CD pipeline, that isolates and detects flaky tests and analyzes where the passed and failed runs diverged, in order to help developers debug them.
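To make the detection part concrete, here is a minimal sketch of flakiness detection by repeated execution. The function names and the rerun-based classification are my own assumptions for illustration; a real tracker would hook into the CI pipeline's test runner rather than call test functions directly.

```python
import random

def run_test(test_fn, reruns=10):
    """Run a test repeatedly and classify it by the set of outcomes seen."""
    outcomes = set()
    for _ in range(reruns):
        try:
            test_fn()
            outcomes.add("pass")
        except AssertionError:
            outcomes.add("fail")
    if outcomes == {"pass"}:
        return "stable-pass"
    if outcomes == {"fail"}:
        return "stable-fail"
    # Both passing and failing runs were observed on the same CUT.
    return "flaky"

def test_deterministic():
    # Same outcome on every run.
    assert 1 + 1 == 2

def test_flaky():
    # Hypothetical flaky test: non-deterministic outcome on an unchanged CUT.
    assert random.random() < 0.5

print(run_test(test_deterministic))          # stable-pass
print(run_test(test_flaky, reruns=100))      # flaky
```

Rerunning is the simplest detection strategy, but it is expensive; in practice a tracker would likely combine it with historical pass/fail data per commit so that only suspicious tests get rerun.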
With AI applied to the problem, the tracker could also suggest likely root causes, such as test order dependency, asynchronous behavior, or I/O-related issues, helping developers find the errors and improve their code base.
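As an example of one of those root causes, here is a hypothetical illustration of test order dependency: `test_b` only passes if `test_a` ran first and populated shared state, so the suite flakes whenever the runner shuffles test order.

```python
# Hypothetical shared mutable state leaking between tests.
shared_cache = {}

def test_a():
    shared_cache["user"] = "alice"
    assert shared_cache["user"] == "alice"

def test_b():
    # Implicitly depends on test_a having populated the cache;
    # fails when run in isolation or in a different order.
    assert shared_cache.get("user") == "alice"
```

A tracker that records the execution order of passing and failing runs could surface exactly this kind of divergence to the developer.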
What do you guys think? Is this something other developers would like to see and use in their companies?
// Sokrates