Automatic regression handling and reporting for the Linux Kernel

Ricardo Cañuelo Navarro
March 14, 2024

Continuing our series about Kernel Integration (check out part 1, part 2, and part 3), this post goes into more detail about how regression detection, processing, and tracking can be improved to provide a better service to developers and maintainers.

Traditionally, regressions are detected automatically by CI systems by running the same test cases on different versions of the software under test (in this case, the Linux kernel) and checking if a test that used to pass starts failing after a specific kernel commit. In the ideal and most straightforward case, this should be enough to point to the commit that introduced the bug. The CI system can then generate a regression report and send it to a mailing list or to the appropriate maintainers and developers, if they can be deduced from the suspicious commit.

In practice, though, very rarely do we find this ideal scenario. There are several circumstances that make this process much harder in different ways:

  • Normally there's no guarantee that there'll be a test run for each repo commit, so most of the time there isn't a single suspicious commit for a reported regression.
  • Tests that involve booting and running a machine are significantly more complicated than tests that simply run a software process in isolation. The more moving parts in a setup, the more things can go wrong, so not all test failures are caused by bugs introduced in the kernel.
  • Test code isn't infallible: it too can contain bugs that may surface for multiple reasons.
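To make the first point concrete, here is a minimal sketch of how a pass-to-fail transition translates into a range of suspicious commits rather than a single one. The `TestRun` type, the use of an integer position in the branch history, and the sample data are all assumptions made for illustration; they don't correspond to the data model of any particular CI system.

```python
from dataclasses import dataclass


@dataclass
class TestRun:
    commit_index: int  # hypothetical: position of the tested commit in the branch history
    passed: bool


def suspect_ranges(runs):
    """Return (last_good, first_bad) index pairs, one per pass-to-fail transition."""
    ordered = sorted(runs, key=lambda r: r.commit_index)
    ranges = []
    for prev, cur in zip(ordered, ordered[1:]):
        if prev.passed and not cur.passed:
            ranges.append((prev.commit_index, cur.commit_index))
    return ranges


# Made-up sample data: only some commits were actually tested.
runs = [TestRun(100, True), TestRun(104, True), TestRun(111, False), TestRun(115, False)]
for good, bad in suspect_ranges(runs):
    # Every commit after `good` up to and including `bad` is a candidate.
    print(f"regression introduced in ({good}, {bad}]: {bad - good} candidate commit(s)")
```

With these sample runs, the last passing run and the first failing run are seven commits apart, so all seven are candidates until someone (or something) narrows the range down.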

As a consequence of these, it's always necessary to have a certain amount of human intervention when reporting a regression to the community. Normally, this human intervention means doing some initial filtering of results, triaging them depending on their importance and feasibility, narrowing down the possible causes, and providing additional information that's not always evident from the initial data provided by the CI system.

There are obvious downsides to this process, the most important of all being that it's not scalable: as the test space grows, more people will be needed to keep up. Automating this process as much as possible is crucial to grow the kernel test ecosystem from a useful tool to an integral and prevalent part of the development workflow.

How can the appropriate tools help us with this task? Here are some ideas:

Post-processing of regression data

The information provided by a CI system about a regression is, most of the time, a snapshot of what happened with that test when it failed. However, further processing of that result and of neighboring results over time can reveal information that's usually hidden to the naked eye. For instance:

  • Detection of unstable tests: when a test is found to fail intermittently over different kernel versions, there's a higher probability that the test is unstable due to a bug in the test code, a timing issue, race conditions, or other external circumstances rather than because a commit introduced a bug in every pass-to-fail transition. Implementing smart filters and heuristics may help detect this type of scenario.
  • Detection of configuration-specific, target-specific or test setup issues: collecting information about similar tests, or about the same test on different kernel configurations, or on different target platforms may highlight if a test failed on a specific scenario that could help a human inspector either filter out possible causes or narrow down the bug investigation.
  • Detection of known patterns in the test output: there's a myriad of possible post-processing options to apply to a test output log to categorize and detect specific issues. These range from the simplest text parsing to find known messages, automatically diagnosing a failure (for example, a failure to boot because of a problem mounting the rootfs, a timeout while waiting for a DHCP request, etc.), to advanced ML-based analysis to profile a bug from a console log so that it can be matched against other known instances of the same (or similar) bug in other regressions.
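As a rough illustration of the simplest end of this spectrum, the sketch below combines two of the ideas above: a flip-rate heuristic to flag a test as possibly unstable, and plain regular expressions to label known failures in a console log. The threshold, the labels, and the patterns are assumptions chosen for the example, not values taken from an existing tool, and real logs would need a much more careful pattern set.

```python
import re


def flip_rate(results):
    """Fraction of consecutive result pairs whose outcome changes."""
    if len(results) < 2:
        return 0.0
    flips = sum(1 for a, b in zip(results, results[1:]) if a != b)
    return flips / (len(results) - 1)


def looks_unstable(results, threshold=0.3):
    # The threshold is an arbitrary value for illustration only.
    return flip_rate(results) >= threshold


# Hypothetical label/pattern pairs; a real deployment would curate these.
KNOWN_PATTERNS = [
    ("rootfs-mount-failure", re.compile(r"VFS: (Cannot open root device|Unable to mount root fs)")),
    ("dhcp-timeout", re.compile(r"dhcp.*timed out", re.IGNORECASE)),
]


def classify_log(log_text):
    """Return the labels of all known patterns found in a console log."""
    return [label for label, pattern in KNOWN_PATTERNS if pattern.search(log_text)]


clean_regression = [True, True, True, False, False, False]
intermittent = [True, False, True, True, False, True]
print(looks_unstable(clean_regression), looks_unstable(intermittent))
print(classify_log("Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)"))
```

A single pass-to-fail transition yields a low flip rate, while a test that alternates between passing and failing yields a high one, which is the signal a filter could use before reporting each transition as a separate regression.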

Tracking the regression's life cycle

Even if the data provided by the CI systems included all of these improvements, there's still the issue of following up on the status of a reported regression.

Regressions are not static entities; they have a well-defined life cycle: they're detected, reported, and investigated, then they're either filed as a non-issue (false positive, intended behavior, etc.) or fixed. The fixing process involves submitting a patch, reviewing it, testing it and, ultimately, merging it and then checking that the regression has cleared up after the patch was merged.
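One way to picture this life cycle is as a small state machine. The sketch below is only an assumption about how such states could be modeled: the state names are simplified (review and testing are folded into the "patch submitted" state), and the backward transitions to "investigating" are our own addition to cover patches that get rejected or don't clear the regression.

```python
from enum import Enum


class State(Enum):
    DETECTED = "detected"
    REPORTED = "reported"
    INVESTIGATING = "investigating"
    NON_ISSUE = "non-issue"
    PATCH_SUBMITTED = "patch submitted"
    PATCH_MERGED = "patch merged"
    VERIFIED_FIXED = "verified fixed"


# Hypothetical transition table; NON_ISSUE and VERIFIED_FIXED are terminal.
ALLOWED = {
    State.DETECTED: {State.REPORTED},
    State.REPORTED: {State.INVESTIGATING},
    State.INVESTIGATING: {State.NON_ISSUE, State.PATCH_SUBMITTED},
    State.PATCH_SUBMITTED: {State.PATCH_MERGED, State.INVESTIGATING},
    State.PATCH_MERGED: {State.VERIFIED_FIXED, State.INVESTIGATING},
    State.NON_ISSUE: set(),
    State.VERIFIED_FIXED: set(),
}


def advance(current, new):
    """Return the new state, or raise if the transition isn't allowed."""
    if new not in ALLOWED[current]:
        raise ValueError(f"cannot go from {current.value!r} to {new.value!r}")
    return new


state = State.DETECTED
for step in (State.REPORTED, State.INVESTIGATING, State.PATCH_SUBMITTED,
             State.PATCH_MERGED, State.VERIFIED_FIXED):
    state = advance(state, step)
print(state.value)
```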

All of this happens with almost no visibility of who's working on what and at which stage of the process a regression is. Thorsten Leemhuis created regzbot to help with this: it keeps track of the status of reported regressions by checking mailing lists and repos automatically. A way to vastly improve this would be to integrate these features into the CI systems themselves so that anyone could get the current status of any discovered regression and update it as needed, answering common user questions like:

  • "Has anyone claimed and started to work on this regression?"
  • "Does this regression have an associated patch submitted already?"
  • "When was this fixed? Where can I find a link to the patch review?"
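As a sketch of what answering those questions could look like, the record below keeps just enough tracking data next to a regression to reply to all three. The `RegressionRecord` type, its field names, and the sample values are hypothetical; they describe neither regzbot's data model nor that of any existing CI system.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class RegressionRecord:
    test_name: str
    assignee: Optional[str] = None    # who claimed the regression, if anyone
    patch_url: Optional[str] = None   # link to the patch review, if a patch was submitted
    fixed_on: Optional[date] = None   # when the fix was confirmed, if it was

    def summary(self) -> str:
        claimed = f"claimed by {self.assignee}" if self.assignee else "unclaimed"
        patch = f"patch at {self.patch_url}" if self.patch_url else "no patch submitted"
        fixed = f"fixed on {self.fixed_on.isoformat()}" if self.fixed_on else "not fixed yet"
        return f"{self.test_name}: {claimed}; {patch}; {fixed}"


# Made-up example values.
record = RegressionRecord("boot-test")
print(record.summary())
record.assignee = "some-developer"
record.patch_url = "https://example.org/patch-review"
print(record.summary())
```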

Better integration of bisection processes with regressions

Bisections are already an important part of many CI systems and they provide an automatic way of pointing to the commit that caused a regression, assuming that the repo history is linear and that the test is stable.

In some cases, however, bisections are triggered and managed as a separate process from testing. Making sure they are fully integrated into the test generation and report infrastructure would make it easier to match a regression with its related bisection process and vice versa. This would allow anyone to check, right after getting a regression report, whether the regression was already bisected and whether there's a good candidate commit to investigate. In the best case, if the test is stable and the bisection process is trustworthy, the results can be automatically reported to the commit author.

Conclusion

As we continue working on kernel regressions, we're still finding ideas for improvements and new features. A big part of the effort is to bring these topics to the community, find a way of providing these features in a manner that's useful for all of us, and align the different projects in the ecosystem toward the same goal. Hopefully we'll get to a point where regression checking as a process is seamlessly integrated into every kernel developer's workflow.
