A journey towards reliable testing in the Linux Kernel

Laura Nao
August 01, 2024

Over the past year, we at Collabora have embarked on a journey to improve Linux kernel integration for everyone. A key part of that work is improving test quality, both by developing new upstream tests and by refining automated testing processes.

A significant portion of testing in the Linux kernel remains manual or only minimally automated, with many subsystems still lacking automated test coverage. This raised some critical questions: How can we alleviate the maintainers' burden through continuous integration (CI) testing? How can we better support developers by detecting and reporting regressions?

Driven by these questions, we started an effort to make CI systems more trustworthy and actively engage the upstream community in the testing process. Ultimately, focusing on test quality is key: good tests lead to reliable reports, which are essential for a strong CI process.

Collabora has been heavily involved in KernelCI over the years, working on its infrastructure, creating new tests, and operating a dedicated lab that runs tests on specific hardware platforms for our clients. During this time, we found many issues related to the quality of the tests. With the recent launch of the new KernelCI infrastructure, we saw a chance to make improvements and focus on test reliability.

In our experience, tests that are poorly maintained or depend on an unstable ABI often lead to false positives. Another major issue in kernel testing is fragmentation, where each subsystem operates its own CI. While different subsystems have specific CI requirements, essential aspects can be unified by using common, in-tree tests like kselftests instead of standalone ones.

To address these challenges, we developed a plan to improve the quality of tests in the new KernelCI system. Rather than just enabling more and more tests, we concentrated on making sure that the tests meet some important quality criteria:

Coverage: Tests need to cover a wide range of functionalities
Speed/Efficiency: Tests should run quickly to avoid tying up lab devices
Portability: Tests should work on different hardware architectures and kernel versions
Consistency: Tests should give consistent results when run multiple times
Maintainability: Tests should be easy to keep up-to-date over time
Output Format Compliance: Tests should follow recognized standards for their output (e.g. KTAP/KTAPv2 for kselftests)
Community Adoption: Tests should be supported by the community, with a preference for tests integrated into the kernel tree

By focusing on tests that can be merged into the kernel tree, we grow a base of reliable tests that can be used for multiple CI systems.
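To make the output format criterion concrete, here is a minimal sketch of a reporter that prints results in KTAP version 1 style: a version line, a plan line, then one `ok`/`not ok` line per test. The `format_ktap` helper and its result tuples are hypothetical names invented for this illustration and are not part of kselftest or KernelCI.

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical result record for this sketch: (test name, outcome, skip reason).
# The outcome is one of "pass", "fail" or "skip".
Result = Tuple[str, str, Optional[str]]


def format_ktap(results: Iterable[Result]) -> List[str]:
    """Render test results as KTAP version 1 lines."""
    results = list(results)
    # Version line first, then the plan line announcing how many tests follow.
    lines = ["KTAP version 1", f"1..{len(results)}"]
    for number, (name, outcome, reason) in enumerate(results, start=1):
        if outcome not in ("pass", "fail", "skip"):
            raise ValueError(f"unknown outcome: {outcome!r}")
        # Skipped tests are reported as "ok" with a SKIP directive.
        status = "not ok" if outcome == "fail" else "ok"
        line = f"{status} {number} {name}"
        if outcome == "skip":
            line += " # SKIP" + (f" {reason}" if reason else "")
        lines.append(line)
    return lines


if __name__ == "__main__":
    # The test names below are made up for the example.
    for ktap_line in format_ktap([
        ("probe_uart", "pass", None),
        ("probe_i2c", "fail", None),
        ("probe_gpu", "skip", "device not present"),
    ]):
        print(ktap_line)
```

Output in a predictable shape like this is what lets a CI system parse results from many different tests with a single parser.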

Our initial focus was on the bootrr test, a sanity checker for boards under automated test on LAVA, which was generating many false positives in the legacy KernelCI system. This test relies on static descriptions of the hardware and drivers of the device under test (DUT) and therefore requires regular maintenance, especially since driver names can change over time. One goal of bootrr is to check whether the peripherals on the DUT are correctly bound to their drivers. This can largely be done by using information from the device tree and ACPI tables, instead of manually describing the hardware. You can find out more about this test in our previous blog post.
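As a rough illustration of the idea, and not the implementation of the upstream tests, the sketch below walks a sysfs device tree and lists devices that have a Devicetree node (an `of_node` link) but no `driver` link. The function name `find_unbound_dt_devices` is hypothetical, and the check is deliberately simplified: a real test also has to account for nodes that are disabled or that legitimately have no driver.

```python
import os
from typing import List


def find_unbound_dt_devices(sysfs_devices: str = "/sys/devices") -> List[str]:
    """Return device directories that have an of_node link but no driver link."""
    unbound = []
    # os.walk does not follow symlinks by default, so the walk stays inside
    # the device hierarchy instead of wandering into link targets.
    for dirpath, _dirnames, _filenames in os.walk(sysfs_devices):
        has_dt_node = os.path.lexists(os.path.join(dirpath, "of_node"))
        has_driver = os.path.lexists(os.path.join(dirpath, "driver"))
        if has_dt_node and not has_driver:
            unbound.append(dirpath)
    return sorted(unbound)


if __name__ == "__main__":
    for device in find_unbound_dt_devices():
        print(f"no driver bound: {device}")
```

Because the information comes from the running kernel rather than from a hand-written board description, nothing in this kind of check needs updating when a driver is renamed.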

This first test set the stage for creating more tests with the same approach: using generic kernel interfaces to provide extensive test coverage and reduce maintenance over time.

You can find a detailed summary of the tests we developed at the end of this post.

In addition to introducing new tests, we also focused on evaluating the quality of existing kselftests. Some upstream tests have been around for a while but still lack features they need to run entirely non-interactively. The suspend/resume test within the cpufreq selftest is one example: after suspend has been invoked, the test relies on an external wakeup event to resume. By adding RTC wakeup alarm support to the test, we can ensure it works in a CI environment without needing manual intervention or prior configuration. Our goal is to ensure the existing tests run well when integrated into a CI, to document all configuration dependencies, and to make the output conform to the KTAP format.
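The RTC wakeup approach can be sketched as follows. This is an illustration under stated assumptions rather than the code from the cpufreq patch: it assumes the RTC is exposed as `rtc0`, that the platform supports the `mem` sleep state, and that the script runs as root. The helper names are hypothetical.

```python
def arm_rtc_wakealarm(seconds: int,
                      wakealarm_path: str = "/sys/class/rtc/rtc0/wakealarm") -> None:
    """Program the RTC to generate a wakeup event `seconds` from now."""
    if seconds <= 0:
        raise ValueError("the wakeup delay must be positive")
    # Clear any alarm that is already pending before setting a new one.
    with open(wakealarm_path, "w") as wakealarm:
        wakealarm.write("0")
    # A leading "+" means "seconds in the future" rather than an absolute time.
    with open(wakealarm_path, "w") as wakealarm:
        wakealarm.write(f"+{seconds}")


def suspend_with_rtc_wakeup(seconds: int,
                            state: str = "mem",
                            wakealarm_path: str = "/sys/class/rtc/rtc0/wakealarm",
                            power_state_path: str = "/sys/power/state") -> None:
    """Arm the RTC alarm, then ask the kernel to enter the given sleep state."""
    arm_rtc_wakealarm(seconds, wakealarm_path)
    # On a real system this write returns only after the machine has resumed.
    with open(power_state_path, "w") as power_state:
        power_state.write(state)
```

Arming the alarm before suspending removes the dependency on someone pressing a key or a power button, which is what makes the test usable on unattended lab devices.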

All this work has already shown positive results by identifying failures and regressions. We will keep building on this to make testing in KernelCI even more reliable and effective.

You can follow our progress, including patch series and regression reports, here.

KernelCI hosts a bi-weekly call on Thursdays to discuss improvements to existing upstream tests, the development of new tests to increase kernel testing coverage, and the enablement of these tests in KernelCI. Minutes from the meetings are sent to the KernelCI mailing list kernelci@lists.linux.dev (see the notes). Reach out to us if you want to join the discussion or talk more about any of these topics. We look forward to working with the community to improve upstream tests and expand coverage to more areas of the kernel.

Join us at LPC in September to talk about generic device testing and boot time testing:

For more details, check out the links below:

Summary

Here's an overview of the tests we've been developing and contributing to upstream. All these tests have been enabled in the new KernelCI system and have shown their value by identifying failures and regressions in the mainline and linux-next kernels.

DT kselftest - [PATCH v3 0/3] Add a test to catch unprobed Devicetree devices
- Available in tools/testing/selftests/dt since v6.6
- Example regressions found with this test:
ACPI kselftest - [RFC PATCH v2 0/2] Add a test to verify device probing on ACPI platforms
- Example regression found with this test:
  - [REGRESSION] probe with driver acpi-fan failed with error -22
Discoverable devices kselftest - [PATCH v4 0/3] Add test to verify probe of devices from discoverable buses
- Available in tools/testing/selftests/devices/probe since v6.8
Error log test - [PATCH v2 0/3] kselftest: Add test to report device log errors
- Available in tools/testing/selftests/devices/error_logs/ starting from v6.11-rc1
- Example issues detected by this test:
  - [PATCH] cpufreq: mediatek: Use dev_err_probe in every error path in probe
  - [PATCH] remoteproc: mediatek: Don't attempt to remap l1tcm memory if missing
Boot time test - [RFC PATCH 0/1] Add kselftest to detect boot event slowdowns
- A proof of concept will be presented at LPC 2024 later this year.
KTAP output conformance - [PATCH v6] selftests/dmabuf-heap: conform test to TAP format output
Missing dependencies - [PATCH] selftests: iommu: add config needed for iommufd_fail_nth
Suspend/resume test in cpufreq selftest - [PATCH v2] kselftest: cpufreq: Add RTC wakeup alarm
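To give a feel for what the error log test in the list above looks for, here is a minimal sketch that parses records in the `/dev/kmsg` format and keeps those that are at error level or worse and carry a `DEVICE=` field. It is not the upstream test's implementation. The names `KmsgRecord`, `parse_kmsg_record` and `device_errors` are hypothetical, and the sketch assumes each record arrives as a single string with its continuation lines attached.

```python
from typing import Dict, Iterable, List, NamedTuple

KERN_ERR = 3  # Kernel log levels 0..3 are emerg, alert, crit and err.


class KmsgRecord(NamedTuple):
    level: int
    message: str
    fields: Dict[str, str]


def parse_kmsg_record(record: str) -> KmsgRecord:
    """Parse one record of the form 'prio,seq,timestamp,flags;message'."""
    first_line, *continuation = record.rstrip("\n").split("\n")
    header, _, message = first_line.partition(";")
    priority = int(header.split(",")[0])
    # The low three bits of the priority hold the log level; the rest is
    # the syslog facility.
    level = priority & 7
    fields = {}
    # Continuation lines start with a space and carry KEY=value metadata.
    for line in continuation:
        key, separator, value = line.strip().partition("=")
        if separator:
            fields[key] = value
    return KmsgRecord(level, message, fields)


def device_errors(records: Iterable[str],
                  max_level: int = KERN_ERR) -> List[KmsgRecord]:
    """Keep records at max_level or more severe that name a device."""
    parsed = (parse_kmsg_record(record) for record in records)
    return [r for r in parsed if r.level <= max_level and "DEVICE" in r.fields]
```

Filtering on the device metadata rather than on message text keeps the check generic, so it does not need a per-board list of expected messages.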

Automatic regression handling and reporting for the Linux Kernel

DRM-CI: A GitLab-CI pipeline for Linux kernel testing

A new kselftest for verifying driver probe of Devicetree-based platforms
