Mesa CI and the power of pre-merge testing

Deborah Brouwer
October 08, 2024

Mesa is an open source 3D graphics library that implements a wide range of APIs including OpenGL, OpenGL ES, Vulkan, OpenCL, and hardware-acceleration interfaces like VDPAU and VA-API. It provides hardware drivers for many vendors: AMD, ARM, Broadcom, Imagination, Intel, Qualcomm, NVIDIA, and Vivante/VeriSilicon. It supports layered drivers to run APIs on top of other APIs and in virtualized environments. With software drivers like Lavapipe and LLVMpipe, it can run graphics on CPUs without dedicated GPU hardware.

Not only is Mesa large and complex, but development happens very quickly. Stable versions with bug fixes are released about every two weeks and development versions are released every three months with thousands of new commits. Mandatory and reliable pre-merge testing is essential to Mesa’s rapid development model. Its continuous integration (CI) system has been so successful that it now provides scripts and setup for DRM-CI to carry out pre-merge testing for the DRM subsystem of the Linux kernel.

So, what do we mean by pre-merge testing? Mesa developers work in parallel in their own forks of the Mesa repository hosted by Freedesktop. Once a change is ready for community review, a developer opens a merge request against the main branch of Mesa. The affected codeowners will discuss, acknowledge, and/or review the changes. Then, after the review is complete, anyone with a "Developer" or higher role in the project can initiate a merge by assigning the merge request to Mesa's marge-bot.

Marge-bot will ensure that the new code passes CI testing before any changes are merged. This pre-merge testing is available on demand, 24/7, and provides access to real hardware hosted in CI farms distributed around the world. Given the critical role of CI in Mesa's development, supporting it is a large community effort involving many companies and developers. The members of Collabora’s Mesa CI team (Antonio Ospite, Daniel Stone, Deborah Brouwer, Guilherme Alcarde Gallo, Sergi Blanch Torne, and Vignesh Raman) are dedicated to this community effort of keeping Mesa CI running.

Developer forks

Let's take a closer look at how pre-merge testing is implemented for Mesa. Say, for example, a developer has forked the Mesa repository and made a change to Iris, Intel’s Gallium driver. Assuming the developer has configured their fork to use the same CI/CD settings as Mesa itself, pushing a commit will automatically create a new pipeline in their repository. Opening the pipeline tab may be disappointing, however, as the pipeline sits empty and grey. The developer has encountered the first line of defense protecting Mesa’s CI infrastructure: developers need permission to run Mesa’s CI pipelines. Access to the infrastructure is controlled by membership in the CI-OK group. Running a pipeline essentially gives a developer free access to run arbitrary code on any of Freedesktop's GitLab runners that are registered with the Mesa project, so these resources need to be protected from abuse.

However, even once the developer has permission to run a pipeline, pushing changes to a branch will still not start any CI activity. All of the jobs sit dormant, waiting for manual action by the developer. This is another mechanism designed to protect Mesa’s CI resources, since not every code change, particularly during development, needs to trigger CI jobs immediately. Developers may push code changes to share and collaborate during development, or just to save their work against local failures, and it would be a waste of resources to run pipelines automatically on development forks. If the developer does want to run some of the CI jobs on their fork, the GitLab user interface provides a button to start each job manually.

[Figure: the debian/x86_64_build-base job running after the developer starts it manually.]

This manual action works fine for jobs that don’t depend on other jobs, but a closer look at the pipeline shows that many of the jobs simply can’t be started manually. These jobs only become available after a series of other jobs has completed successfully. Often the most interesting jobs are driver-specific and sit at the end of several such chains. If our hypothetical developer is working on a change to the Iris driver, they might want to make sure that they haven’t caused any regressions on devices running the Gemini Lake platform. The iris-glk-deqp job will run dEQP quality and conformance tests on an HP Chromebook located in Collabora’s LAVA lab, but first all of these prerequisite jobs need to be run:

alpine/x86_64_lava_ssh_client → iris-glk-deqp
debian/x86_64_build-base → kernel+rootfs_x86_64 → iris-glk-deqp
debian/x86_64_build-base → debian/x86_64_build → rustfmt → iris-glk-deqp
debian/x86_64_build-base → debian/x86_64_build → toml-lint → iris-glk-deqp
debian/x86_64_build-base → debian/x86_64_build → debian-testing → iris-glk-deqp
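
The chains above form a small dependency graph. As a rough illustration of what "run a target and only the jobs it needs" means, here is a minimal Python sketch that walks such a graph and returns the target together with its prerequisites, ordered so that each job appears after the jobs it waits for. The `NEEDS` dictionary and the `required_jobs` function are hypothetical names made up for this example; only the job names come from the chains above, and this is not how Mesa's tooling is actually implemented.

```python
# Hypothetical dependency map built from the chains listed above:
# each job maps to the jobs that must finish before it can start.
NEEDS = {
    "iris-glk-deqp": [
        "alpine/x86_64_lava_ssh_client",
        "kernel+rootfs_x86_64",
        "rustfmt",
        "toml-lint",
        "debian-testing",
    ],
    "kernel+rootfs_x86_64": ["debian/x86_64_build-base"],
    "rustfmt": ["debian/x86_64_build"],
    "toml-lint": ["debian/x86_64_build"],
    "debian-testing": ["debian/x86_64_build"],
    "debian/x86_64_build": ["debian/x86_64_build-base"],
    "debian/x86_64_build-base": [],
    "alpine/x86_64_lava_ssh_client": [],
}


def required_jobs(target, needs):
    """Return the target and all of its transitive prerequisites.

    The result is ordered so that every job comes after the jobs it
    depends on (a depth-first post-order walk of the graph).
    """
    ordered = []
    seen = set()

    def visit(job):
        if job in seen:
            return
        seen.add(job)
        for dependency in needs.get(job, []):
            visit(dependency)
        ordered.append(job)

    visit(target)
    return ordered


if __name__ == "__main__":
    for job in required_jobs("iris-glk-deqp", NEEDS):
        print(job)
```

For the iris-glk-deqp target this yields eight jobs in total: the target itself plus seven prerequisites, with debian/x86_64_build-base ahead of everything that builds on it.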

Since it is very dull to click through the user interface and wait for each of these jobs to complete, developers can use the command-line tool ci_run_n_monitor to automatically start one or more target jobs along with only the jobs those targets depend on. A simple command might be:

```bash
./bin/ci/ci_run_n_monitor.sh \
    --pipeline-url https://gitlab.freedesktop.org/dbrouwer/mesa/-/pipelines/1260557 \
    --target "iris-glk-deqp"
```

This command generates a nicely pared-down pipeline in which only the jobs essential to the targeted tests are running.

Since pipelines on developer forks indiscriminately include jobs from the entire Mesa project, they will contain jobs unrelated to the change, but the ci_run_n_monitor tool makes it easy for developers to target just the jobs that they want to test.

Merge requests

Once a developer opens a merge request against the main Mesa repository, the CI system is more selective about which jobs to add to the pipeline, because all of these jobs will need to pass before a proposed change is merged into Mesa. While the pipeline on a developer's fork will indiscriminately include about 250 jobs from the entire Mesa project, the pre-merge pipeline for a change to an Intel driver includes only about 85 jobs, covering Intel hardware and the build environment that supports it.

[Figure: the pre-merge pipeline for the same changes as above.]

Selecting only the essential jobs is necessary for efficient development, but it's not sufficient. The goal is to complete CI testing quickly enough to keep pace with Mesa development, so merge requests can't be allowed to back up while waiting for CI to run. Even with a reduced job set such as the one above, the sheer number of tests would quickly overwhelm the CI infrastructure. For example, the single iris-glk-deqp job runs dEQP and Khronos Conformance tests for OpenGL ES 2.0, 3.0, 3.1, and OpenGL 4.6, for a total of over 130 thousand discrete tests.

One method of reducing the pre-merge run time is to run only a fraction of the tests available in the caselists. The iris-glk-deqp job sends only one of every six possible tests for OpenGL ES 2.0, one of every eight tests for OpenGL ES 3.0 and 3.1, and every other test for the remaining standards. Mesa CI also uses the deqp-runner tool to parallelize tests across a single system. The deqp-runner tool itself accepts fractional arguments, and in the case of iris-glk-deqp it runs every other test, ultimately winnowing the run down to about 6 minutes for 14 thousand tests. Furthermore, as long as sufficient CI farm resources are available, most jobs can run in parallel across different machines. So while iris-glk-deqp is running, at least 15 other jobs are running alongside it.
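
To make the fractional selection concrete, here is a minimal Python sketch of "keep one of every N tests", applied in two stages as described above: first when the caselist is prepared for the job, then again by the runner. The function `select_fraction`, its `start` parameter, and the test names are hypothetical and exist only for this illustration; this is not deqp-runner's actual implementation or command-line interface.

```python
def select_fraction(tests, fraction, start=1):
    """Keep every `fraction`-th test, beginning with the `start`-th one.

    `start` is 1-based, so start=1 keeps the first test, then the
    (1 + fraction)-th, and so on.
    """
    if fraction < 1:
        raise ValueError("fraction must be at least 1")
    if not 1 <= start <= fraction:
        raise ValueError("start must be between 1 and fraction")
    return tests[start - 1::fraction]


if __name__ == "__main__":
    # Made-up caselist of 48 test names, standing in for a real one.
    caselist = [f"dEQP-GLES2.hypothetical.case_{i}" for i in range(48)]

    # Stage 1: the job keeps one of every six OpenGL ES 2.0 tests.
    sent_to_job = select_fraction(caselist, 6)

    # Stage 2: the runner keeps every other test of what it was given.
    actually_run = select_fraction(sent_to_job, 2)

    print(len(caselist), len(sent_to_job), len(actually_run))
```

The two stages multiply: one in six followed by one in two leaves one test in twelve. Selecting by stride instead of truncating the list keeps the sample spread across the whole caselist, so every group of tests continues to get some coverage.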

The run time for a pre-merge pipeline varies depending on the scope of the proposed changes and the number of jobs that need to be run. Changes that affect all of Mesa or revise the underlying structure of the CI system will take longer than usual, whereas small changes to drivers may finish quickly. In general, the pre-merge pipelines run quickly; for example, in the last week, the average run time for a pre-merge pipeline was 34 minutes, plus or minus 15 minutes.

Conclusion

A more traditional open source development model may rely on a hierarchy of maintainers to review and manually accept code changes, trusting that developers will fix what they have broken. The power of pre-merge testing is to distribute and, to some extent, parallelize the process of contributing to the code base. Mesa CI allows developers to work continuously in tandem, relying on objective testing to protect against significant regressions and avoiding the bottleneck of a manual merge process. Pre-merge testing in Mesa's CI system ensures that every contribution is rigorously tested before merging. Mesa CI works silently in the background, keeping everything running smoothly across various hardware configurations.

Here we have just scratched the surface of the Mesa CI system. A whole series of posts is planned to describe the complexity of how Mesa uses templates, containers, compression, and stored images at each stage of the CI pipeline. The Collabora team is always working to improve Mesa CI, continuously monitoring performance, adding new hardware, and improving tools to make the CI system easier to use and more efficient. We look forward to sharing more with you in the future.
