Alexandros Frantzis
October 18, 2018
A few times in the recent past I've been in the unfortunate position of using a prominent Free and Open Source Software (FOSS) program or library and running into issues so fundamental that they made me wonder how they ever made it into a release.
In all cases the answer came quickly: invariably, the project involved either had no test suite at all, or had one that was not adequately comprehensive.
I am using the term comprehensive in a very practical, non-extreme way. I understand that it's often not feasible to test every possible scenario and interaction, but, at the very least, a decent test suite should ensure that under typical circumstances the code delivers all the functionality it promises.
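To make this concrete, here is a minimal sketch of what "delivering the functionality it promises" can look like as a test, using Python's standard unittest module; the normalize_path utility and its behavior are hypothetical, invented purely for illustration:

    import unittest

    def normalize_path(path):
        """Hypothetical utility: collapse repeated slashes, drop a trailing slash."""
        while "//" in path:
            path = path.replace("//", "/")
        return path.rstrip("/") or "/"

    class TestNormalizePath(unittest.TestCase):
        # Each test pins down one piece of promised behavior under typical use.
        def test_collapses_repeated_slashes(self):
            self.assertEqual(normalize_path("/usr//local///bin"), "/usr/local/bin")

        def test_strips_trailing_slash(self):
            self.assertEqual(normalize_path("/etc/"), "/etc")

        def test_preserves_root(self):
            self.assertEqual(normalize_path("/"), "/")

    if __name__ == "__main__":
        unittest.main()

Even a handful of such tests documents the promised behavior and catches regressions the moment they are introduced.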
For projects of any value and significance, having such a comprehensive automated test suite is nowadays considered a standard software engineering practice. Why, then, don't we see more prominent FOSS projects employing this practice, or, when they do, why is it often employed poorly?
In this post I will highlight some of the reasons that I believe play a role in the low adoption of proper automated testing in FOSS projects, and argue why these reasons may be misguided. I will focus on topics that are especially relevant from a FOSS perspective, omitting considerations which, although important, are not particular to FOSS.
My hope is that by shedding some light on this topic, more FOSS projects will consider employing an automated test suite.
As you can imagine, I am a strong proponent of automated testing, but this doesn't mean I consider it a silver bullet. I do believe, however, that it is an indispensable tool in the software engineering toolbox, one that should only be forsaken after careful consideration.
1. Lack of formal obligations

Most FOSS projects, at least those not backed by a commercial entity, come with no warranty; this is even stated explicitly in the various licenses! The lack of any formal obligations makes it relatively inexpensive, in terms of both time and money, to have the occasional bug in the codebase. This means that there are fewer incentives for developers to spend extra resources trying to safeguard against bugs. When bugs do come up, the developers can decide at their leisure if and when to fix them and when to release the fixed version. Easy!
At first sight, this may seem like a reasonably pragmatic attitude to have. After all, if fixing bugs is so cheap, is it worth spending extra resources trying to prevent them?
Unfortunately, bugs are only cheap for the developer, not for the users who may depend on the project for important tasks. Users expect the code to work properly and can get frustrated or disappointed if this is not the case, regardless of whether there is any formal warranty. This is even more pronounced when security concerns are involved, for which the cost to users can be devastating.
Of course, lack of formal obligations doesn't mean that there is no driver for quality in FOSS projects. On the contrary, there is an exceptionally strong driver: professional pride. In FOSS projects the developers are in the spotlight, and no (decent) developer wants to be associated with a low-quality, bug-infested codebase. It's just that, due to the mentality described above, the trade-offs developers make in many FOSS projects seem to favor a reactive rather than a proactive attitude.
2. Over-reliance on code reviews

One of the development practices FOSS projects employ most ardently is code review. Code reviews happen naturally in FOSS projects, even small ones, since most contributors don't have commit access to the code repository and the original author has to approve any contribution. In larger projects there are often more structured procedures, involving patches sent to a mailing list or to a dedicated review platform. Unfortunately, in some projects the trust placed in code reviews is so great that other practices, like automated testing, are forsaken.
There is no question that code reviews are one of the best ways to maintain and improve the quality of a codebase. They can help ensure that code is designed properly, aligned with the overall architecture, and in line with the long-term goals of the project. They also help catch bugs, but only some of them, some of the time!
The main problem with code reviews is that we, the reviewers, are only human. We are great at creative thought, but we are also great at overlooking things, occasionally filling in the gaps with our own unicorns-and-rainbows-inspired reality. We also tend to focus on code changes at the local level, and less on how those changes affect the system as a whole. This is not an inherent problem with the process itself, but rather a limitation of the humans performing it. When a codebase gets large enough, it's difficult for our brains to keep all the possible states and code paths in mind and check them mentally, even in a codebase that is properly designed.
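To illustrate, here is a contrived Python sketch (all names are hypothetical) of the kind of bug that slips past a diff-focused review: each method looks reasonable in isolation, but a test that walks the state space exposes the gap.

    import unittest

    class Connection:
        """Toy state machine: "open" -> "closing" -> "closed"."""
        def __init__(self):
            self.state = "open"

        def close(self):
            self.state = "closing" if self.state == "open" else "closed"

        def send(self, data):
            # Reviewed locally, this check looks fine, but it silently
            # accepts writes in the "closing" state.
            if self.state == "closed":
                raise RuntimeError("connection closed")
            return len(data)

    class TestConnection(unittest.TestCase):
        def test_send_rejected_after_close(self):
            conn = Connection()
            conn.close()  # state is now "closing"
            with self.assertRaises(RuntimeError):
                conn.send(b"payload")  # fails: "closing" still accepts writes

A reviewer reading only the diff that added send() has little chance of spotting this; a test that exercises the states systematically finds it immediately.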
In theory, the problem of human limitations is offset by the open nature of the code. We even have the so-called Linus's law, which states that "given enough eyeballs, all bugs are shallow". Note the clever use of the indeterminate term "enough". How many are enough? And what about the qualitative aspects of those "eyeballs"?
The reality is that most contributions to big, successful FOSS projects are reviewed by, on average, a couple of people. Some projects do better, most do worse, but in no case does being FOSS magically lead to a large number of reviewers tirelessly checking code contributions. This limit on the number of reviewers also limits the extent to which code reviews can stand as the only process for ensuring quality.
3. Negative network effects

In order to try out a development process in a project, developers first need to learn about it and be convinced that it will be beneficial. Although there are many resources, like books and articles, arguing in favor of automated tests, the main driver for trying new processes is still learning about them from more experienced developers while working on a project. In the FOSS world this also takes the form of studying what other projects, especially high-profile ones, are doing.
Since comprehensive automated testing is not the norm in FOSS, this creates a negative network effect: why should you bother writing automated tests if the high-profile projects you consider role models don't do it properly, or at all?
Thankfully, the culture is beginning to shift, especially in projects using technologies in which automated testing is part of the culture of the technology itself. Unfortunately, many system-level and middleware FOSS projects still live in a world without automated tests.
4. Tests as an afterthought

Tests as an afterthought is not a situation particular to FOSS projects, but it is especially relevant to them, since the way they spring up and grow can disincentivize writing tests early.
Some FOSS projects start as small projects to scratch an itch, without any plans for significant growth or adoption, so the incentives to have tests at this stage are limited.
In addition, many projects, even ones that start with loftier adoption goals, follow a "release early, release often" mentality. This mentality has some benefits, but at the early stages it also carries the risk of focusing exclusively on making the project as relevant to the public as possible, as quickly as possible. From such a perspective, spending the probably limited resources on tests instead of features seems like a bad use of developer time.
As the project grows and becomes more complex, however, more and more opportunities for bugs arise. At this point, some projects realize that adding a test suite would be beneficial for maintaining quality in the long term. Unfortunately, for many projects it's already too late: the code has become test-unfriendly, and significant effort is needed to change that.
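As a sketch of what that effort looks like (in Python, with hypothetical names), consider code that reaches directly for global state: before it can be tested in isolation, it has to be restructured to accept its dependencies.

    import time

    def collect_metrics(timestamp):
        """Hypothetical helper: format a metrics line for a timestamp."""
        return ("uptime %d" % timestamp).encode()

    # Test-unfriendly: hidden dependencies on a real server and the real
    # clock make it impossible to exercise this function in isolation.
    def publish_metrics_unfriendly():
        sock = connect_to_production_server()  # hypothetical global helper
        sock.send(collect_metrics(time.time()))

    # Test-friendly: the same logic, with its dependencies passed in.
    def publish_metrics(sock, clock):
        sock.send(collect_metrics(clock()))

    # A test can now substitute a fake socket and a fixed clock.
    class FakeSocket:
        def __init__(self):
            self.sent = []
        def send(self, data):
            self.sent.append(data)

    def test_publish_metrics():
        sock = FakeSocket()
        publish_metrics(sock, clock=lambda: 42)
        assert sock.sent == [b"uptime 42"]

Retrofitting this structure onto a large codebase that was never designed for it is exactly the significant effort that discourages many projects.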
The net effect is that many projects remain without an automated test suite or, in the best case, with a poor one.
5. Limited access to CI services

Automated testing delivers the most value when combined with a continuous integration (CI) service that runs the tests automatically for each commit or merge proposal. Until recently, such services were difficult to get at a reasonably low effort and cost: developers either had to set up and host CI themselves or pay for a commercial service, requiring resources that unsponsored FOSS projects were unlikely to be able to afford.
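As an illustration, assuming a Python project that keeps its tests under a tests/ directory, the entire CI job can reduce to one entry point whose exit status tells the service whether to accept or block the commit or merge proposal; a minimal sketch:

    import sys
    import unittest

    # Discover and run every test under ./tests; exit non-zero on failure
    # so the CI service can reject the commit or merge proposal.
    suite = unittest.defaultTestLoader.discover("tests")
    result = unittest.TextTestRunner(verbosity=2).run(suite)
    sys.exit(0 if result.wasSuccessful() else 1)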
Nowadays, it's far easier to find and use free CI services, and most major code hosting platforms support them. Hopefully, with time, this will completely cease being a factor in the low adoption of automated testing.
6. Perceived incompatibility with hacker culture

The FOSS movement originated in the hacker culture and still has strong ties to it. In the minds of some, the processes around software testing are too enterprise-y, too 9-to-5, and completely contrary to the creative and playful nature of hacking.
My argument against this line of thought is that hacker culture values technical excellence very highly, and automated testing, as a tool that helps achieve such excellence, cannot be inconsistent with the hacker way.
Some pseudo-hackers may also argue that their skills are so refined that their code doesn't require testing. When we are talking about a codebase of any significant size, I consider this attitude a sign of inexperience and immaturity rather than a testament to superior skill.
I hope this post will serve as a good starting point for a discussion about the reasons that discourage FOSS projects from adopting a comprehensive automated test suite. Identifying both valid concerns and misconceptions is the first step in convincing both fledgling and mature FOSS projects to embrace automated testing, which will hopefully lead to an improvement in the overall quality of FOSS.
Visit Alexandros' blog.
Comments
Dimitrios:
Jan 03, 2019 at 03:26 PM
7. Automation is hard and not fun
Designing proper test cases for a feature can be harder than implementing the feature itself. For one, the test scenarios can themselves be incomplete or incorrect, leading to a false sense of security. Second, testing in isolation is often hard because it requires an architecture that can support it. Architecture is something that encompasses a project in its entirety; it needs to be provisioned for, and some kind of authority must actively enforce it.
Susan Robinson:
Mar 09, 2024 at 06:12 AM
Intriguing insights on the challenges surrounding automated testing adoption in FOSS.