Adrian Ratiu
May 06, 2019
In part 1 and part 2 of this series we looked at the inner-workings of the eBPF Virtual Machine, and in part 3 we studied the mainstream way of developing and using eBPF programs on top of the low-level VM mechanisms.
In this part we'll look at projects that take different approaches, attempting to solve some of the problems unique to embedded Linux systems. These systems often require very small custom OS images which can't fit a full BCC LLVM toolchain/python install. Their developers also want to avoid maintaining both a cross-compilation (native) toolchain for host machines and a cross-compiled target compiler toolchain, together with the associated build logic, which is non-trivial even with advanced build systems like OpenEmbedded/Yocto.
In the mainstream way of running eBPF/BCC programs studied in part 3, portability is less of a problem than it is on embedded devices: eBPF programs are compiled on the same machine on which they'll be loaded, using the already-running kernel, with headers easily available via the distribution package managers. Embedded systems typically run different Linux distributions and different processor architectures than developer computers, sometimes with heavily modified or upstream-divergent kernels, with big variations in build configs, or tainted with binary-only modules.
The eBPF VM bytecode is generic, not machine specific, so moving eBPF bytecode from x86_64 to an ARM device won't cause too many problems once you get it compiled. Problems instead start when the bytecode prods at kernel functions and data structures which might be different or not exist in the target device's kernel, so at least the target device's kernel headers must be present on the host machine building the eBPF program bytecode. New features or eBPF instructions might also be added in later kernels which can make eBPF bytecode forward compatible, but not backward compatible between kernel versions (see eBPF features by kernel version). Attaching eBPF programs to stable kernel ABIs like tracepoints is recommended to ease portability in general.
Recently a significant effort has begun to increase portability of eBPF programs (compile-once run-everywhere) by embedding data-typing information in the LLVM generated eBPF object code by adding BTF (BTF Type Format) data. See this patch and this article for more information. This is significant because it touches all parts of the eBPF software stack (kernel VM and verifier, clang/LLVM compiler, BCC and so on) and can have a big payoff, allowing reuse of existing BCC tools without requiring special eBPF cross-compilation, installing LLVM on the embedded device or running BPFd. As of this writing, the compile-once run-everywhere BTF effort is still in early development stages and significant efforts are needed until it becomes usable. Maybe we'll create a blog post once it is ready.
BPFd was more of a proof-of-concept developed for Android devices which got abandoned in favor of running a full on-device BCC toolchain via the adeb package. If a device is sufficiently powerful to run Android and Java, it can probably also fit BCC/LLVM/python on it. Even though the implementation is somewhat incomplete (communication is done via the Android USB Debug Bridge or as a local process, not via a generic transport layer), the design is interesting and someone with sufficient time and resources could pick it up and merge it, continuing the PR work put on hold.
In a nutshell, BPFd is a daemon which runs on embedded devices, acting as a remote procedure call (RPC) interface for the local kernel/libbpf. Python runs on a host machine, calling BCC to compile/deploy eBPF bytecode and to create/read from maps via BPFd. The main selling point of BPFd is that all the BCC infrastructure and scripts work without having to install BCC, LLVM or python on the target device: the BPFd binary is only around 100 KB and depends only on libc.
The ply project implements a high level domain-specific language very similar to BPFtrace (inspired by AWK and C), with the explicit purpose of keeping runtime dependencies to a minimum. It only depends on a modern libc (doesn't have to be the GNU libc) and shell (sh-compatible). Ply itself implements an eBPF compiler and needs to be built against the target device kernel headers, then deployed to the target device as a single binary library and shell wrapper.
To illustrate the point of ply, let's compare the BPFtrace example from part 3 to its ply equivalent.
bpftrace -e 'tracepoint:raw_syscalls:sys_enter { @[pid, comm] = count(); }'
ply 'tracepoint:raw_syscalls/sys_enter { @[pid, comm] = count(); }'
Ply is still under heavy development (the recent v2.0 release was a complete rewrite) and its language is neither stable nor documented beyond a few examples. It is not as powerful as full BCC and doesn't have feature parity with BPFtrace yet, but it is still very useful for quickly debugging a remote embedded device via ssh or a serial console.
Gobpf and its merged subprojects (goebpf, gobpf-elf-loader), part of the bigger IOVisor project, provide Golang bindings for BCC. The eBPF kernel logic is still written in "restricted C" to be compiled by LLVM and only the standard python/lua userspace scripts are replaced by Go. What makes this project interesting for embedded devices is its eBPF elf loading module, which can be cross-compiled and run standalone on embedded devices to load and interact with the in-kernel eBPF programs.
It is important to note that the Go loader can be written to be generic (we'll see this in action shortly), so it can load and run any eBPF bytecode and be reused locally for multiple different tracing sessions.
Working with gobpf is painful, mostly due to the lack of documentation. The best "documentation" at this time is the tcptracer source, which is quite complex (they use kprobes without depending on a specific kernel version!), but a lot can be learned from it. Gobpf itself is also a work in progress: while the elf loader is fairly complete and supports loading eBPF ELF objects with sockets, (k|u)probes, tracepoints, perf events and so on, the BCC Go bindings module doesn't easily support all those features yet. For example, even though you can write a socket filter eBPF program, compile it and load it into the kernel, you still can't interact with the eBPF program from Go userspace as easily as from BCC's python, whose API is much more mature and user-friendly. In any case, gobpf is still in better shape than other projects with similar goals.
Let's study a simple example to illustrate how gobpf works. First we'll run it on a local x86_64 machine, then cross-compile and run it on a 32-bit ARMv7 board like the popular Beaglebone or Raspberry Pi. We have the following filesystem tree:
$ find . -type f
./src/open-example.go
./src/open-example.c
./Makefile
open-example.go: This is the eBPF ELF loader built on top of gobpf/elf. It takes the compiled "restricted C" ELF object as argument, loads it in the kernel and lets it run until the loader process is killed, at which point the kernel automatically unloads the eBPF logic. We intentionally keep the loader simple and generic (it loads any probes it finds in an object file) so it can be reused. More complex logic can be added here by using the gobpf bindings module.
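package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"

	"github.com/iovisor/gobpf/elf"
)

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintf(os.Stderr, "Usage: %s <ebpf-object.o>\n", os.Args[0])
		os.Exit(1)
	}

	// Parse the ELF object and load the eBPF programs it contains.
	mod := elf.NewModule(os.Args[1])
	if err := mod.Load(nil); err != nil {
		fmt.Fprintf(os.Stderr, "Error loading '%s' ebpf object: %v\n", os.Args[1], err)
		os.Exit(1)
	}

	// Attach every kprobe found in the object file.
	if err := mod.EnableKprobes(0); err != nil {
		fmt.Fprintf(os.Stderr, "Error loading kprobes: %v\n", err)
		os.Exit(1)
	}

	// Block until interrupted; the kernel unloads the eBPF logic when we exit.
	// SIGKILL (os.Kill) cannot be caught, so we wait for SIGINT/SIGTERM instead.
	sig := make(chan os.Signal, 1)
	signal.Notify(sig, os.Interrupt, syscall.SIGTERM)
	<-sig
}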
open-example.c: This is the "restricted C" source which the above loader inserts in the kernel. It hooks do_sys_open and prints the process command, PID, core number, opened filename and a timestamp to the trace ringbuffer, following the ftrace format (see the "Output format" section of the ftrace documentation). The filename to be opened is passed as the second argument of the do_sys_open call, which can be accessed from the context structure representing the CPU registers at function entry.
#include <uapi/linux/bpf.h>
#include <uapi/linux/ptrace.h>
#include <bpf/bpf_helpers.h>

SEC("kprobe/do_sys_open")
int kprobe__do_sys_open(struct pt_regs *ctx)
{
	char file_name[256];

	bpf_probe_read(file_name, sizeof(file_name), PT_REGS_PARM2(ctx));

	char fmt[] = "file %s\n";
	bpf_trace_printk(fmt, sizeof(fmt), &file_name);

	return 0;
}

char _license[] SEC("license") = "GPL";
__u32 _version SEC("version") = 0xFFFFFFFE;
In the above code we define specific "SEC" sections so the gobpf loader knows where to look and what to load. In our case the sections are kprobe/do_sys_open, license and version. The special 0xFFFFFFFE value tells the loader that this eBPF program is compatible with any kernel version: the chances of the open syscall changing are close to zero, because changing it would break userspace.
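To make the "version" section a bit more concrete, here is a minimal sketch of how a kernel release string maps to the numeric version code that such a section would normally carry. It assumes the usual KERNEL_VERSION(a, b, c) encoding from the kernel headers and reads the release from /proc/sys/kernel/osrelease; the function names kernelVersionCode and parseRelease are ours and are not part of gobpf.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// kernelVersionCode mirrors the kernel's KERNEL_VERSION(a, b, c) macro.
// The patch level is clamped to 255, as recent kernel headers do.
func kernelVersionCode(major, minor, patch uint32) uint32 {
	if patch > 255 {
		patch = 255
	}
	return major<<16 | minor<<8 | patch
}

// parseRelease converts a release string such as "5.2.0-rc1" into a
// version code. Anything after the leading digits and dots is ignored.
func parseRelease(release string) (uint32, error) {
	end := strings.IndexFunc(release, func(r rune) bool {
		return r != '.' && (r < '0' || r > '9')
	})
	if end >= 0 {
		release = release[:end]
	}
	parts := strings.Split(strings.Trim(release, "."), ".")
	if len(parts) < 2 {
		return 0, fmt.Errorf("unexpected kernel release %q", release)
	}
	var nums [3]uint32 // a missing patch level stays 0
	for i := 0; i < len(parts) && i < 3; i++ {
		n, err := strconv.ParseUint(parts[i], 10, 32)
		if err != nil {
			return 0, fmt.Errorf("bad version component %q: %w", parts[i], err)
		}
		nums[i] = uint32(n)
	}
	return kernelVersionCode(nums[0], nums[1], nums[2]), nil
}

func main() {
	raw, err := os.ReadFile("/proc/sys/kernel/osrelease")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read kernel release:", err)
		return
	}
	code, err := parseRelease(string(raw))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Printf("kernel %s has version code %#x\n", strings.TrimSpace(string(raw)), code)
}
```

A program built with a fixed version code would only match kernels reporting exactly that value, which is why a wildcard like 0xFFFFFFFE is convenient for a probe we expect to work everywhere.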
Makefile: This is the build logic for the above two files. Notice how we added "arch/x86/..." to the include path; on ARM it will be "arch/arm/...".
SHELL=/bin/bash -o pipefail
LINUX_SRC_ROOT="/home/adi/workspace/linux"
FILENAME="open-example"

ebpf-build: clean go-build
	clang \
		-D__KERNEL__ -fno-stack-protector -Wno-int-conversion \
		-O2 -emit-llvm -c "src/${FILENAME}.c" \
		-I ${LINUX_SRC_ROOT}/include \
		-I ${LINUX_SRC_ROOT}/tools/testing/selftests \
		-I ${LINUX_SRC_ROOT}/arch/x86/include \
		-o - | llc -march=bpf -filetype=obj -o "${FILENAME}.o"

go-build:
	go build -o ${FILENAME} src/${FILENAME}.go

clean:
	rm -f ${FILENAME}*
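Running the above makefile produces two new files in the current directory:
open-example: This is the compiled src/open-example.go loader.
open-example.o: This is the compiled eBPF bytecode which will be loaded in the kernel.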
The "open-example" and "open-example.o" ELF binaries can be further combined into one; the loader can include the eBPF binary as an asset, or it can store it directly as a byte array in its source code like tcptracer does. Doing this is, however, outside the scope of our article.
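For the curious, the byte-array approach can be approximated with a tiny generator. The following is only a sketch under our own assumptions, not how tcptracer actually does it: it reads an object file and prints Go source declaring a []byte variable. The names goByteArray and ebpfObject are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// goByteArray renders data as Go source that declares a []byte variable
// called name in package pkg, twelve bytes per line.
func goByteArray(pkg, name string, data []byte) string {
	var b strings.Builder
	fmt.Fprintf(&b, "package %s\n\nvar %s = []byte{", pkg, name)
	for i, v := range data {
		if i%12 == 0 {
			b.WriteString("\n\t")
		} else {
			b.WriteString(" ")
		}
		fmt.Fprintf(&b, "0x%02x,", v)
	}
	b.WriteString("\n}\n")
	return b.String()
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintf(os.Stderr, "Usage: %s <ebpf-object.o>\n", os.Args[0])
		return
	}
	data, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read object file:", err)
		return
	}
	// Redirect the output to a .go file that sits next to the loader source.
	fmt.Print(goByteArray("main", "ebpfObject", data))
}
```

The loader could then wrap the generated slice in a bytes.Reader and pass it to a reader-based constructor, assuming the gobpf version in use provides one, instead of taking a file path on the command line.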
Running the example yields the following output (see the "Output format" section in the ftrace documentation):
# (./open-example open-example.o &) && cat /sys/kernel/debug/tracing/trace_pipe
electron-17494 [007] ...3 163158.937350: 0: file /proc/self/maps
systemd-1 [005] ...3 163160.120796: 0: file /proc/29261/cgroup
emacs-596 [006] ...3 163163.501746: 0: file /home/adi/
(...)
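If the trace output needs further processing, a small frontend can split each line into its fields. The sketch below is an illustration rather than part of the original example: it assumes lines shaped like the ones above (command-PID, CPU in brackets, flags, timestamp, message), and the traceEvent type and parseTraceLine function are names we made up.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"regexp"
	"strconv"
)

// traceEvent holds the fields of one trace_pipe line.
type traceEvent struct {
	Comm      string // process command name
	PID       int
	CPU       int
	Flags     string // irq/preempt flags column, e.g. "...3" or "d..2"
	Timestamp string // seconds since boot, kept as text to avoid rounding
	Message   string // everything after the timestamp, e.g. "0: file /etc/..."
}

// The command name may itself contain '-', so the greedy (.+) makes the
// PID the last "-<digits>" group before the "[cpu]" column.
var traceLineRe = regexp.MustCompile(
	`^\s*(.+)-(\d+)\s+\[(\d+)\]\s+(\S+)\s+(\d+\.\d+):\s+(.*)$`)

// parseTraceLine returns false for lines that do not look like trace events.
func parseTraceLine(line string) (traceEvent, bool) {
	m := traceLineRe.FindStringSubmatch(line)
	if m == nil {
		return traceEvent{}, false
	}
	pid, err := strconv.Atoi(m[2])
	if err != nil {
		return traceEvent{}, false
	}
	cpu, err := strconv.Atoi(m[3])
	if err != nil {
		return traceEvent{}, false
	}
	return traceEvent{
		Comm: m[1], PID: pid, CPU: cpu,
		Flags: m[4], Timestamp: m[5], Message: m[6],
	}, true
}

func main() {
	// Read trace lines from stdin and print the parsed fields.
	scanner := bufio.NewScanner(os.Stdin)
	for scanner.Scan() {
		if ev, ok := parseTraceLine(scanner.Text()); ok {
			fmt.Printf("pid=%d comm=%s cpu=%d msg=%q\n", ev.PID, ev.Comm, ev.CPU, ev.Message)
		}
	}
}
```

It can be fed by piping cat /sys/kernel/debug/tracing/trace_pipe into the compiled program.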
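Reusing the terminology we defined in part 3 of this article series, our eBPF program has the following components:
The backend: This is the open-example.o ELF object. It writes data to the kernel trace ringbuffer.
The loader: This is the compiled open-example binary, containing the gobpf/elf loader module. As long as it runs, data is added to the trace buffer.
The frontend: This is cat /sys/kernel/debug/tracing/trace_pipe.
The data structure: This is the kernel trace ringbuffer.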
Now to cross compile our example for 32bit ARMv7. Based on what kernel version your ARM device is running:
The new makefile tells LLVM/Clang that we're targeting an ARMv7 device with our eBPF bytecode and that it should use the 32-bit eBPF VM subregister address mode, so the VM can correctly access the 32-bit addressed memory supplied by the native processor (remember from part 2 that all eBPF VM registers are 64 bits wide by default in all implementations). It also sets the proper include paths and instructs the Go compiler to use the correct cross-compile settings. A pre-existing cross-compiler toolchain is required before running this makefile; the CC variable points to it.
SHELL=/bin/bash -o pipefail
LINUX_SRC_ROOT="/home/adi/workspace/linux"
FILENAME="open-example"

ebpf-build: clean go-build
	clang \
		--target=armv7a-linux-gnueabihf \
		-D__KERNEL__ -fno-stack-protector -Wno-int-conversion \
		-O2 -emit-llvm -c "src/${FILENAME}.c" \
		-I ${LINUX_SRC_ROOT}/include \
		-I ${LINUX_SRC_ROOT}/tools/testing/selftests \
		-I ${LINUX_SRC_ROOT}/arch/arm/include \
		-o - | llc -march=bpf -filetype=obj -o "${FILENAME}.o"

go-build:
	GOOS=linux GOARCH=arm CGO_ENABLED=1 CC=arm-linux-gnueabihf-gcc \
		go build -o ${FILENAME} src/${FILENAME}.go

clean:
	rm -f ${FILENAME}*
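The only architecture-specific pieces in the two makefiles are the clang target, the arch/... include directory and the Go cross-compile variables. As a rough sketch of how build tooling could derive the include directory from a GOARCH value, consider the helper below. The mapping table and the kernelArchInclude function are our own illustration and cover only a handful of architectures.

```go
package main

import (
	"fmt"
	"path"
	"runtime"
)

// kernelArchDirs maps Go architecture names to the matching directory
// under arch/ in the kernel source tree. Only a few common ones are listed.
var kernelArchDirs = map[string]string{
	"amd64":   "x86",
	"386":     "x86",
	"arm":     "arm",
	"arm64":   "arm64",
	"riscv64": "riscv",
	"ppc64le": "powerpc",
	"s390x":   "s390",
}

// kernelArchInclude returns the arch-specific include directory to pass
// to clang with -I when building eBPF bytecode for the given GOARCH.
func kernelArchInclude(srcRoot, goarch string) (string, error) {
	dir, ok := kernelArchDirs[goarch]
	if !ok {
		return "", fmt.Errorf("no kernel arch directory known for GOARCH %q", goarch)
	}
	return path.Join(srcRoot, "arch", dir, "include"), nil
}

func main() {
	// Print the include path for the build host and for the ARM target.
	for _, arch := range []string{runtime.GOARCH, "arm"} {
		inc, err := kernelArchInclude("/home/adi/workspace/linux", arch)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("GOARCH=%s -> -I %s\n", arch, inc)
	}
}
```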
Run the new makefile and verify that the produced binaries have been cross-compiled correctly:
[adi@iwork]$ file open-example*
open-example:   ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), dynamically linked, interpreter (...), stripped
open-example.o: ELF 64-bit LSB relocatable, *unknown arch 0xf7* version 1 (SYSV), not stripped
Then copy the loader and bytecode to your device and run them with the same command we used above on the x86_64 host. Remember, the loader can be reused for running different traces by just modifying and recompiling the C eBPF code.
[root@ionelpi adi]# (./open-example open-example.o &) && cat /sys/kernel/debug/tracing/trace_pipe
ls-380 [001] d..2 203.410986: 0: file /etc/ld-musl-armhf.path
ls-380 [001] d..2 203.411064: 0: file /usr/lib/libcap.so.2
ls-380 [001] d..2 203.411922: 0: file /
zcat-397 [002] d..2 432.676010: 0: file /etc/ld-musl-armhf.path
zcat-397 [002] d..2 432.676237: 0: file /usr/lib/libtinfo.so.5
zcat-397 [002] d..2 432.679431: 0: file /usr/bin/zcat
gzip-397 [002] d..2 432.693428: 0: file /proc/
gzip-397 [002] d..2 432.693633: 0: file config.gz
Since the loader and bytecode together are only about 2 MB, this is quite a nice way of running eBPF on embedded devices without requiring a full BCC/LLVM installation.
In this fourth part of the series we took a look at projects in the eBPF ecosystem which can be used for running eBPF on small embedded devices. Unfortunately, working with these projects is hard at this point in time: they are abandoned or lacking manpower, in early development when everything changes, or missing essential documentation, which forces users to dive deep into the source code and figure things out themselves. As we have seen, gobpf is the most capable BCC/python replacement and ply is a promising BPFtrace alternative with a minimal footprint. With some more work put into these projects to ease users' lives, the power of eBPF can be used on resource-constrained embedded devices without having to port or install the entire BCC/LLVM/python/Hover stack.
Continue reading (An eBPF overview, part 5: Tracing user processes)…
Comments (2)
SamSamy:
Jun 08, 2019 at 03:01 PM
Hi Adrian,
Thanks for writing a detailed series on eBPF. It was very informative.
For running on the embedded platform, why can't we just use the native C language option using libbpf to load the elf file rather than using "Level one" you mentioned in part 3? Usually embedded platform builds come with its own build framework(buildroot, openwrt, yocto, etc) to cross compile from a host x86 machine which we can compile any program with proper kernel header files and dependencies of the embedded platforms.
Thanks
Adrian:
Jun 10, 2019 at 07:11 PM
Hi SamSammy,
Thank you for the excellent question! Indeed, at least in theory, the native compilation route can be used exclusively as you suggest to avoid creating and maintaining these additional projects, domain-specific languages, communication protocols and abstraction layers in general. In my opinion however, in practice there are factors which drive the ecosystem in the current, different direction:
1. Yocto, openwrt, buildroot (and so on) are niche projects and even though they are very powerful and awesome, they can be complex/hard to understand and learn. Relatively few developers use them outside the embedded Linux space and we have to keep in mind that eBPF is mainly developed and maintained by people/companies outside embedded who also spend their efforts making eBPF appeal to the widest possible audience of users.
2. eBPF is only supported via the LLVM toolchain & the clang compiler frontend (this might change with GCC 10, but the majority of eBPF users will still continue to use LLVM) while most distributions and meta-build-systems are GCC-based. It is true that there are sub-projects like meta-clang for Yocto which can build using LLVM/Clang, but keep in mind that by using such a thing we're talking of a niche inside another niche and some projects like the linux kernel which are important in embedded distributions simply can't be built yet with clang easily.
3. There is this general trend in programming by which developers turn away from C towards higher level languages like python (which the eBPF Compiler Collection also uses).
I think you see where I'm going with this: intersecting the group of LLVM/eBPF interested users with meta-build-system users with developers interested in writing C instead of something higher level will give you a very small group of people. Now intersect this small group with those who have time/resources to develop a solution in this direction and we get almost nobody, so I think it's just uncharted territory waiting for a pioneer.
Adrian