Pre-Requisites
System calls, eBPF, C, basics of low-level programming.
Introduction
eBPF (Extended Berkeley Packet Filter) is a technology that allows users to run custom programs within the kernel. Its predecessor, BPF (also known as cBPF, or classic BPF), provided a simple and efficient way to filter packets based on predefined rules. Compared to kernel modules, eBPF programs offer enhanced safety, portability, and maintainability. There are several high-level ways to work with eBPF programs, such as Cilium’s Go library, bpftrace, and libbpf.
Note: This post requires the reader to have a basic understanding of eBPF. If you’re not familiar with it, this post by ebpf.io is a great read.
Objectives
You must already be familiar with the famous tool strace. We’ll be developing something similar to that using eBPF. For example:
./beetrace /bin/ls
Concepts
Before we start writing our tool, we need to familiarize ourselves with some key concepts.
Tracepoints: They are instrumentation points placed in various parts of the Linux kernel code. They provide a way to hook into specific events or code paths within the kernel without modifying the kernel source code. The events available for tracing can be found at /sys/kernel/debug/tracing/events.
The SEC macro: It places the program in a section of the target ELF named after the tracepoint. For example, SEC("tracepoint/raw_syscalls/sys_enter") creates a new section with this name. The sections can be viewed using readelf.
readelf -S --wide somefile.o
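To make the effect of SEC concrete, here is a minimal sketch that can be compiled with an ordinary C compiler on Linux. The macro below is a simplified stand-in for the one libbpf ships in bpf/bpf_helpers.h, and handle_sys_enter is a hypothetical name chosen for illustration; a real eBPF program would be compiled with clang for the BPF target instead.

```c
/* Simplified stand-in for libbpf's SEC macro: put the annotated symbol
 * into the named ELF section and keep it even if nothing references it. */
#define SEC(name) __attribute__((section(name), used))

/* Hypothetical handler. In a real eBPF object the argument would be the
 * tracepoint context and the file would be built with clang -target bpf. */
SEC("tracepoint/raw_syscalls/sys_enter")
int handle_sys_enter(void *ctx)
{
    (void)ctx;   /* unused in this sketch */
    return 0;
}
```

Compiling this file with the -c flag and running readelf -S --wide on the resulting object should list a section named tracepoint/raw_syscalls/sys_enter alongside the usual .text and .data sections.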
Maps: They are shared data structures that can be accessed from both eBPF programs and applications running in the userspace.
Writing the eBPF programs
We won’t be writing a comprehensive tool for tracing all the system calls due to the vast number of system calls present in the Linux kernel. Instead, we’ll focus on tracing a few common system calls. To achieve this, we’ll write two types of programs: eBPF programs and a loader (which loads the BPF objects into the kernel and attaches them).
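Restricting the tool to a handful of system calls means the userspace side only needs a small lookup table from syscall number to name. The sketch below is one possible way to do that; the table contents and the names syscall_table and syscall_name are assumptions made for illustration, not taken from the original code.

```c
#include <stddef.h>
#include <sys/syscall.h>   /* SYS_* constants for the build architecture */

/* Hypothetical table of the few system calls the tracer knows about. */
struct syscall_entry {
    long nr;
    const char *name;
};

static const struct syscall_entry syscall_table[] = {
    { SYS_read,       "read" },
    { SYS_write,      "write" },
    { SYS_openat,     "openat" },
    { SYS_close,      "close" },
    { SYS_execve,     "execve" },
    { SYS_exit_group, "exit_group" },
};

/* Return the name for a syscall number, or NULL if it is not in the table. */
static const char *syscall_name(long nr)
{
    size_t count = sizeof(syscall_table) / sizeof(syscall_table[0]);

    for (size_t i = 0; i < count; i++) {
        if (syscall_table[i].nr == nr)
            return syscall_table[i].name;
    }
    return NULL;
}
```

Using the SYS_* constants rather than hard-coded numbers keeps the table valid across architectures, since syscall numbers differ between, for example, x86_64 and aarch64.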
Let’s start by creating a few data structures to set things up.
// controller.h
The loader will read the path of the ELF file to be traced, which will be provided by the user as a command line argument. Then, the loader will spawn a child process and use execve to run the program specified in the command line argument.
The parent process will handle all the necessary setup for loading and attaching the eBPF programs. It also performs the crucial task of sending the child process’s ID to the eBPF program via the BPF hashmap.
// loader.c
To trace system calls, we need to write eBPF programs that are triggered by the tracepoint/raw_syscalls/sys_enter and tracepoint/raw_syscalls/sys_exit tracepoints. These tracepoints provide access to the system call number and arguments. For a given system call, the tracepoint/raw_syscalls/sys_enter tracepoint is always triggered before the tracepoint/raw_syscalls/sys_exit tracepoint. We can use the former to retrieve the system call arguments and the latter to obtain the return value. Additionally, we will use eBPF maps to share information between the user-space program and our eBPF programs. Specifically, we will use two types of eBPF maps: hashmaps and ring buffers.
// controller.c
Having defined the maps, we’re ready to write the programs. Let’s start by writing the program for the tracepoint tracepoint/raw_syscalls/sys_enter.
// loader.c
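It can help to picture the record that the two raw_syscalls tracepoints hand to a program. The structs below are hypothetical userspace mirrors of the layout reported by the format files under /sys/kernel/debug/tracing/events/raw_syscalls/ on a 64-bit kernel; in an actual eBPF program the corresponding kernel types would normally come from vmlinux.h rather than being declared by hand.

```c
#include <stddef.h>

/* Common header that precedes every tracepoint record (8 bytes). */
struct common_fields {
    unsigned short type;
    unsigned char  flags;
    unsigned char  preempt_count;
    int            pid;
};

/* Hypothetical mirror of the sys_enter record: syscall number plus
 * up to six raw arguments. */
struct sys_enter_record {
    struct common_fields common;
    long          id;        /* system call number */
    unsigned long args[6];   /* raw argument registers */
};

/* Hypothetical mirror of the sys_exit record: syscall number plus
 * the raw return value. */
struct sys_exit_record {
    struct common_fields common;
    long id;    /* system call number */
    long ret;   /* return value as seen by the kernel */
};
```

On a 64-bit kernel, id sits at offset 8 in both records, args starts at offset 16 in the sys_enter record, and ret sits at offset 16 in the sys_exit record.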
Similarly, we can write the program for reading the return value and sending it to userland.
// controller.c
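One detail worth keeping in mind on the userspace side: the return value seen at sys_exit is the raw kernel value, so a failed call shows up as a negative errno (a value between -4095 and -1) rather than as -1 with errno set, which is what libc presents to applications. A small sketch of how that could be decoded for display follows; format_ret is a hypothetical helper and the output format is only loosely modelled on strace.

```c
#include <stdio.h>
#include <string.h>

/* Render a raw sys_exit return value into buf.
 * Values in [-4095, -1] are treated as negated errno codes. */
static int format_ret(long ret, char *buf, size_t len)
{
    if (ret < 0 && ret >= -4095) {
        return snprintf(buf, len, "= -1 (errno %ld: %s)",
                        -ret, strerror((int)-ret));
    }
    return snprintf(buf, len, "= %ld", ret);
}
```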
Let’s now finalize the functionality for the parent process in the loader program. Before doing that, we need to understand how some key functions work.
bpf_object__open: Creates a bpf_object by opening the BPF ELF object file pointed to by the passed path and loading it into memory.
LIBBPF_API struct bpf_object *bpf_object__open(const char *path);
bpf_object__load: Loads the BPF object into the kernel.
LIBBPF_API int bpf_object__load(struct bpf_object *obj);
bpf_object__find_program_by_name: Returns a pointer to a valid BPF program.
LIBBPF_API struct bpf_program *bpf_object__find_program_by_name(const struct bpf_object *obj, const char *name);
bpf_program__attach: Attaches a BPF program based on auto-detection of program type, attach type, and extra parameters, where applicable.
LIBBPF_API struct bpf_link *bpf_program__attach(const struct bpf_program *prog);
bpf_map__update_elem: Inserts or updates the value in a BPF map that corresponds to the provided key.
LIBBPF_API int bpf_map__update_elem(const struct bpf_map *map, const void *key, size_t key_sz, const void *value, size_t value_sz, __u64 flags);
bpf_object__find_map_fd_by_name: Given a BPF map name, it returns a file descriptor to it.
LIBBPF_API int bpf_object__find_map_fd_by_name(const struct bpf_object *obj, const char *name);
ring_buffer__new: Returns a pointer to the ring buffer.
LIBBPF_API struct ring_buffer *ring_buffer__new(int map_fd, ring_buffer_sample_fn sample_cb, void *ctx, const struct ring_buffer_opts *opts);
The second argument must be a function which can be used for handling the data received from the ring buffer.
bool initialized = false;
It prints the name and arguments of the system calls.
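As a rough illustration of what such a callback looks like, the sketch below uses the same signature as libbpf's ring_buffer_sample_fn, which takes a context pointer, a pointer to the sample, and the sample size. The struct syscall_event layout and the helper format_event are assumptions made for this example; the real event layout is whatever controller.h defines.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical event layout shared between the eBPF side and the loader. */
struct syscall_event {
    int           pid;
    int           is_exit;   /* 0 = sys_enter, 1 = sys_exit */
    long          id;        /* system call number */
    unsigned long args[6];   /* valid for sys_enter events */
    long          ret;       /* valid for sys_exit events */
};

/* Format one event into buf; kept separate from printing so it is easy to test. */
static int format_event(const struct syscall_event *e, char *buf, size_t len)
{
    if (e->is_exit)
        return snprintf(buf, len, "[%d] syscall %ld = %ld", e->pid, e->id, e->ret);

    return snprintf(buf, len, "[%d] syscall %ld(0x%lx, 0x%lx, 0x%lx)",
                    e->pid, e->id, e->args[0], e->args[1], e->args[2]);
}

/* Same shape as ring_buffer_sample_fn: int (*)(void *ctx, void *data, size_t size). */
static int handle_event(void *ctx, void *data, size_t size)
{
    char line[128];

    (void)ctx;   /* no per-buffer context in this sketch */
    if (size < sizeof(struct syscall_event))
        return 0;   /* ignore records that are too short to be an event */

    format_event(data, line, sizeof(line));
    puts(line);
    return 0;
}
```

A function like handle_event would be passed as the second argument to ring_buffer__new, and libbpf would then invoke it once per record whenever the buffer is consumed.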
ring_buffer__consume: It processes the available events in the ring buffer.
LIBBPF_API int ring_buffer__consume(struct ring_buffer *rb);
We now have everything needed to write the loader.
// loader.c
And, here are the eBPF programs. The C code will be compiled into a single object file.
// controller.c
Before compiling, we can create a test program which will be traced by our tool.
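Any small program that issues a few recognizable system calls will do as a tracing target. The sketch below is a hypothetical stand-in, not the original test program: it opens /dev/null, writes a short message to it, and closes it again, which should surface as openat, write, and close in the trace on a typical glibc system. The function name exercise_syscalls is an assumption; a standalone test binary would simply call it from main.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Issue a small, predictable sequence of system calls.
 * Returns 0 if every call succeeded, -1 otherwise. */
static int exercise_syscalls(void)
{
    static const char msg[] = "hello from the traced program\n";

    int fd = open("/dev/null", O_WRONLY);   /* usually an openat syscall */
    if (fd < 0)
        return -1;

    ssize_t written = write(fd, msg, sizeof(msg) - 1);

    if (close(fd) < 0)
        return -1;

    return written == (ssize_t)(sizeof(msg) - 1) ? 0 : -1;
}
```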
The following Makefile can be used to compile all the stuff.
compile:
Now let’s execute the loader with root privileges.
sudo ./beetrace ./test
The entire code can be found in this GitHub repository.
References: