Reputation: 1
I am developing an eBPF program. Sometimes I cannot get the program name when the traced application calls execve, but I can when it uses execv or syscall(SYS_execve, ...). The specific code is as follows:
static u32 ebpf_getppid(void)
{
    struct task_struct *task = (struct task_struct *)bpf_get_current_task();
    struct task_struct *parent = (struct task_struct *)BPF_CORE_READ(task, real_parent);
    return BPF_CORE_READ(parent, tgid);
}
SEC("tp/syscalls/sys_enter_execve")
int tracepoint__syscalls__sys_enter_execve(struct trace_event_raw_sys_enter *ctx)
{
    struct epm_command command = {};
    const char *filename = (const char *)BPF_CORE_READ(ctx, args[0]);
    const unsigned long *argv_ptr = (const unsigned long *)BPF_CORE_READ(ctx, args[1]);
    const unsigned long *envp_ptr = (const unsigned long *)BPF_CORE_READ(ctx, args[2]);
    char temp[128] = {0};

    for (int i = 0; i < 4; i++) {
        bpf_printk("args[%d]: 0x%lx\n", i, BPF_CORE_READ(ctx, args[i]));
    }

    command.process_id = ebpf_getppid();
    command.timestamp = bpf_ktime_get_ns();
    bpf_get_current_comm(&command.process_name, sizeof(command.process_name));
    bpf_probe_read_str(&command.call_prog_name, sizeof(command.call_prog_name), filename);
    bpf_printk("Parent Process name: %s\n", command.process_name);
    bpf_printk("Call Process name: %s\n", command.call_prog_name);

    for (int i = 0; i < 64; i++) {
        unsigned long arg_ptr = 0;
        __builtin_memset(temp, 0, sizeof(temp));
        bpf_probe_read_str(&arg_ptr, sizeof(arg_ptr), &argv_ptr[i]);
        if (arg_ptr == 0) {
            break;
        }
        bpf_probe_read_str(temp, sizeof(temp), (void *)arg_ptr);
        bpf_printk("argv[%d]: %s\n", i, temp);
    }

    for (int i = 0; i < 64; i++) {
        unsigned long env_ptr = 0;
        __builtin_memset(temp, 0, sizeof(temp));
        bpf_probe_read_str(&env_ptr, sizeof(env_ptr), &envp_ptr[i]);
        if (env_ptr == 0) {
            break;
        }
        bpf_probe_read_str(temp, sizeof(temp), (void *)env_ptr);
        bpf_printk("envp[%d]: %s\n", i, temp);
    }

    bpf_map_update_elem(&epm_execve_map, &command.process_id, &command, BPF_ANY);
    return 0;
}
#include <unistd.h>

int main(void) {
    char *args[] = {"/usr/bin/ls", "-l", NULL, NULL};
    char *envp[] = {NULL};
    execve("/usr/bin/ls", args, envp);
    return 0;
}
#include <stdio.h>
#include <unistd.h>

int main(void) {
    char *args[] = {"/usr/bin/ls", "-l", NULL, NULL};
    char *envp[] = {NULL};
    printf("args addr: %p\n", args);
    printf("envp addr: %p\n", envp);
    execve("/usr/bin/ls", args, envp);
    return 0;
}
The only difference between the two application-level programs is that the second adds printf calls that print the addresses of args and envp. What is the specific reason for the difference in behavior?
I hope to get the correct answer to the above-described question
Upvotes: 0
Views: 39
Reputation: 760
The behavior you’re seeing is expected and comes from how string literals are paged into memory. When you initialize args with string literals (e.g., char *args[] = {"/usr/bin/ls", "-l", NULL, NULL}), the compiler embeds those strings in the binary, but the pages that hold them are not faulted into memory until they are first accessed. Your eBPF program, tracepoint__syscalls__sys_enter_execve, runs before that access happens, and BPF will not fault the page in on its own, so bpf_probe_read_str may fail with -EFAULT (-14), resulting in empty output.
When you add printf("args addr: %p\n", args), the program touches its constant data before calling execve: printf has to read its format string, which is also a string literal and most likely sits on the same page as "/usr/bin/ls". That access makes the kernel fault the page into RAM. Since memory is loaded in pages (not individual variables), the strings are already resident by the time your eBPF probe runs. This explains why adding printf "fixes" the issue.
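If you control the traced program and want the same effect without printf, one option is to read a byte of each string right before execve. The following is only a minimal sketch: prefault_strings is a hypothetical helper name (not part of any library), and it assumes each string lives entirely in the page that holds its first byte.

```c
#include <stddef.h>

/*
 * Hypothetical helper: read the first byte of every string in a
 * NULL-terminated pointer array so that the kernel faults in the
 * page holding it. Returns the number of strings touched.
 */
static size_t prefault_strings(char *const strv[])
{
    size_t touched = 0;

    for (size_t i = 0; strv[i] != NULL; i++) {
        /* volatile keeps the compiler from dropping the read */
        const volatile char *p = strv[i];
        char c = *p;
        (void)c;
        touched++;
    }
    return touched;
}
```

Calling prefault_strings(args) just before execve("/usr/bin/ls", args, envp) should make the strings resident in the same way the printf does. It does nothing for programs you cannot modify, where the probe still has to tolerate a failed read.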
This is a known behavior in eBPF tracing. As noted in this GitHub issue comment:
the data you're using isn't in memory yet. These static strings are compiled in and are not actually faulted into memory until they're accessed. The access won't happen until its read, which is after your bpftrace probe ran. BPF won't pull the data in so you get an EFAULT/-14.
By printing the values or just a random print of a constant string you pull the small amount of data into memory (as it goes by page, not by var) and then it works
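To make the "by page, not by var" point concrete, here is a small address-arithmetic sketch in user-space C. The names page_base and same_page are hypothetical helpers made up for illustration; the only real API used is POSIX sysconf(_SC_PAGESIZE), and the code assumes the page size is a power of two.

```c
#include <stdbool.h>
#include <stdint.h>
#include <unistd.h>

/* Start address of the page that contains addr. */
static uintptr_t page_base(const void *addr)
{
    uintptr_t page_size = (uintptr_t)sysconf(_SC_PAGESIZE);

    /* Clear the offset-within-page bits. */
    return (uintptr_t)addr & ~(page_size - 1);
}

/* True if both addresses fall in the same page. */
static bool same_page(const void *a, const void *b)
{
    return page_base(a) == page_base(b);
}
```

Passing the addresses of two string literals from the test program to same_page shows whether touching one can fault in the other. In a small binary they usually share a page, but nothing guarantees it.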
For a deeper dive, see this blog post which explores a similar case.
Upvotes: 2