kirelagin

Reputation: 13616

bpftrace doesn’t recognise a syscall argument as negative

Here is a simple bpftrace script:

#!/usr/bin/env bpftrace

tracepoint:syscalls:sys_enter_kill
{
  $tpid = args->pid;
  printf("%d %d %d\n", $tpid, $tpid < 0, $tpid >= 0);
}

It traces kill syscalls, prints the target PID and two additional values: whether it is negative, and whether it is non-negative.

Here is the output that I get:

# ./test.bt
Attaching 1 probe...
-1746 0 1
-2202 0 1
4160 0 1
4197 0 1
4197 0 1
-2202 0 1
-1746 0 1

Weirdly, the comparison operators treat both positive and negative PIDs as non-negative.

Just as a sanity check, if I replace the assignment line with:

  $tpid = -10;

what I get is exactly what I expect:

# ./test.bt
Attaching 1 probe...
-10 1 0
-10 1 0
-10 1 0

What am I doing wrong?

Upvotes: 1

Views: 695

Answers (2)

pchaigno

Reputation: 13113

As you've discovered, bpftrace assigns a u64 type to your $tpid variable. Yet, according to the tracepoint format description, args->pid should be of type pid_t, i.e. int.

# cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_kill/format
name: sys_enter_kill
ID: 185
format:
    field:unsigned short common_type;   offset:0;   size:2; signed:0;
    field:unsigned char common_flags;   offset:2;   size:1; signed:0;
    field:unsigned char common_preempt_count;   offset:3;   size:1; signed:0;
    field:int common_pid;   offset:4;   size:4; signed:1;

    field:int __syscall_nr; offset:8;   size:4; signed:1;
    field:pid_t pid;    offset:16;  size:8; signed:0;
    field:int sig;  offset:24;  size:8; signed:0;

print fmt: "pid: 0x%08lx, sig: 0x%08lx", ((unsigned long)(REC->pid)), ((unsigned long)(REC->sig))

The bpftrace function that assigns this type is TracepointFormatParser::adjust_integer_types(). This change was introduced by commit 42ce08f to address issue #124.

For the above tracepoint description, bpftrace generates the following structure:

struct _tracepoint_syscalls_sys_enter_kill
{
  unsigned short common_type;
  unsigned char common_flags;
  unsigned char common_preempt_count;
  int common_pid;
  int __syscall_nr;
  u64 pid;
  s64 sig;
};

It should probably generate the following instead:

struct _tracepoint_syscalls_sys_enter_kill
{
  unsigned short common_type;
  unsigned char common_flags;
  unsigned char common_preempt_count;
  int common_pid;
  int __syscall_nr;
  u32 pad1;
  pid_t pid;
  u32 pad2;
  int sig;
};

bpftrace seems to be confused by the size field, which doesn't match the declared type in the description above. All syscall arguments get size 8 (on 64-bit at least), but that doesn't mean all 8 bytes are used. I think it would be worth opening an issue on bpftrace.
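To make the mismatch concrete, here is a rough Python sketch (not bpftrace code) that packs a fake record using the offsets from the format file and then reads the pid slot both ways. The helper names and sample values are made up for illustration, and it assumes a little-endian machine and that the 32-bit argument sits zero-extended in its 8-byte slot.

```python
import struct

# Offsets taken from the format file: 0, 2, 3, 4, 8, then 8-byte slots at 16 and 24.
# "<" disables implicit alignment, so "4x" supplies the padding between 12 and 16.
RECORD_FMT = "<HBBii4xQQ"
PID_OFFSET = 16

def build_record(pid, sig):
    # Hypothetical helper; header values are arbitrary sample data.
    # Assumption: the 32-bit argument is stored zero-extended in its slot.
    return struct.pack(RECORD_FMT, 185, 0, 0, 1234, 62,
                       pid & 0xFFFFFFFF, sig & 0xFFFFFFFF)

def pid_as_u64(record):
    # Reads all 8 bytes as unsigned, like the generated "u64 pid" field.
    return struct.unpack_from("<Q", record, PID_OFFSET)[0]

def pid_as_int32(record):
    # Reads only the low 4 bytes as signed, like a pid_t would be.
    return struct.unpack_from("<i", record, PID_OFFSET)[0]

record = build_record(-1746, 15)
print(pid_as_u64(record), pid_as_u64(record) < 0)      # 4294965550 False
print(pid_as_int32(record), pid_as_int32(record) < 0)  # -1746 True
```

Read as an unsigned 64-bit value, the field can never compare as less than zero, whatever the upper bytes contain.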

Upvotes: 2

kirelagin

Reputation: 13616

There is something strange going on with integer types in bpftrace (see #554, #772, #834 for details).

It seems that in my case args->pid gets treated as a 64-bit value by default, even though it is actually a 32-bit one. So the solution is to explicitly cast it:

  $tpid = (int32)args->pid;

And now it works as expected:

# bpftrace test.bt
Attaching 1 probe...
-2202 1 0
-1746 1 0
-2202 1 0
4160 0 1
4197 0 1
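For anyone wondering what the cast changes, this is my understanding of it written out as a small Python sketch: keep the low 32 bits and reinterpret them as signed. The function names are hypothetical and the raw values are simply the 32-bit patterns of two PIDs from the output above; it is an illustration of the arithmetic, not of bpftrace internals.

```python
def to_int32(value):
    """Keep the low 32 bits and reinterpret them as a signed integer."""
    low = value & 0xFFFFFFFF
    return low - 0x100000000 if low & 0x80000000 else low

def report(raw):
    # Mirrors the three columns of the printf() in the script.
    tpid = to_int32(raw)
    return "%d %d %d" % (tpid, tpid < 0, tpid >= 0)

# Raw values as an unsigned 64-bit field would hold them (assumed zero-extended).
for raw in [(-2202) & 0xFFFFFFFF, 4160]:
    print(report(raw))
# -2202 1 0
# 4160 0 1
```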

Upvotes: 0
