Reputation: 1063
I'm trying to dynamically count the function calls and returns of a program at runtime on x86_64 (Intel syntax).
To do this I'm using ptrace (without PTRACE_SYSCALL): at each step I read the RIP register (which holds the address of the next instruction) and check the opcode found there. I know that a function CALL can be recognized when the lowest byte is equal to 0xE8 (according to the Intel documentation, or http://icube-avr.unistra.fr/fr/images/4/41/253666.pdf page 105).
I looked up each instruction on http://ref.x86asm.net/coder64.html. So in my program, each time I find 0xE8, 0x9A, 0xF1, etc., I count a function entry (CALL or INT instruction), and if it's 0xC2, 0xC3, etc., I count a function exit (RET instruction).
The goal is to do this for any program at runtime; I can't rely on access to the test program's compilation, on instrumentation, or on gcc's magic tools.
I wrote a small program that can be compiled with gcc -Wall -Wextra your_file.c and launched by typing ./a.out a_program.
Here is my code:
#include <sys/ptrace.h>
#include <sys/signal.h>
#include <sys/wait.h>
#include <sys/user.h>
#include <stdint.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>

typedef struct user_regs_struct reg_t;

static int8_t increase(pid_t pid, int32_t *status)
{
    if (WIFEXITED(*status) || WIFSIGNALED(*status))
        return (-1);
    if (WIFSTOPPED(*status) && (WSTOPSIG(*status) == SIGINT))
        return (-1);
    if (ptrace(PTRACE_SINGLESTEP, pid, NULL, NULL) == -1)
        return (-1);
    return (0);
}

int main(int argc, char *argv[])
{
    size_t pid = fork();
    long address_rip;
    uint16_t call = 0;
    uint16_t ret = 0;
    int32_t status;
    reg_t regs;

    if (!pid) {
        if ((status = ptrace(PTRACE_TRACEME, 0, NULL, NULL)) == -1)
            return (1);
        kill(getpid(), SIGSTOP);
        execvp(argv[1], argv + 1);
    } else {
        while (42) {
            waitpid(pid, &status, 0);
            ptrace(PTRACE_GETREGS, pid, NULL, &regs);
            address_rip = ptrace(PTRACE_PEEKDATA, pid, regs.rip, NULL);
            address_rip &= 0xFFFF;
            if ((address_rip & 0x00FF) == 0xC2 || (address_rip & 0x00FF) == 0xC3 ||
                (address_rip & 0x00FF) == 0xCA || (address_rip & 0x00FF) == 0xCB ||
                (address_rip & 0x00FF) == 0xCF)
                ret += 1;
            else if ((address_rip & 0x00FF) == 0xE8 || (address_rip & 0x00FF) == 0xF1 ||
                     (address_rip & 0x00FF) == 0x9A || (address_rip & 0x00FF) == 0xCC ||
                     (address_rip & 0x00FF) == 0xCD || (address_rip & 0x00FF) == 0xCF)
                call += 1;
            if (increase(pid, &status) == -1) {
                printf("call: %i\tret: %i\n", call, ret);
                return (0);
            }
        }
    }
    return (0);
}
When I run it on a_program (a custom program that simply enters a few local functions and makes some write syscalls; the goal is just to trace how many functions this program enters and leaves), no error occurs and it works fine, BUT I don't get the same number of CALLs and RETs.
Example:
user> ./a.out basic_program
call: 636 ret: 651
(The large number of calls and returns is caused by libc, which goes through a lot of functions before starting your program; see Parsing Call and Ret with ptrace.)
It looks as if my program performs more returns than function calls. I found that opcode 0xFF is used for CALL and CALLF (with an r/m64 or r/m16/m32 operand), but also for other instructions such as DEC, INC or JMP (which are very common).
So, how can I tell them apart? According to http://ref.x86asm.net/coder64.html, by the "opcode fields", but how do I find them?
If I add 0xFF into my condition:
else if ((address_rip & 0x00FF) == 0xE8 || (address_rip & 0x00FF) == 0xF1 ||
         (address_rip & 0x00FF) == 0x9A || (address_rip & 0x00FF) == 0xCC ||
         (address_rip & 0x00FF) == 0xCD || (address_rip & 0x00FF) == 0xCF ||
         (address_rip & 0x00FF) == 0xFF)
    call += 1;
If I launch it:
user> ./a.out basic_program
call: 1152 ret: 651
That seems normal to me, because it counts every JMP, DEC or INC, so I need to distinguish between the different 0xFF instructions. I tried it like this:
else if ((address_rip & 0x00FF) == 0xE8 || (address_rip & 0x00FF) == 0xF1 ||
         (address_rip & 0x00FF) == 0x9A || (address_rip & 0x00FF) == 0xCC ||
         (address_rip & 0x00FF) == 0xCD || (address_rip & 0x00FF) == 0xCF ||
         ((address_rip & 0x00FF) == 0xFF && ((address_rip & 0x0F00) == 0X02 ||
                                             (address_rip & 0X0F00) == 0X03)))
    call += 1;
But it gave me the same result. Am I wrong somewhere? How can I find the same number of call and ret?
Upvotes: 5
Views: 1587
Reputation: 39316
I would personally run the tracing one instruction "late", retaining rip and rsp from the previous step. For simplicity, let's say curr_rip and curr_rsp are the rip and rsp registers obtained from the most recent PTRACE_GETREGS, and prev_rip and prev_rsp those from the previous one.
If (curr_rip < prev_rip || curr_rip > prev_rip + 16), then the instruction pointer either went backwards, or forwards by more than the length of the longest valid instruction (x86 instructions are at most 15 bytes long, so 16 is a safe bound). If so, then:
If (curr_rsp > prev_rsp), the last instruction was a ret of some kind, because data was also popped off the stack.
If (curr_rsp < prev_rsp), the last instruction was a call of some kind, because data was also pushed to the stack.
If (curr_rsp == prev_rsp), the instruction was some sort of a jump; either an unconditional jump, or a branch.
In other words, you only need to inspect the instruction (of curr_rip - prev_rip bytes, which is between 1 and 16, inclusive) starting at prev_rip when (curr_rsp != prev_rsp && curr_rip > prev_rip && curr_rip <= prev_rip + 16). For this, I'd use Intel XED, but you are free to implement your own call/ret instruction recognizer, of course.
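The decision rule above can be sketched as a small pure function. This is only a sketch under stated assumptions: the names step_kind and classify_step are hypothetical, the 16-byte bound is the one used in the text, and a STEP_INSPECT result still has to be decoded (for example with Intel XED) to tell a short call/ret apart from a push/pop.

```c
#include <stdint.h>

/* Hypothetical names, introduced for illustration only. */
enum step_kind {
    STEP_LINEAR,   /* rip advanced normally, stack untouched            */
    STEP_JUMP,     /* control transfer without a stack change           */
    STEP_CALL,     /* control transfer, data pushed (rsp decreased)     */
    STEP_RET,      /* control transfer, data popped (rsp increased)     */
    STEP_INSPECT   /* rip advanced by 1..16 but rsp changed: decode it  */
};

#define MAX_STEP 16 /* bound used in the text above */

/* Classify the instruction that executed between two single-steps,
 * given the registers sampled before (prev_*) and after (curr_*). */
enum step_kind classify_step(uint64_t prev_rip, uint64_t prev_rsp,
                             uint64_t curr_rip, uint64_t curr_rsp)
{
    /* rip went backwards, or forwards by more than one instruction */
    int jumped = curr_rip < prev_rip || curr_rip > prev_rip + MAX_STEP;

    if (jumped) {
        if (curr_rsp > prev_rsp)
            return STEP_RET;
        if (curr_rsp < prev_rsp)
            return STEP_CALL;
        return STEP_JUMP;
    }
    /* rip moved forward by at most MAX_STEP bytes (or not at all) */
    if (curr_rip > prev_rip && curr_rsp != prev_rsp)
        return STEP_INSPECT;
    return STEP_LINEAR;
}
```

In a tracer loop, one would pass the rip and rsp fields of two consecutive PTRACE_GETREGS results to classify_step and increment the call or ret counter according to the result.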
Upvotes: 2
Reputation: 92986
Here is an example of how to program this. Since an x86 instruction can be up to 15 bytes long, two peeks (16 bytes) are needed to be sure of getting a complete instruction. Each peek reads 8 bytes, so you need to peek once at regs.rip and once 8 bytes later:
peek1 = ptrace(PTRACE_PEEKDATA, pid, regs.rip, NULL);
peek2 = ptrace(PTRACE_PEEKDATA, pid, regs.rip + sizeof(long), NULL);
Note that this code glosses over a lot of details about how prefixes are handled and detects a bunch of invalid instructions as function calls. Note further that if you want to use it for 32-bit code, it needs to be changed to incorporate some more CALL instructions and to remove the detection of REX prefixes:
int iscall(long peek1, long peek2)
{
    union {
        long longs[2];
        unsigned char bytes[16];
    } data;
    int opcode, reg;
    size_t offset;

    /* turn peeked longs into bytes */
    data.longs[0] = peek1;
    data.longs[1] = peek2;

    /* ignore relevant prefixes */
    for (offset = 0; offset < sizeof data.bytes &&
        ((data.bytes[offset] & 0xe7) == 0x26 /* cs, ds, ss, es override */
        || (data.bytes[offset] & 0xfc) == 0x64 /* fs, gs, addr32, data16 override */
        || (data.bytes[offset] & 0xf0) == 0x40); /* REX prefix */
        offset++)
        ;

    /* instruction is composed of all prefixes */
    if (offset > 15)
        return (0);

    opcode = data.bytes[offset];

    /* E8: CALL NEAR rel32? */
    if (opcode == 0xe8)
        return (1);

    /* sufficient space for modr/m byte? */
    if (offset > 14)
        return (0);

    reg = data.bytes[offset + 1] & 0070; /* modr/m byte, reg field */

    if (opcode == 0xff) {
        /* FF /2: CALL NEAR r/m64? */
        if (reg == 0020)
            return (1);
        /* FF /3: CALL FAR r/m32 or r/m64? */
        if (reg == 0030)
            return (1);
    }

    /* not a CALL instruction */
    return (0);
}
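
The question also counts returns, and asks how to read the opcode field of a 0xFF instruction from the single word it already peeks. The following is a minimal sketch of both, not a complete decoder. The helper names isret and ff_is_call are hypothetical. isret skips the same prefixes as iscall plus F2/F3 (so that "rep ret" is recognized) and deliberately ignores IRET. ff_is_call assumes the word was peeked at rip on little-endian x86-64 and that no prefix byte precedes the opcode.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the byte sequence starts with a
 * near or far RET (C2, C3, CA, CB), after skipping prefix bytes. */
int isret(const unsigned char *bytes, size_t len)
{
    size_t offset = 0;

    while (offset < len &&
           ((bytes[offset] & 0xe7) == 0x26     /* es, cs, ss, ds override */
            || (bytes[offset] & 0xfc) == 0x64  /* fs, gs, data16, addr32  */
            || (bytes[offset] & 0xf0) == 0x40  /* REX prefix              */
            || bytes[offset] == 0xf2           /* repne / bnd             */
            || bytes[offset] == 0xf3))         /* rep, as in "rep ret"    */
        offset++;

    /* nothing but prefixes in the buffer */
    if (offset >= len)
        return 0;

    switch (bytes[offset]) {
    case 0xc2: /* RET imm16 (near) */
    case 0xc3: /* RET (near)       */
    case 0xca: /* RETF imm16 (far) */
    case 0xcb: /* RETF (far)       */
        return 1;
    default:
        return 0;
    }
}

/* Hypothetical helper: given the word peeked at rip, returns 1 if it
 * starts with FF /2 or FF /3 (indirect CALL). The ModR/M byte is the
 * second byte of the word; its reg field occupies bits 3..5. */
int ff_is_call(unsigned long word)
{
    unsigned opcode = word & 0xff;
    unsigned reg = (word >> 11) & 7; /* (word >> 8) is ModR/M, >> 3 more is reg */

    return opcode == 0xff && (reg == 2 || reg == 3);
}
```

Because the reg field sits in bits 3 to 5 of the ModR/M byte, it has to be shifted down before being compared with 2 or 3; a test such as (address_rip & 0x0F00) == 0x02 can never be true, since the masked value is always a multiple of 0x100.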
Upvotes: 5