Reputation: 73
OK, so I'm trying to create a function that generates shellcode.
I'm having a lot of problems working out the REX / ModRM stuff.
My current code kind of works.
So far, if the regs are smaller than R8 it works fine.
If I use one reg that is smaller than R8, it's fine.
The problem is that once I have two regs smaller than R8 that are the same, or if the src is smaller, I get problems.
enum Reg64 : uint8_t {
    RAX = 0, RCX = 1, RDX = 2, RBX = 3,
    RSP = 4, RBP = 5, RSI = 6, RDI = 7,
    R8 = 8, R9 = 9, R10 = 10, R11 = 11,
    R12 = 12, R13 = 13, R14 = 14, R15 = 15
};
inline uint8_t encode_rex(uint8_t is_64_bit, uint8_t extend_sib_index, uint8_t extend_modrm_reg, uint8_t extend_modrm_rm) {
    struct Result {
        uint8_t b : 1;
        uint8_t x : 1;
        uint8_t r : 1;
        uint8_t w : 1;
        uint8_t fixed : 4;
    } result{ extend_modrm_rm, extend_modrm_reg, extend_sib_index, is_64_bit, 0b100 };
    return *(uint8_t*)&result;
}
inline uint8_t encode_modrm(uint8_t mod, uint8_t rm, uint8_t reg) {
    struct Result {
        uint8_t rm : 3;
        uint8_t reg : 3;
        uint8_t mod : 2;
    } result{ rm, reg, mod };
    return *(uint8_t*)&result;
}
inline void mov(Reg64 dest, Reg64 src) {
    if (dest >= 8)
        put<uint8_t>(encode_rex(1, 2, 0, 1));
    else if (src >= 8)
        put<uint8_t>(encode_rex(1, 1, 0, 2));
    else
        put<uint8_t>(encode_rex(1, 0, 0, 0));
    put<uint8_t>(0x89);
    put<uint8_t>(encode_modrm(3, dest, src));
}
//c.mov(Reg64::RAX, Reg64::RAX); // works
//c.mov(Reg64::RAX, Reg64::R9); // works
//c.mov(Reg64::R9, Reg64::RAX); // works
//c.mov(Reg64::R9, Reg64::R9); // Does not work returns (mov r9,rcx)
Also, if there is a shorter way to do this without all the ifs, that would be great.
Upvotes: 2
Views: 224
Reputation: 365352
FYI, most people create shellcode by assembling with a normal assembler like NASM, then hexdumping that binary into a C string. Writing your own assembler can be a fun project but is basically a separate project.
Your encode_rex looks somewhat sensible, taking four args for the four bits. But the code in mov that calls it passes a 2 sometimes, which will truncate to 0!
Also, there are 4 possibilities for the 2 extension bits that matter for reg-reg moves (r and b, not the x argument your calls are setting). But your if/else if/else chain only covers 3 of them, ignoring the possibility of dest>=8 && src>=8 => r:b = 0b11.
Since those two bits are orthogonal, you should just calculate them separately like this:
put<uint8_t>(encode_rex(1, 0, src>=8, dest>=8)); // with opcode 0x89, src is ModRM.reg and dest is ModRM.rm
The SIB-index x field should always be 0 because you don't have a SIB byte, just ModRM for a reg-reg mov.
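As a rough sketch of the branch-free version (not your exact code): the helpers below build both bytes with shifts instead of bit-fields. The names rex_byte, modrm_byte and emit_mov are made up for illustration, and returning a std::vector stands in for your put<uint8_t>. It assumes the 0x89 opcode (mov r/m64, r64), so dest lands in ModRM.rm and src in ModRM.reg.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

enum Reg64 : uint8_t {
    RAX = 0, RCX = 1, RDX = 2, RBX = 3,
    RSP = 4, RBP = 5, RSI = 6, RDI = 7,
    R8 = 8, R9 = 9, R10 = 10, R11 = 11,
    R12 = 12, R13 = 13, R14 = 14, R15 = 15
};

// REX prefix layout: 0100 W R X B. Taking bools means a flag can only be 0 or 1,
// so passing 2 by accident can no longer silently truncate.
inline uint8_t rex_byte(bool w, bool r, bool x, bool b) {
    return static_cast<uint8_t>(0x40 | (w << 3) | (r << 2) | (x << 1) | (b << 0));
}

// ModRM layout: mod (2 bits), reg (3 bits), rm (3 bits). Only the low 3 bits of
// each register number go here; the 4th bit lives in the REX prefix.
inline uint8_t modrm_byte(uint8_t mod, uint8_t reg, uint8_t rm) {
    assert(mod < 4);
    return static_cast<uint8_t>((mod << 6) | ((reg & 7) << 3) | (rm & 7));
}

// mov dest, src via opcode 0x89 (mov r/m64, r64): src -> ModRM.reg, dest -> ModRM.rm.
inline std::vector<uint8_t> emit_mov(Reg64 dest, Reg64 src) {
    return {
        rex_byte(true, src >= 8, false, dest >= 8),  // X stays 0: no SIB byte
        0x89,
        modrm_byte(3, src, dest)
    };
}
```

With this, emit_mov(R9, R9) covers the case the if/else chain missed, because R and B are computed independently.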
You have your struct initializer in encode_rex mixed up, with extend_modrm_reg being 2nd where it will initialize the x field instead of r. Your bitfield names match https://wiki.osdev.org/X86-64_Instruction_Encoding#Encoding, but you have the wrong C++ variables initializing them. See that link for descriptions.
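One way to make that kind of mix-up harder is to assign each field by name instead of relying on positional order. The following is only a sketch that keeps your parameter order; the struct name RexBits is made up, the parameters are switched to bool as my own choice, and it assumes the low-bit-first bit-field layout that GCC, Clang and MSVC use on x86 (bit-field layout is implementation-defined).

```cpp
#include <cstdint>
#include <cstring>

// Assumption: bit-fields are allocated starting from the least significant bit,
// as on GCC/Clang/MSVC for x86. Other ABIs may lay these out differently.
struct RexBits {
    uint8_t b : 1;      // extends ModRM.rm (or SIB.base)
    uint8_t x : 1;      // extends SIB.index
    uint8_t r : 1;      // extends ModRM.reg
    uint8_t w : 1;      // 1 = 64-bit operand size
    uint8_t fixed : 4;  // always 0b0100
};
static_assert(sizeof(RexBits) == 1, "RexBits must pack into one byte");

inline uint8_t encode_rex(bool is_64_bit, bool extend_sib_index,
                          bool extend_modrm_reg, bool extend_modrm_rm) {
    RexBits bits{};
    bits.b = extend_modrm_rm;    // each argument is tied to its field by name
    bits.x = extend_sib_index;
    bits.r = extend_modrm_reg;
    bits.w = is_64_bit;
    bits.fixed = 0b0100;
    uint8_t out;
    std::memcpy(&out, &bits, 1); // read the object representation back as a byte
    return out;
}
```

Under those layout assumptions, encode_rex(true, false, true, true) gives the 0x4D prefix needed when both registers are R8 or above.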
Double-check the dest, src order against the opcode you're using. With 0x89 (mov r/m, r), dest is the ModRM.rm operand (extended by REX.B) and src is the ModRM.reg operand (extended by REX.R); with 0x8B (mov r, r/m) it's the other way around.
Sanity check from NASM: I assembled with nasm -felf64 -l/dev/stdout to get a listing:
1 00000000 4889C8 mov rax, rcx
2 00000003 4889C0 mov rax, rax
3 00000006 4D89C0 mov r8, r8
4 00000009 4989C0 mov r8, rax
5 0000000C 4C89C0 mov rax, r8
You're using the same 0x89 opcode that NASM uses, so your REX prefixes should match.
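If you want to check your own output against a listing like that, a tiny decoder for the REX and ModRM bytes can help. This is a minimal sketch with made-up names (RegRegFields, decode_reg_reg); it assumes a REX + opcode + ModRM sequence with mod == 3 and ignores everything else.

```cpp
#include <cstdint>

struct RegRegFields {
    bool w;       // REX.W
    uint8_t reg;  // 4-bit register number from REX.R:ModRM.reg
    uint8_t rm;   // 4-bit register number from REX.B:ModRM.rm
};

// Splits the REX prefix and ModRM byte of a register-direct instruction
// back into the two full register numbers.
inline RegRegFields decode_reg_reg(uint8_t rex, uint8_t modrm) {
    RegRegFields f;
    f.w   = (rex >> 3) & 1;
    f.reg = static_cast<uint8_t>((((rex >> 2) & 1) << 3) | ((modrm >> 3) & 7));
    f.rm  = static_cast<uint8_t>(((rex & 1) << 3) | (modrm & 7));
    return f;
}
```

For example, the bytes 49 89 C9 decode to rm = 9 and reg = 1, which is the mov r9,rcx you were seeing: REX.B got set but REX.R did not.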
return *(uint8_t*)&result; is fragile: it only escapes strict-aliasing UB because uint8_t is unsigned char on mainstream compilers, and it depends on implementation-defined bit-field layout.
Use memcpy to safely type-pun. (Or a union; most real-world C++ compilers including gcc/clang/MSVC do define the behaviour of union type-punning as in C99, unlike ISO C++).
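A minimal sketch of the memcpy route, applied to your encode_modrm: the helper name to_byte and the struct name ModRM are made up, and the masking of the inputs is my addition. The result still depends on bit-field layout (assumed low-bit-first here, as on x86 with GCC, Clang and MSVC); memcpy only removes the pointer-cast question.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Copies the object representation of a one-byte, trivially copyable object.
template <class T>
inline uint8_t to_byte(const T& value) {
    static_assert(sizeof(T) == 1, "T must be exactly one byte");
    static_assert(std::is_trivially_copyable<T>::value, "T must be trivially copyable");
    uint8_t out;
    std::memcpy(&out, &value, sizeof out);
    return out;
}

struct ModRM {
    uint8_t rm : 3;
    uint8_t reg : 3;
    uint8_t mod : 2;
};

// Same argument order as the original encode_modrm(mod, rm, reg).
inline uint8_t encode_modrm(uint8_t mod, uint8_t rm, uint8_t reg) {
    ModRM m{};
    m.rm = rm & 7;    // keep only the low 3 bits; the 4th bit belongs in REX
    m.reg = reg & 7;
    m.mod = mod & 3;
    return to_byte(m);
}
```

Optimizing compilers typically turn a fixed-size memcpy like this into a plain register move, so there is usually no cost compared with the pointer cast.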
Upvotes: 1