Reputation: 33
I am trying to understand this problem that is in ASM. Here is the code:
45 33 C9 xor r9d, r9d
C7 44 24 18 50 72 69 6D mov [rsp+arg_10], 6D697250h
66 C7 44 24 1C 65 53 mov [rsp+arg_14], 5365h
C6 44 24 1E 6F mov [rsp+arg_16], 6Fh
4C 63 C1 movsxd r8, ecx
85 C9 test ecx, ecx
7E 1C jle short locret_140001342
41 8B C9 mov ecx, r9d
loc_140001329:
48 83 F9 07 cmp rcx, 7
49 0F 4D C9 cmovge rcx, r9
48 FF C1 inc rcx
8A 44 0C 17 mov al, [rsp+rcx+arg_F]
30 02 xor [rdx], al
48 FF C2 inc rdx
49 FF C8 dec r8
75 E7 jnz short loc_140001329
locret_140001342:
C3 retn
And here is the encoded text:
07 1D 1E 41 45 2A 00 25 52 0D 04 01 73 06
24 53 49 39 0D 36 4F 35 1F 08 04 09 73 0E
34 16 1B 08 16 20 4F 39 01 49 4A 54 3D 1B
35 00 07 5C 53 0C 08 1E 38 11 2A 30 13 1F
22 1B 04 08 16 3C 41 33 1D 04 4A
I've been studying ASM for some time now and I know what most of the instructions do, but I still have some questions I have not found answers to.
How do I plug the encoded text into the algorithm?
What are arg_10, arg_14, etc.? I assume they are related to the encoded part, but I don't know exactly.
Could someone go through, line by line, what this algorithm does? I understand some of it but I need some clarification.
I have been using Visual Studio and C++ to test ASM. I do know that to run an ASM procedure you can declare a function like this
extern "C" int function(int a, int b, int c,int d, int f, int g);
and use it like this
printf("ASM Returned %d", function(92,2,3,4,5,6));
I am also aware that the first four integer parameters go into RCX, RDX, R8, and R9 and the rest are on the stack. I don't know much about the stack, so I do not know how to access them right now. I also know that the returned value is the value contained in RAX. So something like this would add two numbers:
xor eax, eax
mov eax, ecx
add eax, edx
ret
So as Jester suggested, I will go line by line explaining what I think the code does.
xor r9d, r9d //xor on r9d (clears the register)
mov [rsp+arg_10], 6D697250h //moves 6D697250h to the address pointed at by rsp + arg_10
mov [rsp+arg_14], 5365h //moves 5365h to the address pointed at by rsp + arg_14
mov [rsp+arg_16], 6Fh //moves 6Fh to the address pointed at by rsp + arg_16
movsxd r8, ecx //moves ecx to r8 and sign-extends it, since ecx is 32 bits and r8 is 64 bits
test ecx, ecx //tests exc and sets the labels
jle short locret_140001342 //jumps to ret if ecx is zero or less
mov ecx, r9d //moves the lower 32 bits of r9 into ecx
loc_140001329: //label used by jump commands
cmp rcx, 7 //moves 7(decimal) into rcx
cmovge rcx, r9 //don't know
inc rcx //increases rcx by 1
mov al, [rsp + rcx + arg_F] //moves the value at address [rsp + rcx + arg_F] into al,
//this is probably the key step as al is 1 byte and each character is also one byte; al is also the low byte of rax, so it holds the value to be returned
xor [rdx], al //xor on the value at address [rdx] and al, stores the result at the address of [rdx]
inc rdx //increase rdx by 1
dec r8 //decrease r8 by 1
jnz short loc_140001329 //if r8 is not zero jump back to loc_140...
//this essentially is a while loop until r8 reaches 0 (assuming it starts as positive)
locret_140001342:
ret
I still don't know what the arg_xx are or how exactly is the encoded text plugged into this algorithm.
Upvotes: 2
Views: 1565
Reputation: 33
OK, I have figured out the algorithm and have made it work in ASM as well. You guys were right, the arg_xx were offsets. Judging by the displacement bytes in the listing (44 24 18 and 44 0C 17), arg_10 resolves to [rsp + 18h] and arg_F to [rsp + 17h]. The data is passed in as an array along with its length. So rcx will be the data length, in this case 67, and rdx will point to the beginning of the array. Here is the function I used in C++ to call the ASM procedure.
extern "C" void function(int length, char* message);
The algorithm is pretty simple. The key phrase is "PrimeSo". It XORs each byte passed in with one of the characters of "PrimeSo", taken in order; once it reaches the 'o' in "PrimeSo" it goes back to 'P'. Hence
cmp rcx, 7
cmovge rcx, r9 //as Peter de Rivaz stated this will put 0 into rcx if it is greater or equal to seven
inc rcx
and so
mov al, [rsp + rcx + 17h]
will effectively become [rsp + 1 + 17h], [rsp + 2 + 17h], ..., [rsp + 7 + 17h]. Note that "PrimeSo" was stored at [rsp + 18h], meaning that [rsp + 1 + 17h] points to 'P'. In each iteration of the loop, al will become one of the characters in "PrimeSo" and it will cycle through them.
xor [rdx], al //This does an xor operation on [rdx] (beginning of our message) and al, which is 'P' in the first iteration.
//It then stores the result in its place.
inc rdx //move to next character
dec r8 //decrease counter
jnz short loc_140001329 //and start the loop again
With that being said, let's look at the first few.
xor P, 07 == xor 50, 07 --> 57 = W
xor r, 1D == xor 72, 1D --> 6F = o
xor i, 1E == xor 69, 1E --> 77 = w
xor m, 41 == xor 6D, 41 --> 2C = ,
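The same arithmetic can be checked quickly outside of ASM. The following is only a sketch in Python, not the code used above; the names KEY and xor_decode are made up for illustration. It applies the repeating "PrimeSo" key to the first 16 bytes of the encoded text.

```python
KEY = b"PrimeSo"  # key bytes written to the stack by the three mov instructions

def xor_decode(data, key=KEY):
    """XOR each data byte with the key byte at the same position, wrapping around the key."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

# First 16 bytes of the encoded text from the question.
first_bytes = bytes.fromhex("07 1D 1E 41 45 2A 00 25 52 0D 04 01 73 06 24 53")
print(xor_decode(first_bytes).decode("ascii"))  # prints: Wow, you did it!
```

Because XOR is its own inverse, running xor_decode on the result a second time gives back the original encoded bytes.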
For those wondering here is the C++ code:
#include <cstdio>  // printf
#include <cstdlib> // system
extern "C" void function(int length, char* message);
int main()
{
char message[] = { 0x07, 0x1D, 0x1E, 0x41, 0x45, 0x2A, 0x00, 0x25, 0x52, 0x0D, 0x04, 0x01, 0x73, 0x06, 0x24, 0x53, 0x49, 0x39, 0x0D, 0x36, 0x4F, 0x35, 0x1F, 0x08, 0x04, 0x09, 0x73, 0x0E, 0x34, 0x16, 0x1B, 0x08, 0x16, 0x20, 0x4F, 0x39, 0x01, 0x49, 0x4A, 0x54, 0x3D, 0x1B, 0x35, 0x00, 0x07, 0x5C, 0x53, 0x0C, 0x08, 0x1E, 0x38, 0x11, 0x2A, 0x30, 0x13, 0x1F, 0x22, 0x1B, 0x04, 0x08, 0x16, 0x3C, 0x41, 0x33, 0x1D, 0x04, 0x4A, '\0'};
function(sizeof(message) - 1, message);
printf("Decoded Message is:\n%s\n", message);
printf("\n");
system("pause");
return 0;
}
No, I did not manually insert the data into message. Also note that I added a string terminator at the end and used sizeof(message) - 1 to avoid decoding the string terminator.
Here is the ASM code; this is simply a new file called assembly.asm with the following in it.
.code
function proc
xor r9d, r9d
mov dword ptr [rsp + 18h], 6D697250h
mov word ptr [rsp + 1Ch], 5365h
mov byte ptr [rsp + 1Eh], 6Fh
movsxd r8, ecx
test ecx, ecx
jle short locret_140001342
mov ecx, r9d
loc_140001329:
cmp rcx, 7
cmovge rcx, r9
inc rcx
mov al, [rsp + rcx + 17h]
xor [rdx], al
inc rdx
dec r8
jnz short loc_140001329
locret_140001342:
ret
function endp
end
In Visual Studio, you can add a breakpoint in here and go to Debug -> Windows -> Registers and Debug -> Windows -> Memory -> Memory 1 to see the registers and the program's memory. Note that rcx will contain the count, and rdx will point to the beginning of the encoded message.
Thank you all for your help and suggestions, I couldn't have done it without you.
Upvotes: 1
Reputation: 33509
I think your understanding is largely correct; a few minor corrections:
test ecx, ecx //tests exc and sets the labels
This sets the flags (not the labels).
cmp rcx, 7 //moves 7(decimal) into rcx
This compares rcx to the immediate value 7, and sets the flags accordingly. (i.e. after this instruction a conditional instruction using the "greater" condition, such as jg or cmovg, will only take effect if rcx was greater than 7.)
cmovge rcx, r9 //don't know
This conditionally (based on the flags you have just set) moves r9 into rcx. The condition is ge, so this instruction only executes if rcx was greater than or equal to 7. r9 contains 0, so the effect of this is to set rcx back to 0 when it reaches 7.
You are not given information on the parameters to the function, but it seems safe to assume that rcx is the original length of the data to be decrypted, and rdx is a pointer to the data.
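To make the wrap-around concrete, here is a small Python model of the index register. It is only a sketch: the function name key_offsets is made up, and it assumes the key starts one byte after the base used by the load, as the names arg_F and arg_10 suggest.

```python
def key_offsets(count):
    """Model the loop's index register; return the key offset used for each message byte."""
    rcx, r9 = 0, 0
    offsets = []
    for _ in range(count):
        if rcx >= 7:             # cmp rcx, 7 / cmovge rcx, r9
            rcx = r9
        rcx += 1                 # inc rcx
        offsets.append(rcx - 1)  # load uses arg_F + rcx; key assumed to start at arg_10 = arg_F + 1
    return offsets

print(key_offsets(10))  # prints: [0, 1, 2, 3, 4, 5, 6, 0, 1, 2]
```

So rcx takes the values 1 through 7 at the load, and the increment-before-load is cancelled out by the base being one byte below the key.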
Upvotes: 1
Reputation: 34575
Here is my take on the code.
; rdx holds the message location
; ecx holds the message length
xor r9d, r9d ; r9d = 0
mov [rsp+arg_10], 6D697250h ; fix up the key
mov [rsp+arg_14], 5365h
mov [rsp+arg_16], 6Fh ; which is "PrimeSo"
movsxd r8, ecx ; length counter
test ecx, ecx ; test the message length
jle short locret_140001342 ; skip if invalid length
mov ecx, r9d ; reset key index to 0
loc_140001329:
cmp rcx, 7 ; check indexing of key
cmovge rcx, r9 ; reset if out of range
inc rcx ; obfuscate by incrementing first
mov al, [rsp+rcx+arg_F] ; ... and indexing wrong offset
xor [rdx], al ; encrypt the message byte
inc rdx ; advance message pointer
dec r8 ; loop count
jnz short loc_140001329 ; next message byte
locret_140001342:
retn
I decoded the message with a C program implementing the algorithm, but that would be too easy, so I won't post it.
Reverse engineering
The code does not contain enough information to solve it top-down, because some registers are used without being loaded, and labels are not defined. I solved it bottom-up, by identifying the instruction that does the encryption, and working out from there.
Although the stack labels are not defined, the nomenclature is enough of a clue to show that the parts of the key are actually consecutive, and the assumption of little-endian reveals the key. This is confirmed by looking at the hex byte tabulation, which shows the three values being stored at offsets whose low bytes are 18, 1C and 1E.
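The little-endian assumption is easy to check. As a sketch (Python with the standard struct module, not the C program mentioned above), packing the three immediates as a dword, a word and a byte with no padding gives the consecutive bytes as they would sit in memory:

```python
import struct

# "<" = little-endian, no padding; I = 4-byte dword, H = 2-byte word, B = single byte.
key = struct.pack("<IHB", 0x6D697250, 0x5365, 0x6F)
print(key)  # prints: b'PrimeSo'
```

The dword covers offsets 18 to 1B, the word 1C to 1D and the byte 1E, which is why the three stores form one contiguous 7-byte string.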
Upvotes: 2
Reputation: 38472
One thing I noticed is that the values being stored at those stack offsets are ASCII:
>>> '5072696d65536f'.decode('hex')
'PrimeSo'
As for entering the data, you could use xxd -r -p and read it from stdin in the program: xxd -r -p data.hex | ./myprog
Those arg_14 etc. offsets have to be declared somewhere in the sources, but I would guess they're hex offsets 0xf, 0x10, 0x14, 0x16.
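If xxd is not at hand, the hex text can also be converted in Python. This is only a sketch: parse_hex_dump is a made-up helper name, and it assumes Python 3.7 or later, where bytes.fromhex skips all ASCII whitespace including newlines.

```python
def parse_hex_dump(text):
    """Turn whitespace-separated hex byte pairs into raw bytes."""
    return bytes.fromhex(text)

# The first line of the encoded text from the question, split over two lines.
sample = """07 1D 1E 41 45 2A 00 25
52 0D 04 01 73 06"""
raw = parse_hex_dump(sample)
print(len(raw), raw[:4].hex())  # prints: 14 071d1e41
```

The resulting bytes object can then be handed to whatever decoding routine is being tested.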
Upvotes: 1