I am just learning assembly and debugging with OllyDbg in order to learn how to use undocumented functions. I have run into the following problem:
I have the following code fragment (from OllyDbg):
MOV EDI,EDI
PUSH EBP
MOV EBP,ESP
MOV EAX, DWORD PTR SS:[EBP+8]
XOR EDX,EDX
LEA ECX, DWORD PTR DS:[EAX+4]
MOV DWORD PTR DS:[EAX], EDX
MOV DWORD PTR DS:[ECX+4],ECX
MOV DWORD PTR DS:[ECX],ECX
MOV DWORD PTR DS:[EAX+C],ECX
MOV ECX, DWORD PTR SS:[EBP+C]
This is the beginning of the function, and the goal is to find the data structure it works on. I figured out that it first pushes EBP onto the stack and then copies ESP (the current stack pointer) into EBP, which I think sets up the stack frame for the function. The tutorial says that in the usual layout the first argument is at [EBP+8] and the second at [EBP+C].
This is what I do not understand: how do I know that the first parameter is at [EBP+8]?
Hopefully someone can help me! Thanks!
What kind of "undocumented functions" do you mean? Assembly is just compiled high-level code most of the time. There's hardly anything "undocumented" about it.
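EBP is most often used as the stack frame pointer in functions, most notably in the C calling convention (also known as cdecl). With this convention, the parameters are passed on the stack in reverse order (i.e. the last parameter is pushed first), and the called function uses EBP to access them. The CALL instruction then pushes the 4-byte return address, and the PUSH EBP in the prologue pushes another 4 bytes, so after MOV EBP,ESP the saved EBP is at [EBP], the return address at [EBP+4], the first parameter at [EBP+8] and the second at [EBP+C]. Based on the code you posted, I think the data structure might be pointed to by the first parameter. Have a look: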
MOV EAX, DWORD PTR SS:[EBP+8]
LEA ECX, DWORD PTR DS:[EAX+4]
MOV DWORD PTR DS:[EAX], EDX
MOV DWORD PTR DS:[ECX+4],ECX
MOV DWORD PTR DS:[ECX],ECX
MOV DWORD PTR DS:[EAX+C],ECX
MOV ECX, DWORD PTR SS:[EBP+C]
The first instruction moves the first argument into EAX. Then an offset of 4 is added to that argument and the result is placed in ECX. Note that this is done by the LEA instruction, which is short for "Load Effective Address". It computes an address without accessing memory, and compilers like to use it for pointer arithmetic and adding offsets, so whenever you see this instruction, you should suspect that whatever it operates on might be a pointer to a structure. Of course, there's no way to know for sure. Later on we have some MOVs that store through that address, where ECX is used to access memory. The structures, if they exist, would look something like this in C:
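/* Field types are guesses; only the offsets come from the listing. */
struct b { /* embedded in struct a at offset 4; ECX = EAX+4 points here */
    struct b* memb1; /* MOV DWORD PTR DS:[ECX],ECX (offset 4 of a) */
    struct b* memb2; /* MOV DWORD PTR DS:[ECX+4],ECX (offset 8 of a) */
};
struct a { /* pointed to by EAX / [EBP+8] */
    int memb1; /* MOV DWORD PTR DS:[EAX], EDX (offset 0, EDX is 0) */
    struct b memb2; /* LEA ECX, DWORD PTR DS:[EAX+4] (offsets 4 and 8) */
    struct b* memb3; /* MOV DWORD PTR DS:[EAX+C],ECX (offset 0xC) */
};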
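To check which offsets the stores in the listing actually touch, the instructions can be replayed against a toy 32-bit memory. This is only a sketch: BASE, mem, store32, load32 and replay_listing are made-up names, and the buffer stands in for whatever the first argument really points to.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy 32-bit address space: mem[0] corresponds to address BASE.
   BASE and every name in this sketch are hypothetical. */
#define BASE 0x1000u

static uint8_t mem[32];

/* Store a dword at a simulated address (MOV DWORD PTR [addr], val). */
static void store32(uint32_t addr, uint32_t val) {
    assert(addr >= BASE && addr - BASE + 4 <= sizeof mem);
    memcpy(&mem[addr - BASE], &val, sizeof val);
}

/* Load a dword from a simulated address. */
static uint32_t load32(uint32_t addr) {
    uint32_t val;
    assert(addr >= BASE && addr - BASE + 4 <= sizeof mem);
    memcpy(&val, &mem[addr - BASE], sizeof val);
    return val;
}

/* Replays the stores from the listing; arg1 plays the role of [EBP+8]. */
static void replay_listing(uint32_t arg1) {
    uint32_t eax = arg1;      /* MOV EAX, DWORD PTR SS:[EBP+8] */
    uint32_t edx = 0;         /* XOR EDX,EDX */
    uint32_t ecx = eax + 4;   /* LEA ECX, DWORD PTR DS:[EAX+4] */
    store32(eax, edx);        /* MOV DWORD PTR DS:[EAX], EDX */
    store32(ecx + 4, ecx);    /* MOV DWORD PTR DS:[ECX+4],ECX */
    store32(ecx, ecx);        /* MOV DWORD PTR DS:[ECX],ECX */
    store32(eax + 0xC, ecx);  /* MOV DWORD PTR DS:[EAX+C],ECX */
}
```

Calling replay_listing(BASE) and reading the buffer back with load32 leaves 0 at offset 0 and the address BASE+4 at offsets 4, 8 and 0xC. So all four dwords of the first 16 bytes are written, and the three address-sized fields all point at offset 4. A pair of fields that point at themselves is what initialising an empty circular doubly linked list usually looks like, though that is only a guess here.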
Hope this clears things up somehow. Reverse-engineering assembly code is a very hard and time-consuming task, especially if you don't have any API calls which would tell you the type of arguments used by the application.
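As for why the first parameter ends up at [EBP+8]: it follows from what is pushed before EBP is set. Here is a minimal sketch that simulates a 32-bit stack, assuming two stack-passed arguments pushed right to left. The names (stack, push, read32, simulate_call) and all values are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Toy 32-bit stack made of dwords; it grows toward lower addresses. */
#define STACK_DWORDS 16

static uint32_t stack[STACK_DWORDS];
static uint32_t esp = STACK_DWORDS * 4; /* byte offset of the stack top */
static uint32_t ebp = 0xAAAAAAAAu;      /* caller's EBP (made-up value) */

/* PUSH: decrement ESP by 4, then store the value at [ESP]. */
static void push(uint32_t val) {
    assert(esp >= 4);
    esp -= 4;
    stack[esp / 4] = val;
}

/* Read the dword at a simulated stack address. */
static uint32_t read32(uint32_t addr) {
    assert(addr % 4 == 0 && addr / 4 < STACK_DWORDS);
    return stack[addr / 4];
}

/* Caller pushes the arguments, CALL pushes the return address,
   then the callee runs its prologue. */
static void simulate_call(uint32_t arg1, uint32_t arg2, uint32_t ret_addr) {
    push(arg2);      /* last parameter is pushed first */
    push(arg1);
    push(ret_addr);  /* done implicitly by CALL */
    push(ebp);       /* PUSH EBP */
    ebp = esp;       /* MOV EBP,ESP */
}
```

After simulate_call(a, b, r), read32(ebp) is the saved EBP, read32(ebp + 4) is r, read32(ebp + 8) is a and read32(ebp + 0xC) is b. The +8 is simply the 4 bytes of saved EBP plus the 4 bytes of return address that sit between the frame pointer and the first argument.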