Fun with Buffer Overflows

A question was asked on stackoverflow.com about using a buffer overflow to clobber the stack and alter the behavior of a program. It’s been a while since I’ve done anything like this, so I thought I might give it a shot. This code is based on code snippets shown in the stackoverflow thread. The technique is highly architecture and compiler dependent, so if you are following along you may have to adjust accordingly.

#include <stdio.h>
void foo()
{
	int* p;
	p = (int*)&p;
	/* pseudocode: p[offset] = new return address; (both values are worked out below) */
	return;
}
int main(int argc, char* argv[])
{
	int x = 0;
	foo();
	x++;
	printf("%d\n", x);
	return 0;
}

The idea behind this code is to clobber the stack in function foo() so that when foo() returns, the x++ is skipped and 0 is printed. We need two things to make this happen. First, when a function is called, the address of the instruction immediately following the call instruction is pushed onto the stack. This is the address control jumps to when the function finishes executing and returns to the caller. It is this address on the stack that we need to change. Second, we need to know where to return to.

To find this value, we need to look at the object code created by the compiler. If you are trying this at home, your compiler and its generated assembly may differ, as the output can be very version specific. For this example I’m using Cygwin with gcc version 3.4.4. The following is the disassembly of the object code for the main() function:

// _main
//  401071:       55                      push   %ebp
//  401072:       89 e5                   mov    %esp,%ebp
//  401074:       83 ec 18                sub    $0x18,%esp
//  ... skip unimportant stuff ...
//  40109b:       c7 45 fc 00 00 00 00    movl   $0x0,-0x4(%ebp)
//  4010a2:       e8 a9 ff ff ff          call   401050 <_foo>
//  4010a7:       8d 45 fc                lea    -0x4(%ebp),%eax
//  4010aa:       ff 00                   incl   (%eax)
//  4010ac:       8b 45 fc                mov    -0x4(%ebp),%eax
//  4010af:       89 44 24 04             mov    %eax,0x4(%esp)
//  4010b3:       c7 04 24 00 20 40 00    movl   $0x402000,(%esp)
//  4010ba:       e8 a9 00 00 00          call   401168 <_printf>
//  4010bf:       b8 00 00 00 00          mov    $0x0,%eax
//  4010c4:       c9                      leave
//  4010c5:       c3                      ret

This disassembly was produced with the command objdump -d test.exe. Breaking it down, we see the following.

Standard preamble stuff. Don’t worry about this (yet).

//  401071:       55                      push   %ebp
//  401072:       89 e5                   mov    %esp,%ebp

Allocate space on the stack for our local variables (including x). This also allocates a temporary variable to store the address of our format string constant which we will come back to.

//  401074:       83 ec 18                sub    $0x18,%esp

Set x to 0.

//  40109b:       c7 45 fc 00 00 00 00    movl   $0x0,-0x4(%ebp)

Call foo().

//  4010a2:       e8 a9 ff ff ff          call   401050 <_foo>

When this instruction is executed, the address of the next instruction is pushed onto the stack. In this case it is the address of the lea instruction at 4010a7. The next block of instructions increments x. This is the block we want to skip.

//  4010a7:       8d 45 fc                lea    -0x4(%ebp),%eax
//  4010aa:       ff 00                   incl   (%eax)

These two instructions take up a total of 5 bytes: 3 for the lea instruction and 2 for the incl instruction. Once we find the return address on the stack, we need to increment it by 5. Our updated foo() looks like this:

void foo()
{
	int* p;
	p = (int*)&p;
	p[offset] += 5;
	return;
}

Note that we don’t want to skip the next two instructions. They are responsible for putting a copy of x into a parameter slot already allocated on the stack to be passed to printf().

//  4010ac:       8b 45 fc                mov    -0x4(%ebp),%eax
//  4010af:       89 44 24 04             mov    %eax,0x4(%esp)

This instruction copies the address of our format string into the last reserved variable on the stack which is the first parameter to be passed to printf().

//  4010b3:       c7 04 24 00 20 40 00    movl   $0x402000,(%esp)

Finally, we call printf(), set our return value and exit the application.

//  4010ba:       e8 a9 00 00 00          call   401168 <_printf>
//  4010bf:       b8 00 00 00 00          mov    $0x0,%eax
//  4010c4:       c9                      leave
//  4010c5:       c3                      ret

The last thing we need to find is where the return address is stored on the stack. Let’s look at the disassembly of foo().

//  401050:       55                      push   %ebp
//  401051:       89 e5                   mov    %esp,%ebp
//  401053:       83 ec 04                sub    $0x4,%esp
//  401056:       8d 45 fc                lea    -0x4(%ebp),%eax
//  401059:       89 45 fc                mov    %eax,-0x4(%ebp)
//  40105c:       8b 55 fc                mov    -0x4(%ebp),%edx
//  40105f:       83 c2 08                add    $0x8,%edx
//  401062:       8b 45 fc                mov    -0x4(%ebp),%eax
//  401065:       83 c0 08                add    $0x8,%eax
//  401068:       8b 00                   mov    (%eax),%eax
//  40106a:       83 c0 05                add    $0x5,%eax
//  40106d:       89 02                   mov    %eax,(%edx)
//  40106f:       c9                      leave
//  401070:       c3                      ret

The first two instructions are the preamble which we’ve seen before with main().

//  401050:       55                      push   %ebp
//  401051:       89 e5                   mov    %esp,%ebp

This pushes the previous base pointer onto the stack and makes the current stack pointer the current base pointer. We know when this function was called, the return address was pushed. It is now “under” the base pointer we just pushed. Next, storage for our local variable p is allocated on the stack.

//  401053:       83 ec 04                sub    $0x4,%esp

Our stack should now look something like this:

//         ret addr
//         old ebp
// esp --> p

Next we store the address of p into p so that p is pointing to itself.

//  401056:       8d 45 fc                lea    -0x4(%ebp),%eax
//  401059:       89 45 fc                mov    %eax,-0x4(%ebp)

Now, with p pointing to itself, let’s have another look at the stack and see how we should offset p to get to the return address. Our stack should now look something like this:

//         ret addr <-- p + 8
//         old ebp  <-- p + 4
// esp --> p        <-- p

Note that I am using a 32-bit system, so addresses and registers are stored as 4-byte values. Whatever address p is pointing to, we have to add 8 bytes to that address to reach the return address. In foo() I declared p as an int *, and an int is 4 bytes here, so we can treat p as an array of integers and access index 2 (8 / 4) to clobber the return address:

void foo()
{
	int* p;
	p = (int*)&p;
	p[2] += 5;
	return;
}

Running the program with this final version of foo() produces the following output:

$ ./test.exe
