Author: David Rager
Date: June 2, 2011
A question was asked on stackoverflow.com about using a buffer overflow to clobber the stack and alter the behavior of a program. It’s been a while since I’ve done anything like this so I thought I might give it a shot. This code is based on code snippets shown in the stackoverflow thread. This is very architecture and compiler dependent. If you are following along you may have to adjust accordingly.
#include <stdio.h>
void foo()
{
int* p;
p = (int*)&p;
p[offset] = new return address;
return;
}
int main(int argc, char* argv[])
{
int x = 0;
foo();
x++;
printf("%dn", x);
return 0;
}
The idea behind this code is to clobber the stack in function foo()
so when foo()
returns, the x++
is skipped and 0 is printed. We have two things we need to do to make this happen. First, when a function is called, the address of the instruction immediately following the function is pushed on the stack. This is the address control jumps to when the function is finished executing and returns to the caller. It is this address on the stack that we need to change. Second, we need to know where to return to. To find this value, we need to look at the object code that is created by the compiler. If you are trying this at home, your compiler and produced assembly may be different as the produced output can be very version specific. For this example I’m using Cygwin with gcc version 3.4.4. The following is the disassembly of the object code for the main()
function:
// _main
// 401071: 55 push %ebp
// 401072: 89 e5 mov %esp,%ebp
// 401074: 83 ec 18 sub $0x18,%esp
// ... skip unimportant stuff ...
// 40109b: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%ebp)
// 4010a2: e8 a9 ff ff ff call 401050 <_foo>
// 4010a7: 8d 45 fc lea -0x4(%ebp),%eax
// 4010aa: ff 00 incl (%eax)
// 4010ac: 8b 45 fc mov -0x4(%ebp),%eax
// 4010af: 89 44 24 04 mov %eax,0x4(%esp)
// 4010b3: c7 04 24 00 20 40 00 movl $0x402000,(%esp)
// 4010ba: e8 a9 00 00 00 call 401168 <_printf>
// 4010bf: b8 00 00 00 00 mov $0x0,%eax
// 4010c4: c9 leave
// 4010c5: c3 ret
This disassembly is produced using the objdump
command objdump -d test.exe
. Breaking this down we see: Standard preamble stuff. Don’t worry about this. (yet)
// 401071: 55 push %ebp
// 401072: 89 e5 mov %esp,%ebp
Allocate space on the stack for our local variables (including x). This also allocates a temporary variable to store the address of our format string constant which we will come back to.
// 401074: 83 ec 18 sub $0x18,%esp
Set x to 0.
// 40109b: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%ebp)
Call foo()
.
// 4010a2: e8 a9 ff ff ff call 401050 <_foo>
When this instruction is executed, the address of the next instruction is pushed onto the stack. In this case it is the address of the lea
instruction 4010a7
. The next block of instructions increment x. This is the block we want to skip.
// 4010a7: 8d 45 fc lea -0x4(%ebp),%eax
// 4010aa: ff 00 incl (%eax)
These two instructions take up a total of 5 bytes. 3 for the lea instructions and 2 for the incl instruction. Once we find the return address on the stack, we need to increment it by 5. Our updated foo() looks like this:
void foo()
{
int* p;
p = (int*)&p;
p[offset] += 5;
return;
}
Note, we don’t want to skip these two instructions. They are responsible for putting a copy of x
into a parameter already allocated on the stack to be passed to printf()
.
// 4010ac: 8b 45 fc mov -0x4(%ebp),%eax
// 4010af: 89 44 24 04 mov %eax,0x4(%esp)
This instruction copies the address of our format string into the last reserved variable on the stack which is the first parameter to be passed to printf()
.
// 4010b3: c7 04 24 00 20 40 00 movl $0x402000,(%esp)
Finally, we call printf(), set our return value and exit the application.
// 4010ba: e8 a9 00 00 00 call 401168 <_printf>
// 4010bf: b8 00 00 00 00 mov $0x0,%eax
// 4010c4: c9 leave
// 4010c5: c3 ret
The last thing we need to find is where the return address is stored on the stack. Let’s look at the disassembly of foo()
.
// 401050: 55 push %ebp
// 401051: 89 e5 mov %esp,%ebp
// 401053: 83 ec 04 sub $0x4,%esp
// 401056: 8d 45 fc lea -0x4(%ebp),%eax
// 401059: 89 45 fc mov %eax,-0x4(%ebp)
// 40105c: 8b 55 fc mov -0x4(%ebp),%edx
// 40105f: 83 c2 08 add $0x8,%edx
// 401062: 8b 45 fc mov -0x4(%ebp),%eax
// 401065: 83 c0 08 add $0x8,%eax
// 401068: 8b 00 mov (%eax),%eax
// 40106a: 83 c0 05 add $0x5,%eax
// 40106d: 89 02 mov %eax,(%edx)
// 40106f: c9 leave
// 401070: c3 ret
The first two instructions are the preamble which we’ve seen before with main()
.
// 401050: 55 push %ebp
// 401051: 89 e5 mov %esp,%ebp
This pushes the previous base pointer onto the stack and makes the current stack pointer the current base pointer. We know when this function was called, the return address was pushed. It is now “under” the base pointer we just pushed. Next, storage for our local variable p
is allocated on the stack.
// 401053: 83 ec 04 sub $0x4,%esp
Our stack should now look something like this:
// ret addr
// old ebp
// esp --> p
Next we store the address of p
into p
so that p
is pointing to itself.
// 401056: 8d 45 fc lea -0x4(%ebp),%eax
// 401059: 89 45 fc mov %eax,-0x4(%ebp)
Now, with p
pointing to itself, let’s have another look at the stack and see how we should offset p
to get to the return address. Our stack should now look something like this:
// ret addr <-- p + 8
// old ebp <-- p + 4
// esp --> p <-- p
Note that I am using a 32-bit system so addresses and registers are stored as 4 byte values. Whatever address p is pointing to, we have to add 8 to that address. In foo()
I declared p
as an int *
so we can treat p
as an array of integers and access index 2 to clobber the return address:
void foo()
{
int* p;
p = (int*)&p;
p[2] += 5;
return;
}
Running this code produces the following output:
$ ./test.exe
]]>
Author: David Rager
Date: March 6, 2011
So, I’ve been preparing for some job interviews lately. Of the highly technical nature. These are not the ones where they ask you what you’ve been working on or what your experience is in. These are the ones where they ask you questions to test the depth of your knowledge in Computer Science, algorithms, data structures, theory and practice. Stuff you learned as a first and second year undergrad and haven’t actually put into practice much. Not because you didn’t need to use these concepts or they were of no real importance. Rather because most languages and libraries have these things built in and you don’t normally have to implement them except in the most extreme cases.
I started going over my old college notes (yes, I still have them) to refresh my memory. I was surprised by how much I had forgotten. This post I’ll start with sorting algorithms, mainly as a refresher for me and hopefully this proves helpful to someone else. Each algorithm is implemented as a C function that takes two parameters. The first parameter is an array of integers. The second parameter is the length (number of elements) of the array.
Note that some of these examples do some things that break the actual efficiency of the algorithm (for instance using memmove() in insertion sort). This is done to focus simply on the sort algorithm without having to get into too much detail about implementing more efficient data structures that are more suited to the algorithm. Those instances are noted below.
Bubble Sort
The first sort is probably the simplest and most straight forward of all the sorts. It is the bubble sort. The bubble sort starts at the beginning of the array and compares the first two values. If the first value is greater than the second, they are swapped. It then compares the second and third elements swapping them if needed and continues with each pair of adjacent values in the array. Once the first pass is complete, the largest value is in the last position in the array. The algorithm starts again and repeats until no values need to be swapped. Bubble sort has an average and worse case performance complexity of `0(n^2)`. Because of this poor performance it is not very useful for large data sets. However, it is easy to understand and easy to implement so it can be useful on smaller data sets in situations where efficiency isn’t terribly important.
void bubble_sort(int* arr, int len)
{
int i, j, swp;
for(i = 0; i < len; i++)
{
for(j = 0; j < len - i - 1; j++)
{
if(arr[j] > arr[j + 1])
{
swp = arr[j];
arr[j] = arr[j + 1];
arr[j + 1] = swp;
}
}
}
}
Selection Sort
The selection sort starts at the beginning of the data and scans for (selects) the minimum value. Once the minimum value is found, it is swapped with the value in the first position. The algorithm starts again at the second position and repeats. Selection sort also has performance complexity of `0(n^2)`. However, since it performs no more than `n` swaps it performs better than bubble sort.
void selection_sort(int* arr, int len)
{
int i, j, si, swp;
for(i = 0; i < len - 1; i++)
{
si = i;
for(j = i + 1; j < len; j++)
{
if(arr[j] < arr[si])
si = j;
}
if(si != i)
{
swp = arr[i];
arr[i] = arr[si];
arr[si] = swp;
}
}
}
Insertion Sort
Insertion sort is similar to selection sort in that it selects each value and inserts it into its proper place. When an element is inserted into its position, all the following elements must be shifted by one. This algorithm is useful in instances where insertion is fast such as with linked lists or in data that is already mostly sorted. Note that this implementation I use an array using memmove() to make room for the insert. I do this for simplicity to demonstrate the algorithm. I wouldn’t recommend using this on an array in practice.
void insertion_sort(int* arr, int len)
{
int i, j, si, swp;
for(i = 0; i < len - 1; i++)
{
si = i;
for(j = i + 1; j < len; j++)
{
if(arr[j] < arr[si])
si = j;
}
if(si != i)
{
swp = arr[si];
memmove(&arr[i + 1], &arr[i], (si - i) * sizeof(int));
arr[i] = swp;
}
}
}
Shell Sort
The shell sort is a more advanced sorting algorithm that arranges the input data into a series of columns and then sorts each column using insertion sort. By sorting data in columns, the data is able to be moved over greater distances more quickly than would happen with the standard insertion sort. Here is the algorithm written in C. Following, I’ll walk through an application of the algorithm on a random data set.
void shell_sort(int* arr, int len)
{
int i, j, gap, swp;
gap = len / 2;
while(gap > 0)
{
for(i = gap; i < len; i++)
{
swp = arr[i];
j = i;
while(j >= gap && arr[j - gap] > swp)
{
arr[j] = arr[j - gap];
j -= gap;
}
arr[j] = swp;
}
gap = gap / 2;
}
}
The random data set is an array of 15 randomly generated numbers. We first calculate the gap and arrange the data into columns based on the gap. amath [33, 38, 73, 29, 6, 72, 71, 19, 53, 57, 43, 38, 5, 83, 7]
l = 15, g = l / 2 = 7
[[33, 38, 73, 29, 6, 72, 71],[19, 53, 57, 43, 38, 5, 83],[7,,,,,,]]
endamath
In this algorithm we start in the first column and save the 19 in swp. We then check each value in the column above the 19 and if it is greater, move the number down the column by one. Finally we put the number in its correct position, in this case at the beginning of the column which will produce the following:
amath
[[19, 38, 73, 29, 6, 72, 71],[33, 53, 57, 43, 38, 5, 83],[7,,,,,,]]
endamath
We repeat for each column:
amath
[[19, 38, 57, 29, 6, 5, 71],[33, 53, 73, 43, 38, 72, 83],[7,,,,,,]]
endamath
Note that `j` began with the second element in each column and wraps to the third row (then 4th, etc… depending on how many elements are in each column) which looks at the `7` and puts it in its place.
amath
[[7, 38, 57, 29, 6, 5, 71],[19, 53, 73, 43, 38, 72, 83],[33,,,,,,]]
[7, 38, 57, 29, 6, 5, 71, 19, 53, 73, 43, 38, 72, 83, 33]
endamath
Notice now that the `7` which was last is now in the first position. Values have the potential to jump great distances quite quickly. Now we recalculate the gap and start again.
amath
g = g / 2 = 3
[[7, 38, 57], [29, 6, 5], [71, 19, 53], [73, 43, 38], [72, 83, 33]] => [[7, 6, 5], [29, 38, 57], [71, 19, 53], [73, 43, 38], [72, 83, 33]] => [[7, 6, 5], [29, 19, 53], [71, 38, 57], [73, 43, 38], [72, 83, 33]] => [[7, 6, 5], [29, 19, 38], [71, 38, 53], [73, 43, 57], [72, 83, 33]] => [[7, 6, 5], [29, 19, 33], [71, 38, 38], [72, 43, 53], [73, 83, 57]]
[7, 6, 5, 29, 19, 33, 71, 38, 38, 72, 43, 53, 73, 83, 57]
endamath
Finally, when gap is `1`, shell sort is essentially an insertion sort on a mostly sorted array for which it performs well.
amath
g = g / 2 = 1
[[7], [6], [5], [29], [19], [33], [71], [38], [38], [72], [43], [53], [73], [83], [57]] => [[6], [7], [5], [29], [19], [33], [71], [38], [38], [72], [43], [53], [73], [83], [57]] => [[5], [6], [7], [29], [19], [33], [71], [38], [38], [72], [43], [53], [73], [83], [57]] => [[5], [6], [7], [19], [29], [33], [71], [38], [38], [72], [43], [53], [73], [83], [57]] => [[5], [6], [7], [19], [29], [33], [38], [71], [38], [72], [43], [53], [73], [83], [57]] => [[5], [6], [7], [19], [29], [33], [38], [38], [71], [72], [43], [53], [73], [83], [57]] => [[5], [6], [7], [19], [29], [33], [38], [38], [43], [71], [72], [53], [73], [83], [57]] => [[5], [6], [7], [19], [29], [33], [38], [38], [43], [53], [71], [72], [73], [83], [57]] => [[5], [6], [7], [19], [29], [33], [38], [38], [43], [53], [57], [71], [72], [73], [83]]
[5, 6, 7, 19, 29, 33, 38, 38, 43, 53, 57, 71, 72, 73, 83]
endamath
Merge Sort
Merge sort is an `O(nlogn)` sorting algorithm that gets its name from merging two already sorted list into a single sorted list. The algorithm begins by dividing the data in two and recursively sorting the two sub lists. When the recursive calls return, the sub lists are sorted and can then be merged into a final list.
Note, this implementation is inefficient using an inplace array and memmove. In practice, merge sort works best where insertions are inexpensive such as linked lists or copying the two sub arrays into a separate large array.
void merge_sort(int* arr, int len)
{
int *l, *r;
int i, swp, lenl, lenr;
if(len <= 1)
return;
lenl = len / 2;
lenr = len - lenl;
l = arr;
r = arr + lenl;
merge_sort(l, lenl);
merge_sort(r, lenr);
for(i = 0; i < len && lenl > 0 && lenr > 0; i++)
{
if(*l > *r)
{
swp = *r;
memmove(&arr[i + 1], &arr[i], (r - &arr[i]) * sizeof(int));
arr[i] = swp;
r++; lenr--;
l++;
}
else
{
l++; lenl--;
}
}
}
Heap Sort
Heap sort uses a heap data structure to determine the minimum or maximum value in the data. The data is first placed in a heap structure, in this example a max-heap. Once in this structure, the largest value is at the top of the heap. The heap here is represented in the array where array[0] is the root node, array[1] is the left child, array [2] is the right child, array [3] is the left child of array[1], array[4] is the right child of array[1], etc…
Once the data is heapified, the largest value (root) is swapped with the last value in the array (its final position) and the array is again heapified (excluding the value in now in its correct position). This process is repeated until the array is completely sorted.
void sift_heap(int* arr, int r, int len)
{
int c, si, swp;
while(r * 2 + 1 < len)
{
c = r * 2 + 1;
si = r;
if(arr[si] < arr[c])
si = c;
if(c < len - 1 && arr[si] < arr[c + 1])
si = c + 1;
if(si != r)
{
swp = arr[r];
arr[r] = arr[si];
arr[si] = swp;
r = si;
}
else
break;
}
}
void heapify(int* arr, int len)
{
int i, r, c, si, swp;
for(i = len / 2 - 1; i >= 0; i--)
{
sift_heap(arr, i, len);
}
}
void heap_sort(int* arr, int len)
{
int i, swp;
heapify(arr, len);
for(i = len; i > 1; i--)
{
swp = arr[0];
arr[0] = arr[i - 1];
arr[i - 1] = swp;
sift_heap(arr, 0, i - 1);
}
}
Quick Sort
Quick sort works by choosing an “pivot” element and moving all elements with a value lower than the pivot value before the pivot and all elements higher than the pivot after. Once the values have been moved, the pivot is in its final position and quick sort is recursively called on each sub list before and after the pivot.
Quick sort has an average complexity of `O(nlogn)` but can have a worst case `o(n^2)` with poorly chosen pivots (for example if the pivot chosen is always the largest or smallest value).
void quick_sort(int* arr, int len)
{
int i, j, p, swp;
if(len <= 1)
return;
p = 0;
j = 1;
for(i = 1; i < len; i++)
{
if(arr[i] < arr[p])
{
swp = arr[i];
arr[i] = arr[j];
arr[j] = swp;
j++;
}
}
swp = arr[p];
arr[p] = arr[j - 1];
arr[j - 1] = swp;
quick_sort(arr, j);
quick_sort(arr + j, len - j);
}
Bucket Sort
Bucket sort organizes the data into buckets that hold a range of values, for example 0-10, 11-20, 21-30, etc… Then each bucket is sorted using a different algorithm, in this case, insertion sort.
void bucket_sort(int* arr, int len)
{
int i, j, k, n, max, **buckets;
max = arr[0];
for(i = 1; i < len; i++)
if(arr[i] > max)
max = arr[i];
n = (int)floor(sqrt(max));
n = max / n;
buckets = (int**)malloc(sizeof(int*) * (n + 1));
memset(buckets, 0, sizeof(int*) * (n + 1));
for(i = 0; i < len; i++)
{
if(buckets[arr[i] / n] == NULL)
{
buckets[arr[i] / n] = (int*)malloc(sizeof(int) * len);
memset(buckets[arr[i] / n], -1, sizeof(int) * len);
}
for(j = 0; buckets[arr[i] / n][j] != -1; j++);
buckets[arr[i] / n][j] = arr[i];
}
k = 0;
for(i = 0; i <= n; i++)
{
if(buckets[i])
{
for(j = 0; buckets[i][j] != -1; j++)
{
arr[k++] = buckets[i][j];
}
free(buckets[i]);
}
}
free(buckets);
insertion_sort(arr, len);
}
]]>