ASU CSE 466

Homework 4 is over! It was due at 12:00pm MST on Wednesday 9/19/18. At exactly 12:00pm, the homework server became inaccessible.

Homework 4 is an introduction to exploitation of memory corruption vulnerabilities in binary code. Because of the lack of memory safety in low-level languages, such as C, memory corruption vulnerabilities manifest quite frequently, to brutal effect. You will explore a number of different exploitation scenarios, using different types of flaws to get flags.

In terms of submission and so forth, Homework 4 is based on the same high-level concept as Homeworks 1, 2, and 3. There is a /flag file, and you get to choose one binary on which the SUID flag will be set. Like homework 2, the binaries that you are allowed to choose are all in the /pwn directory.

This gives you exactly 33 targets. There are 11 difficulty levels, denoted by operating system names in increasing lexical ordering – “android” is the set of easiest programs, and “windows” is the set of hardest. Each program takes user input on stdin and contains at least one (intentional) vulnerability. If you exploit it, you can get it to read the flag and print it out to you.

The up-shot is this: to read the /flag for a binary, you will have to understand how to exploit it.

For this assignment, each flag will earn three points. This means that you can get on the curve by making it through 24 challenges. After 70 points (23.33333333 flags), you will be graded on a curve. Read syllabus.html for the full details of the grading system.

Collaboration Policy

Every student has a unique set of challenges, generated specifically for them. Thus, collaboration is tricky but also more controllable. The policy for this homework is this: you may help a fellow student on exactly one of their challenges.

Accessing Homework 4

The password has been emailed to the class mailing list, and you can check it in the archives.

You can access Homework 4 using netcat, or a similar program:

nc cse466.pwn.college 23

For scriptable interaction, the same script as for the previous homeworks will work for Homework 4.

Again, you will find that netcat gives you a terminal that is not conducive to using terminal programs such as nano, vi, and so on. This is a solvable problem! As an alternative to netcat, you can use socat, the swiss army knife of socket operations:

socat tcp:cse466.pwn.college:23 FILE:`tty`,raw,echo=0,icrnl=1

This will give you a “raw” terminal, allowing you to use key combinations like Ctrl-C, Ctrl-Z, Ctrl-X, and so forth inside the session. This also means that, to terminate socat (since you cannot do it with Ctrl-C), you will need to disconnect cleanly (exit the session) or kill it from another shell.

The caveat here is that the wrapper (the stuff before your container actually launches) is not written with this in mind. Because of this, your input to it will not be echoed to your terminal, and its output will have the newlines screwed up. There are good technical reasons for this, and if you are very interested, read the socat and stty man pages (the latter has an explanation of what echo=0 and icrnl=1 mean). It is quite enlightening, but off-topic for this assignment. At any rate, even though your input is not echoed back to you, it will be received by the program, and once you actually get put into your container, things will work normally and you will be able to use whatever applications you want.

Exfiltrating the challenges

Doing your analysis on the target environment is not a good idea. You don’t have access to a lot of critical functionality, such as ptrace (used by gdb and ltrace). We strongly recommend exfiltrating your challenges to your Linux machine and tackling them there.

You can exfiltrate them using scp or nc to connect to a remote server, or by doing the following:

tar zcf /tmp/pwn.tar.gz /pwn; wget --method PUT --body-file=/tmp/pwn.tar.gz https://transfer.sh/pwn.tar.gz -O - -nv

That will spit out a link (in somewhat of a poor format; just ignore the date part) that you can download the tarball from.

There are other ways to exfiltrate data, and, quite frankly, you should be able to do them. A few other ideas:

Here is a bash script that will do the latter if run from your Linux box (not from within the homework container). How it works is an exercise that I will leave to you.

#!/bin/bash

read -r -p "Homework password: " PASSWORD
read -r -p "Hacker handle: " HANDLE
read -r -p "ASURITE: " ASURITE
read -r -p "Path to any challenge: " CHALLENGE

(
	echo "$PASSWORD"; sleep 0.1
	echo "$HANDLE"; sleep 0.1;
	echo "$ASURITE"; sleep 0.1;
	echo 2; sleep 0.1;
	echo "$CHALLENGE"; sleep 0.1;
	echo "tar cz /pwn | base64"; sleep 0.1
	echo "exit"; sleep 2
	echo "noflag"; sleep 0.1
	echo "4"; sleep 0.1
) | nc cse466.pwn.college 23 | grep -A1000000000 "tar:" | grep -B1000000000 "logout" | tail -n+2 | head -n-2 | tr -d '\r' | base64 -d | tar xvz

Good luck!

What tools are useful?

The tools useful in this assignment are similar to the tools useful in homework 3, with a larger emphasis on manual reverse-engineering to understand the flaw. You will need to spend a lot of time in gdb and objdump.

Is there a magic bullet?

Not so much, this time. This homework requires grit and determination!

CONCEPT: null-terminated strings

One concept we didn’t have time to fully explore in class is non-terminated strings. Consider the following code:

#include <stdio.h>
#include <unistd.h>

int main()
{
	char name[16] = {0};

	printf("Name: ");
	read(0, name, 128);
	printf("Hello %s!\n", name);

	char color[16] = {0};
	printf("Favorite color: ");
	read(0, color, 128);
}

Obviously, this code is insecure, because we read in way more bytes into name and color than they can hold. However, stack canaries offer some protection against this: because we have to clobber the canary on the way to overwriting the return address, we cannot actually exploit without knowing the canary (so that we overwrite it with the exact value that is currently there).

Luckily, we can leak the canary! In C, strings are null-terminated. This means that, rather than explicitly storing a size (such as what is done by Python strings, C++ strings, and so on), they are simply assumed to keep going until the first NULL byte (i.e., 0x00). If the user input in this example is shorter than 16 bytes, there is no problem, because char name[16] = {0}; initializes the array to all NULL bytes, and if the read grabs fewer than 16 characters, the remaining bytes will remain NULL, and the string will be NULL terminated.
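To make the termination rule concrete, here is a minimal sketch that mimics the zero-initialized buffer from the example without ever writing out of bounds. The names terminated_after_read and NAME_SIZE are made up for illustration, and memcpy stands in for read; the function only reports whether a NULL byte is still left inside the array after the copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_SIZE 16  /* same size as name[] in the example */

/* Copies n input bytes into a zeroed NAME_SIZE-byte array and returns 1 if
 * a NULL terminator remains inside the array, 0 otherwise. */
int terminated_after_read(const char *input, size_t n)
{
	char name[NAME_SIZE] = {0};

	assert(n <= NAME_SIZE);  /* this sketch never overflows the array */
	memcpy(name, input, n);  /* stand-in for read(0, name, ...) */

	return memchr(name, '\0', NAME_SIZE) != NULL;
}
```

With a 4-byte or 15-byte input (containing no NULL bytes of its own) the function returns 1; with a 16-byte input it returns 0, because every byte of the array has been overwritten and nothing inside it stops a string function from reading on.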

What if the user inputs 16 bytes or more? In this case, whatever is after name on the stack (such as the canary value) will be leaked, because printf will just keep printing until it hits a NULL byte to terminate what it thinks is name. We can see an example of this below. In this example, we omit the frame pointer, because that tends to be the default nowadays.

$ gcc -fomit-frame-pointer no-null.c -o no-null
$ echo -n "AAAABBBBCCCCDDDD111122223333444455556666X" | stdbuf -o0 ./no-null | hd
00000000  4e 61 6d 65 3a 20 48 65  6c 6c 6f 20 41 41 41 41  |Name: Hello AAAA|
00000010  42 42 42 42 43 43 43 43  44 44 44 44 31 31 31 31  |BBBBCCCCDDDD1111|
00000020  32 32 32 32 33 33 33 33  34 34 34 34 35 35 35 35  |2222333344445555|
00000030  36 36 36 36 58 74 e5 73  cb 81 15 b5 21 0a 46 61  |6666Xt.s....!.Fa|
00000040  76 6f 72 69 74 65 20 63  6f 6c 6f 72 3a 20        |vorite color: |

A few things:

  1. We use echo -n so that echo does not implicitly print a newline. It is always better to explicitly understand exactly what input you are providing to a program while trying to exploit it.
  2. We use stdbuf to disable output buffering. This lets us use hd without waiting for the program to explicitly flush its output buffer.
  3. We use hd to render the output as a hex dump, rather than have the program print raw bytes straight to our terminal.
  4. We might expect to only write 32 bytes (the size of name plus the size of color) before hitting the canary, but stack padding (the stack is always aligned to 0x10) necessitates a further write. It takes 40 bytes to reach the canary.

Even with this, why do we write that extra “X” byte? This is necessary because the least significant byte of the canary (which is the leftmost byte in memory) is a NULL byte, specifically to thwart these sorts of leaks via non-terminated string output. In this case, we have a powerful enough vulnerability to overwrite just that NULL byte, which will lead to the rest of the canary being printed. In the above case, the canary is 0xb51581cb73e57400 (in little endian)! Now, for the color buffer overflow, we can write the canary back into its place and fully control the return address!
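The step from the leaked bytes to the canary value can be sketched as follows. This is an illustrative helper, not part of the homework: the name canary_from_leak is hypothetical, and it assumes that all seven upper bytes were leaked (if the canary happens to contain another NULL byte, printf stops early and fewer bytes come out).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rebuilds a 64-bit little-endian canary from the 7 bytes that follow the
 * overwritten low byte in the leak. leaked[0] is the byte at the lowest
 * address, i.e. the one printed right after our "X". */
uint64_t canary_from_leak(const unsigned char *leaked, size_t n)
{
	uint64_t canary = 0;  /* byte 0 of the canary is the known 0x00 */

	assert(n == 7);       /* assumption: the whole canary was leaked */
	for (size_t i = 0; i < n; i++)
		canary |= (uint64_t)leaked[i] << (8 * (i + 1));

	return canary;
}
```

Feeding it the seven bytes that follow the “X” in the hex dump above (74 e5 73 cb 81 15 b5) gives back 0xb51581cb73e57400.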

CONCEPT: brute-forcing static memory

Sometimes, there is static memory that you need to leak, and you don’t have a direct leak. A specific application of this is forking programs. Some classes of network applications fork (i.e., copy the process) on every connection. Interestingly, canaries are randomized only when the process starts, but not when it forks. Because forked processes are independent of each other, you can experiment with vulnerabilities in the child process (and overwrite the canary with incorrect values, leading to process termination) without adversely affecting the parent.

Aside from network services, this also happens with Android applications. On Android, every process is forked off of a common process called the Zygote. This weird setup causes all the canaries to be the same, so if you can leak one, you will know them all.

Homework 4 has some customized challenges that use part of the flag for the canary, meaning that the canary is static per binary, allowing you to treat it like one of these cases and brute-force the canary.

So how do you do it? If you overwrite the whole canary, you have a 1 out of 72057594037927936 chance of randomly guessing the right value (1/2^56, rather than 1/2^64, because the LSB is always NULL). But, you can do it byte by byte! Consider the leftmost byte (which we know is NULL, or 0x00): if you overwrite it with an incorrect value, the canary check will fail. If you overwrite it with 0, it will succeed. Since you know the leftmost/LSB byte, you can begin to attack the next byte. You can guess a value, see if the process aborts with a canary fail, and if it does, try another value. Since you are brute-forcing one byte, this should take at most 256 tries. Then you move on to the next one. The entire canary can be brute-forced byte-by-byte in 256*7 (1792) tries.

Of course, this only works with static canaries (or other static data that you need to leak without actual output).
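The byte-by-byte search can be sketched as a small simulation. Everything here is hypothetical: survives() is a stand-in for "send a payload that overwrites the first n canary bytes and see whether the process aborts", and the secret array merely reuses the canary from the earlier example so that there is something to recover. A real attack would replace survives() with actual interaction with the target.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CANARY_SIZE 8

/* Stand-in for the target's static canary, lowest address first. */
static const unsigned char secret[CANARY_SIZE] =
	{0x00, 0x74, 0xe5, 0x73, 0xcb, 0x81, 0x15, 0xb5};

/* Hypothetical oracle: 1 if overwriting the first n canary bytes with
 * guess[0..n-1] leaves the canary check intact, 0 if it would abort. */
static int survives(const unsigned char *guess, size_t n)
{
	return memcmp(guess, secret, n) == 0;
}

/* Recovers the canary into out[] and returns how many guesses it took. */
unsigned brute_force_canary(unsigned char out[CANARY_SIZE])
{
	unsigned attempts = 0;

	out[0] = 0x00;  /* the low byte is known to be NULL */
	for (size_t i = 1; i < CANARY_SIZE; i++) {
		int found = 0;

		for (unsigned v = 0; v < 256; v++) {
			out[i] = (unsigned char)v;
			attempts++;
			if (survives(out, i + 1)) {
				found = 1;
				break;  /* keep this byte, move on to the next */
			}
		}
		assert(found);  /* one of the 256 values must be right */
	}
	return attempts;
}
```

Because each of the seven unknown bytes is settled independently, the attempt count returned by brute_force_canary can never exceed 256*7 = 1792, compared with the 2^56 possibilities of guessing the whole value at once.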

Other resources

There are many resources related to reverse engineering around the internet. A good place to start is a series of walkthroughs of several hacking challenges by ASU’s own Adam Doupe on his YouTube channel.