Return-Oriented Programming without Returns (on the x86)

tptacek · on Feb 27, 2010

Ok, so security papers on Hacker News never get that much love, but take my word for it that you want to read this one.

This paper is the moral equivalent of that post from yesterday about how you could derive all of ASCII from []()+! in Javascript. Here, the authors are trying to evade runtime defenses that prevent memory corruption exploits from uploading executable code in target processes. To control execution, they analyze target binaries and synthesize a turing complete runtime out of scraps of legitimate basic blocks.

From 34 instruction sequences, they essentially derive a VM with load/store/move, add/sub/neg/and/or/xor, conditional branches, and call (read the first couple pages and skip ahead to page 12 for the diagram of what "call" actually looks like when you scavenge it out of the bblocks of an unsuspecting program.

Obviously, once you have such a VM defined, you can compile arbitrary exploits to it. It's like having a scriptable debugger attached to the remote program.

You should read the paper because it's an excellent crash course in runtime exploit design and mitigation, and a good way to jog your x86 memory. But I should also note that I haven't done justice to the paper: it's thesis is that you can not only do all these things, but do them without the "ret" instruction, or any other non-innocuous sequence of instructions.

ax0n · on Feb 27, 2010

I don't mind reading about programming, startup politics, etc. but things like this paper and the aforementioned javascript ![:alnum:] thread are the things I love the most on HN.

I also love how much really good intelligent discussion happens here on things like PDFs that don't have a built-in commenting system.

So, thanks!

bigmac · on Feb 28, 2010

Thomas, have you seen or heard of any real life exploits utilizing return oriented programming? Its a really really cool idea, but it seems that once you've got memory corruption, there are easier ways to get arbitrary code execution than return oriented programming.

__bjoernd · on Feb 28, 2010

They definitely showed that this approach can be used in a real-life scenario last year, when they broke some voting machines using ROP.

http://www.scienceblog.com/cms/computer-scientists-take-over...

swolchok · on Feb 28, 2010

It's been done on sensor motes too. http://planete.inrialpes.fr/~francill/Papers/Code_Injection_...

It's not surprising that it's practical; it's basically just a souped up ret2libc attack.

arohner · on Feb 27, 2010

IANA security expert, but stuff like this makes me believe it is never possible to secure native code. If you want a secure system, build a VM using a strict language like Haskell.

Is that a fair statement?

tptacek · on Feb 28, 2010

It would be fair to sum up much of security and vulnerability research over the last 15 years as "trying to answer whether it's possible to secure a runtime when an attacker has limited but unanticipated access to raw memory". The weight of the evidence says "no", but secure runtimes are in their infancy (look at Google NaCL for a recent example).

ax0n · on Feb 27, 2010

I am a security professional, but not a developer. You're half correct. Your suggestion of a VM with Haskell would result in a system that could still be made to fail. The methodologies would just work differently.

ramchip · on Feb 28, 2010

Could you back up that claim? lambdabot on #haskell runs arbitrary Haskell code very safely.

jrockway · on Feb 28, 2010

I think he means fail as in allowing you to write a function like:

  deleteAllFiles user = if user == Root
                           then throwError "access denied"
                           else unsafeDeleteAllFiles

Even if memory access is safe, you can still type in the wrong algorithm.

ax0n · on Feb 28, 2010

Most certainly. But further, it's possible to write unintentionally exploitable code in any language.

ax0n · on Feb 28, 2010

No, because I know absolutely nothing about Haskell. I do know that if there was a silver-bullet language that was 100% "secure", everyone would be using it.

This attack is x86 machine-code-native, so Haskell in a VM is not vulnerable to the exact same problems, but to assume it's completely invulnerable is myopic.

jpr · on Feb 28, 2010

> I do know that if there was a silver-bullet language that was 100% "secure", everyone would be using it.

This is completely false. Almost nobody in the software industry cares about safety. This can be proven trivially just by comparing how much more popular C and C++ are compared to memory-safe languages.

tedunangst · on Feb 28, 2010

If you want a secure system, you'll have to answer how is the secure Haskell VM different than the "secure" Java VM, or Python VM, or Flash VM?

jules · on Feb 28, 2010

No, you only need memory safety. Other than that you can make your language as crazy as you want.

yacin · on Feb 28, 2010

Memory safety _can_ be compromised in a cool, albeit unlikely, scenario: http://www.cs.princeton.edu/~sudhakar/papers/memerr.pdf

But for all practical purposes, you're right :)

tbrownaw · on Feb 28, 2010

1) wow, that is seriously awesome

2) hey, maybe we can get the music and film industry associations to mandate ECC as part of the license requirements for their next big DRM scheme

jrockway · on Feb 28, 2010

In case you were hoping #2 would lower the price of ECC memory, it won't. ECC memory is the same hardware as regular memory, you just have more of it; 4GB of ECC memory is just 7GB of regular memory with a different sticker on it. Since most people decide how much memory to get based on the price ("I'm willing to spend $100 on memory"), demand would remain unchanged.

tbrownaw · on March 3, 2010

Yeah, I know the price wouldn't change much. The real benefit I see is that Intel would have to stop disabling the ECC support on their desktop chips, and I'd have more choices available next time I'm up for a new computer.

jules · on Feb 28, 2010

Very cool! Purity doesn't help defend against that either, unfortunately.

dx · on Feb 28, 2010

Can somebody describe what this document is all about in simple words, for the average "high level" developer? Thanks in advance!

tptacek · on Feb 28, 2010

You've probably heard the term "buffer overflow". A buffer overflow is one example of a memory corruption flaw. In virtually all modern memory corruption exploits, the goal of the attacker is to corrupt a victim program in such a way that they can upload their own code into the victim and run it.

For about 8 years (95-'03, the "summer of worms"), the bulk of all the work in stopping these attacks was directed at the broken code that allowed the vulnerabilities (for instance, the use of strcpy() and sprintf(), which are unbounded). The payoff for this work went asymptotic.

Starting in the late '90s and really picking up in the '00s, research instead went into hardening runtimes to break exploits. For instance, you can't simply overwrite the activation record for a function to change the return address, because there's a random cookie there. The stack and heap aren't executable, and the text segment isn't writeable. The whole runtime is randomized, so you can't predict addresses.

What papers like this (and, as Dave Aitel pointed out on Twitter, papers like John McDonalds from the 90s -- that guy is spooky) aim to show is that even if attackers can't literally upload x86 instructions into their targets, they can still take control of the process.

That's because even simple real-world programs are very complex at the runtime level, and linked to very complex libraries. The CPU can be coerced into executing any of the instructions in those libraries --- or, indeed, any sequence of bytes comprising any fragment of those instructions.

So, if you do some basic binary analysis on your target, you can craft a sequence of addresses in the target which, if executed in the right order, will have the same effect as any program you could write (it's "turing complete", but a more helpful way to think about it is that they're synthesizing a simple VM running out of fragments of code).

As far as we know, it is very hard to stop these kinds of attacks. An attacker that can write 1-4 bytes to an arbitrary (or even restricted) address in memory can trick the CPU into looking at attacker-controlled memory (say, the 1024 character login name they just gave you), and once the CPU is looking at that memory, it starts running sequences of its own code against itself.

This is systems programming, compiler theory, computer architecture, and computer security. If you're into serious practical CS, you're crazy if you don't consider security as a career. There are very few other jobs that will reliably expose you to so much CS (and even EE) at this level.

amalcon · on Feb 28, 2010

Basically, if you have any potential memory attacks in your software (buffer overflows, bad printf format strings, stack smashes, etc), an attacker can make your software do whatever he wants. He can do this without using a RET instruction (which is the more "standard" way of doing this, so people have been investigating looking for it as an active defense).