Ok, so security papers on Hacker News never get that much love, but take my word for it that you want to read this one.
This paper is the moral equivalent of that post from yesterday about how you could derive all of ASCII from []()+! in Javascript. Here, the authors are trying to evade runtime defenses that prevent memory corruption exploits from uploading executable code in target processes. To control execution, they analyze target binaries and synthesize a turing complete runtime out of scraps of legitimate basic blocks.
From 34 instruction sequences, they essentially derive a VM with load/store/move, add/sub/neg/and/or/xor, conditional branches, and call (read the first couple of pages, then skip ahead to page 12 for the diagram of what "call" actually looks like when you scavenge it out of the basic blocks of an unsuspecting program).
Obviously, once you have such a VM defined, you can compile arbitrary exploits to it. It's like having a scriptable debugger attached to the remote program.
You should read the paper because it's an excellent crash course in runtime exploit design and mitigation, and a good way to jog your x86 memory. But I should also note that I haven't done the paper justice: its thesis is that you can not only do all these things, but do them without the "ret" instruction, or any other non-innocuous sequence of instructions.
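If it helps to picture what "synthesizing a VM out of scraps" means, here's a toy analogy in plain C (mine, not the paper's; real gadgets are short x86 fragments reached through corrupted control data, not friendly functions). The point is that the "exploit program" is nothing but a list of addresses and operands, and something walks the list:

    #include <stdio.h>

    /* Toy analogy only: each "gadget" here stands in for a short instruction
       sequence that already exists somewhere in the target's code.  The paper's
       VM has the full complement (load/store/move, arithmetic, branches, call);
       this just shows the shape of the idea. */
    static long regs[4];     /* the synthesized VM's "registers" */
    static long mem[16];     /* scratch "memory" for the demo    */

    static void g_const(long r, long v) { regs[r] = v; }          /* move imm */
    static void g_add  (long r, long s) { regs[r] += regs[s]; }   /* add      */
    static void g_store(long r, long m) { mem[m] = regs[r]; }     /* store    */

    /* The "exploit" is pure data: a list of addresses plus operands. */
    struct step { void (*gadget)(long, long); long a, b; };

    int main(void) {
        struct step chain[] = {      /* computes mem[0] = 2 + 3 */
            { g_const, 0, 2 },
            { g_const, 1, 3 },
            { g_add,   0, 1 },
            { g_store, 0, 0 },
        };
        for (size_t i = 0; i < sizeof chain / sizeof chain[0]; i++)
            chain[i].gadget(chain[i].a, chain[i].b);
        printf("%ld\n", mem[0]);     /* 5 */
        return 0;
    }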
I don't mind reading about programming, startup politics, etc. but things like this paper and the aforementioned javascript ![:alnum:] thread are the things I love the most on HN.
I also love how much really good intelligent discussion happens here on things like PDFs that don't have a built-in commenting system.
Thomas, have you seen or heard of any real-life exploits utilizing return-oriented programming? It's a really cool idea, but it seems that once you've got memory corruption, there are easier ways to get arbitrary code execution than return-oriented programming.
IANA security expert, but stuff like this makes me believe it is never possible to secure native code. If you want a secure system, build a VM using a strict language like Haskell.
It would be fair to sum up much of security and vulnerability research over the last 15 years as "trying to answer whether it's possible to secure a runtime when an attacker has limited but unanticipated access to raw memory". The weight of the evidence says "no", but secure runtimes are in their infancy (look at Google NaCl for a recent example).
I am a security professional, but not a developer. You're half correct. Your suggestion of a VM with Haskell would result in a system that could still be made to fail. The methodologies would just work differently.
No, because I know absolutely nothing about Haskell. I do know that if there was a silver-bullet language that was 100% "secure", everyone would be using it.
This attack is x86 machine-code-native, so Haskell in a VM is not vulnerable to the exact same problems, but to assume it's completely invulnerable is myopic.
> I do know that if there was a silver-bullet language that was 100% "secure", everyone would be using it.
This is completely false. Almost nobody in the software industry cares about safety. You can see this trivially just by comparing the popularity of C and C++ with that of memory-safe languages.
In case you were hoping #2 would lower the price of ECC memory, it won't. ECC memory is the same hardware as regular memory, you just have a bit more of it: the module stores 72 bits per 64-bit word instead of 64, so 4GB of ECC memory is roughly 4.5GB of regular memory with a different sticker on it. Since most people decide how much memory to get based on the price ("I'm willing to spend $100 on memory"), demand would remain unchanged.
Yeah, I know the price wouldn't change much. The real benefit I see is that Intel would have to stop disabling the ECC support on their desktop chips, and I'd have more choices available next time I'm up for a new computer.
You've probably heard the term "buffer overflow". A buffer overflow is one example of a memory corruption flaw. In virtually all modern memory corruption exploits, the goal of the attacker is to corrupt a victim program in such a way that they can upload their own code into the victim and run it.
For about 8 years ('95-'03, ending with the "summer of worms"), the bulk of all the work in stopping these attacks was directed at the broken code that allowed the vulnerabilities (for instance, the use of strcpy() and sprintf(), which are unbounded). The payoff for this work went asymptotic.
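If you haven't written C: "unbounded" means the copy runs until it hits a terminator, with no idea how big the destination is. A deliberately broken toy:

    #include <string.h>

    void greet(const char *name) {
        char buf[16];
        strcpy(buf, name);   /* copies until it finds a '\0', however far away
                                that is; bytes past 16 land on whatever sits
                                above buf on the stack, saved return address
                                included */
    }

    int main(void) {
        greet("hi");         /* fine */
        /* greet(<a few hundred 'A's>) would smash the stack: attacker-chosen
           bytes become the saved return address */
        return 0;
    }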
Starting in the late '90s and really picking up in the '00s, research instead went into hardening runtimes to break exploits. For instance, you can't simply overwrite the activation record for a function to change the return address, because there's a random cookie there. The stack and heap aren't executable, and the text segment isn't writeable. The whole runtime is randomized, so you can't predict addresses.
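The cookie bit, very roughly, amounts to the compiler rewriting the greet() toy above into something like this (a hand-written sketch of the concept, not real compiler output; the actual implementation controls where the cookie lands and keeps the secret value out of the attacker's reach):

    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical stand-in for the random per-process value chosen at startup. */
    static long stack_guard = 0x5a17c0de;

    void greet(const char *name) {
        long cookie = stack_guard;   /* lives between the locals and the
                                        saved return address              */
        char buf[16];
        strcpy(buf, name);
        if (cookie != stack_guard)   /* a linear overflow has to trample the
                                        cookie before it can reach the return
                                        address, so the damage is detected  */
            abort();                 /* "stack smashing detected"           */
    }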
What papers like this (and, as Dave Aitel pointed out on Twitter, papers like John McDonald's from the '90s -- that guy is spooky) aim to show is that even if attackers can't literally upload x86 instructions into their targets, they can still take control of the process.
That's because even simple real-world programs are very complex at the runtime level, and linked to very complex libraries. The CPU can be coerced into executing any of the instructions in those libraries --- or, indeed, any sequence of bytes comprising any fragment of those instructions.
So, if you do some basic binary analysis on your target, you can craft a sequence of addresses in the target which, if executed in the right order, will have the same effect as any program you could write (it's "turing complete", but a more helpful way to think about it is that they're synthesizing a simple VM running out of fragments of code).
As far as we know, it is very hard to stop these kinds of attacks. An attacker that can write 1-4 bytes to an arbitrary (or even restricted) address in memory can trick the CPU into looking at attacker-controlled memory (say, the 1024 character login name they just gave you), and once the CPU is looking at that memory, it starts running sequences of its own code against itself.
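Concretely, "trick the CPU into looking at attacker-controlled memory" usually means overwriting some pointer the program is about to trust. A contrived sketch (names and layout invented for illustration):

    #include <stdio.h>
    #include <string.h>

    struct session {
        char name[16];
        void (*on_close)(void);      /* the program will call this later */
    };

    static void normal_close(void) { puts("bye"); }

    int main(void) {
        struct session s = { "", normal_close };
        const char *input = "short and harmless";   /* pretend: network data */

        strcpy(s.name, input);       /* a longer input spills into s.on_close,
                                        so the attacker chooses the address
                                        that gets called next -- e.g. the first
                                        fragment of borrowed code in a chain */
        s.on_close();
        return 0;
    }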
This is systems programming, compiler theory, computer architecture, and computer security. If you're into serious practical CS, you're crazy if you don't consider security as a career. There are very few other jobs that will reliably expose you to so much CS (and even EE) at this level.
Basically, if you have any potential memory attacks in your software (buffer overflows, bad printf format strings, stack smashes, etc.), an attacker can make your software do whatever he wants. He can do this without using a RET instruction (which is the more "standard" way of doing this, so people have been looking at detecting it as an active defense).