Sunday, August 26, 2012

My BlueHat Prize entry: ROPGuard - runtime prevention of return-oriented programming attacks

In my previous (brief) post I was very excited to announce that ROPGuard, a system for runtime prevention of return-oriented programming attacks that I designed, was selected as one of the top three entries in the Microsoft's BlueHat Prize contest. I also promised to publish more details about my entry.

In the meantime, at Microsoft's party following the Black Hat USA 2012 conference, the winners of the contest were announced and my project was awarded a second prize, which I'm very happy about. In case you missed the party, you can see the recording of the awarding ceremony here. Once again, congratulations to Vasilis Pappas for winning the contest!

I planned to write more about ROPGuard sooner, but it was a really eventful month for me (more on that in a later post) and so this always got pushed back. Now, finally, I'm happy to announce that I published the entire project on Google Code. You can find all of the source and project files as well as the project documentation here.

In the remainder of this post, I'll write a brief retrospect of my project and give some thoughts on what I intended to accomplish, what problems I faced, what went right, what went wrong etc.

First of all, I'd like to say a few words about the problem I attempted to solve: Return-Oriented Programming (ROP). It is the current state of the art technique used in exploit development to bypass Data Execution Protection (DEP). Preventing ROP is very difficult - I couldn't claim it's the most difficult problem of those that would be applicable in the contest as I haven't actually attempted to solve any of the others, but it's certainly very difficult and certainly a burning issue in the exploit prevention. Despite all of my efforts, as well as the efforts of other BlueHat Prize finalists and other researcher, the problem is still open.

In general, previously proposed solutions for ROP fall in one of several groups, each with its drawbacks. Firstly, there are solutions based on the dynamic instrumentation. The drawback of these solutions is that they add a significant performance overhead. The second group are the compiler-based solutions. Although there are some really comprehensive compiler-based solutions, in general, for the compiler-level solution to be applied, we need to have the source code of each of the executable modules and the solution needs to be applied to every module in the process in order for the protection to work. And finally, there are solutions based on static binary rewriting, the drawback of these approaches being their reliance on automatic reverse engineering, which is error prone and any error will most likely affect the stability of the protected application.

Instead of following one of these approaches, what I attempted to do is use as much information as possible already existing in the running process to try to differentiate ROP attempts from normal code execution. One of the main ideas behind ROPGuard is that, in order to be able to execute arbitrary code or escape the current process, the attacker must call one from the finite set of functions from the ROP code. I called these functions "critical functions". For example, to execute arbitrary code, the commonly used ROP chains call functions such as VirtualProtect and VirtualAlloc, but others such as HeapCreate or LoadLibrary can be used as well etc. This leads to the idea to run the checks only when one of these functions gets called to check if it was called from the ROP code or as a result of normal program execution. This way, a protective system with a very low computational overhead can be created.

Now let's look at the checks that ROPGuard performs. There are many simple checks such as:
 - checking whether the value of the stack pointer is valid
 - checking if the address of the current function is on the stack (which could mean that it was entered via RETN instead of CALL)
 - checking if the instruction preceding the one at the return address of the critical function is a CALL instruction
 - some function-specific checks, such as checking if the VirtualProtect function is called to change the access rights of the stack

But besides those relatively simple checks, the other major idea was to check if there is a valid call stack leading to the execution of the current (critical) function.

When one thinks of checking the call stack, the first idea that comes to mind is using the frame pointer (EBP register) to do it. After all, that's how the call stack is usually reconstructed. However, in order for this to work the program must use EBP register as a frame pointer and this is in no way mandatory - many programs omit frame pointers in favor of using the EBP register as an additional general-purpose register. So, in conclusion, call stack check based on the frame pointers could be used to check the validity of the call stack, but only if the program is compiled in such a way that frame pointers are not omitted. ROPGuard has an option to perform these checks and they can be turned on for programs compiled in such a way for greater protection (another good thing about ROPGuard is its configurability).

But, in general, an alternative to frame pointes was needed to reconstruct the call stack. My initial idea to do that was as follows: starting at the current instruction we trace the instructions until we reach the RETN instruction. On the way, we also simulate all instructions that modify the stack pointer (such as PUSH, POP, ADD ESP,# etc.) and update the information about the size of the current stack frame. On RETN we perform the appropriate checks and continue again from the return address (see the original documentation for details). In general, there are several problems with this idea. The first one is functions that dynamically determine the frame size, but such functions are relatively scarce in most programs. The second problem is the stdcall calling convention (and possibly other calling conventions where the callee is responsible for cleaning the stack). The functions that follow these conventions will actually modify the frame size of the caller function. So when we reach the CALL instruction, in general we can't know how the stack pointer will be modified after the call. Sure, we could follow into the function and look for the RETN n instruction, but there's another problem: indirect calls. When we reach an indirect call, unless we simulate the entire instruction set, we won't know the CALL target and so we also won't know how this call will modify the stack pointer. ROPGuard takes a very simple approach around this - it will break the simulation on CALL instructions and other instructions where it can't properly resolve the stack pointer. However, if you look at the commonly used ROP chains such as those in, you'll notice that they actually don't usually use indirect calls except for calling functions that we may consider critical functions and we already got those covered (as all checks are called again on every critical function call). So ROPGuard will easily detect the currently used ROP chains using this approach. Note that another approach could be to simulate the entire instruction set instead of only the instructions that modify the stack pointer, however, this would be much more expensive in terms of system resources.

So, to conclude, ROPGuard should not be considered a complete solution for ROP, but can easily detect currently used ROP chains. Additionally, one of the best features of ROPGuard is its practicality: It has a very low computational (about 0.5% on average) and memory overhead and can be applied at runtime to any process, even if this process is already running. These, among with the relative simplicity of integrating ROPGuard into other tools, are probably the reasons that, among all of the other entries, Microsoft decided to implement techniques used in ROPGuard into their EMET tool.

I'd also like to thank Elias Bachaalany of Microsoft for not only implementing ROPGuard technology into EMET but also improving it in every way. In fact, due to various reliability improvements I recommend using EMET instead of my own prototype implementation.

I am aware that ROPGuard is not perfect and if the attacker is aware of ROPGuard and has enough skill, he/she would be able to construct special ROP chains that would, so to speak, push ROPGuard off guard. In fact, this protection has already been defeated once. Although this particular attack can be prevented by implementing the same protections for the system calls as well as for user-mode functions, there are other ways to bypass the proposed checks. However, as attack techniques improve, so do the defenses and I hope that my work, in addition to other existing defense-oriented research, will help others to come up with better ideas. Personally, despite all my experience in offensive-oriented security research, I'd be much more impressed if someone came up with a way to improve the techniques described in my work rather than a way to break them.