<!--
Source: http://blog.skylined.nl/20161124001.html
Synopsis
A specially crafted web-page can cause a type confusion in HTML layout in Microsoft Internet Explorer 11. An attacker might be able to exploit this issue to execute arbitrary code.
Known affected software and attack vectors
Microsoft Internet Explorer 11
An attacker would need to get a target user to open a specially crafted web-page. Disabling JavaScript should prevent an attacker from triggering the vulnerable code path.
Repro.html:
-->
<html>
<head>
<meta http-equiv="X-UA-Compatible" content="IE=Edge" />
<script>
window.onload = function () {
document.getElementsByTagName("iframe")[0].src = "https://www.exploit-db.com/exploits/40843/repro-iframe.html";
}
</script>
</head>
<body>
<iframe></iframe>
</body>
</html>
<!--
Repro-iframe.html:
<svg><path marker-start="url(#)"><title><q><button>
Description
Internally, MSIE uses various linked lists of CTreePos objects to represent the DOM tree. For each HTML/SVG element a CTreeNode object is created, which embeds two CTreePos instances: one that contains information about the first child of the element and one that indicates the next sibling or parent of the element. For a text node, an object containing only one CTreePos is created, as such nodes never have any children. CTreePos instances have various flags set, including a flag that indicates whether they are the first (fTPBegin) or second (fTPEnd) CTreePos instance for an element, or the only instance for a text node (fTPText).
The CTreePos::Branch method of a CTreePos instance embedded in a CTreeNode can be used to calculate a pointer to that CTreeNode. It determines whether the CTreePos instance is the first or second in the CTreeNode by looking at the fTPBegin flag, and subtracts the offset of that CTreePos within a CTreeNode object from its own address to calculate the address of the latter. This method assumes that the CTreePos instance is part of a CTreeNode and not a TextNode; it yields invalid results when called on the latter. In a TextNode, the CTreePos does not have the fTPBegin flag set, so the code assumes it is the second CTreePos instance in a CTreeNode object and subtracts 0x24 from its address to calculate the address of the CTreeNode. Since the CTreePos instance is the first member of a TextNode, the returned address is 0x24 bytes before the start of the TextNode, pointing to memory that is not part of the object.
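The offset arithmetic described above can be illustrated with a small stand-alone C++ model. This is only a sketch under assumptions: the layouts copy the best-guess offsets (0x0C and 0x24) from the reversed pseudo-code further down, every field is a plain 32-bit value so the offsets hold on any platform, and the names ModelTreePos, ModelTreeNode, ModelTextNode, BranchOffset and BytesBeforeObject are hypothetical and do not exist in MSIE.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Flag values taken from the reversed enum; everything else is a stand-in.
enum : uint32_t { fTPBegin = 0x01, fTPEnd = 0x02, fTPText = 0x04 };

struct ModelTreePos {              // 0x18 bytes, like the guessed CTreePos
  uint32_t fFlags00;
  uint32_t uCharsCount04;
  uint32_t poFirstChild;           // 32-bit "pointers" keep the layout fixed
  uint32_t poNextSiblingOrParent;
  uint32_t poThreadLeft10;
  uint32_t poThreadRight14;
};

struct ModelTreeNode {
  uint32_t poElement00;
  uint32_t poParent04;
  uint32_t dwUnknown08;
  ModelTreePos oTreePosBegin0C;    // offset 0x0C
  ModelTreePos oTreePosEnd24;      // offset 0x24
};

struct ModelTextNode {
  ModelTreePos oTreePos00;         // the only CTreePos, at offset 0x00
  uint8_t abUnknown18[0x14];
};

static_assert(offsetof(ModelTreeNode, oTreePosBegin0C) == 0x0C, "layout");
static_assert(offsetof(ModelTreeNode, oTreePosEnd24) == 0x24, "layout");

// Mirrors the offset selection described for CTreePos::Branch: only fTPBegin
// is consulted, so every other position is treated as an end position.
std::size_t BranchOffset(uint32_t fFlags) {
  return (fFlags & fTPBegin) ? offsetof(ModelTreeNode, oTreePosBegin0C)
                             : offsetof(ModelTreeNode, oTreePosEnd24);
}

// Number of bytes before the start of the containing object at which the
// computed "CTreeNode" address lands, given where the position really lives.
std::ptrdiff_t BytesBeforeObject(uint32_t fFlags, std::size_t uPosOffset) {
  return static_cast<std::ptrdiff_t>(BranchOffset(fFlags)) -
         static_cast<std::ptrdiff_t>(uPosOffset);
}
```

In this model, BytesBeforeObject(fTPBegin, 0x0C) and BytesBeforeObject(fTPEnd, 0x24) both evaluate to 0, meaning the computed address is the start of the node. BytesBeforeObject(fTPText, 0) evaluates to 0x24, which matches the out-of-bounds pointer described above.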
Note that this behavior is very similar to another issue I found around the same time, in that that issue also caused the code to access memory 0x24 bytes before the start of a memory region containing an object. Looking back, I believe that both issues may have had the same root cause and were fixed at the same time.
The CGeneratedContent::HasGeneratedSVGMarker method walks the DOM using one of the CTreePos linked lists. It looks for any descendant node of an element that has a CTreePos instance with a specific flag set. If found, the CTreePos::Branch method is called to find the related CTreeNode, without checking if the CTreePos is indeed part of a CTreeNode. If a certain flag is set on this CTreeNode, it returns true. Otherwise it continues scanning. If nothing is found, it returns false.
The repro creates a situation where the CGeneratedContent::HasGeneratedSVGMarker method is called on an SVG path element which has a TextNode instance as a descendant with the right flags set to cause it to call CTreePos::Branch on this TextNode. This leads to type confusion/a bad cast where a pointer that points before a TextNode is used as a pointer to a CTreeNode.
Reversed code
While reversing the relevant parts, I created the following pseudo-code to illustrate the issue:
enum eTreePosFlags {
  fTPBegin          = 0x01,  // if set, this is a markup node
  fTPEnd            = 0x02,  // if set, this is a markup node
  fTPText           = 0x04,  // if set, this is a markup node
  fTPPointer        = 0x08,  // if set, this is not a markup node
  fTPTypeMask       = 0x0f,
  fTPLeftChild      = 0x10,
  fTPLastChild      = 0x20,  // poNextSiblingOrParent => fTPLastChild ? parent : sibling
  fTPData2Pos       = 0x40,  // valid if fTPPointer is set
  fTPDataPos        = 0x80,
  fTPUnknownFlag100 = 0x100, // if set, this is not a markup node
};
struct CTreePos {
  /*offs size*/ // THE BELOW ARE BEST GUESSES BASED ON INADEQUATE INFORMATION!!
  /*0000 0004*/ eTreePosFlags fFlags00;
  /*0004 0004*/ UINT uCharsCount04; // Seems to be counting some chars - not sure what exactly
  /*0008 0004*/ CTreePos* poFirstChild; // can be NULL if no children exist.
  /*000C 0004*/ CTreePos* poNextSiblingOrParent; // fFlags00 & fTPLastChild ? parent end tag : sibling start tag
  /*0010 0004*/ CTreePos* poThreadLeft10; // fFlags00 & fTPBegin ? previous sibling or parent : last child or start tag
  /*0014 0004*/ CTreePos* poThreadRight14; // fFlags00 & fTPBegin ? first child or end tag :
  /*0018 0004*/ // flags(0x10 = something with CDATA
  /*0028 0004*/
};
struct CTreeNode {
  /*offs size*/ // THE BELOW ARE BEST GUESSES BASED ON INADEQUATE INFORMATION!!
  /*0000 0004*/ CElement* poElement00;
  /*0004 0004*/ CTreeNode* poParent04;
  /*0008 0004*/ DWORD dwUnknown08; // flags?
  /*000C 0018*/ CTreePos oTreePosBegin0C; // represents the position in the document immediately before the start tag
  /*0024 0018*/ CTreePos oTreePosEnd24; // represents the position in the document immediately after the end tag
  /*003C ????*/ // Unknown
};
struct TextNode { // I did not figure out what this is called in MSIE
  /*0000 0018*/ CTreePos oTreePosEnd00; // represents the position in the document immediately after the node.
  /*0018 0014*/ // Unknown
};
CTreeNode* CTreePos::Branch() {
  // Given a pointer to a CTreePos instance in a CTreeNode instance, calculate a pointer to the CTreeNode instance.
  // The CTreePos instance must be either the oTreePosBegin0C ((oTreePosBegin0C.fFlags00 & fTPBegin) != 0) or the
  // oTreePosEnd24 ((oTreePosEnd24.fFlags00 & fTPEnd) != 0).
  BOOL bIsTreePosBegin0C = (this->fFlags00 & fTPBegin) != 0;
  size_t uOffset = bIsTreePosBegin0C ? offsetof(CTreeNode, oTreePosBegin0C) : offsetof(CTreeNode, oTreePosEnd24);
  return (CTreeNode*)((BYTE*)this - uOffset);
}
BOOL CGeneratedContent::HasGeneratedSVGMarker() {
  CTreePos* poEndTreePos = &(this->oTreePosEnd24);
  for (
    CTreePos* poCurrentTreePos = this->oTreePosBegin0C.poThreadRight14;
    poCurrentTreePos != poEndTreePos;
    poCurrentTreePos = poCurrentTreePos->poThreadRight14
  ) {
    if (poCurrentTreePos->fFlags00 & fTPUnknownFlag100) {
      // Calling Branch is only valid in the context of CTreePos embedded in a CTreeNode, so the code should check for
      // the presence of fTPBegin or fTPEnd in fFlags00 before doing so. This line of code may fix the issue:
      // if ((poCurrentTreePos->fFlags00 & (fTPBegin | fTPEnd)) == 0) continue;
      CTreeNode* poTreeNode = poCurrentTreePos->Branch();
      if (poTreeNode && poTreeNode->dw64 == 20) {
        return 1;
      }
    }
  }
  return 0;
}
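The effect of the flag check suggested in the comment above can be tried out with a small stand-alone C++ model. This is a sketch under assumptions, not MSIE code: the walk over poThreadRight14 is replaced by a plain list of flag values, and the helpers IsEmbeddedInElementNode and CountInvalidBranchCalls are hypothetical names introduced only for this illustration. Note the parentheses around the mask test: in C and C++ the == operator binds more tightly than &.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Flag values taken from the reversed enum.
enum : uint32_t {
  fTPBegin = 0x01,
  fTPEnd = 0x02,
  fTPText = 0x04,
  fTPUnknownFlag100 = 0x100
};

// True when a position with these flags is embedded in an element node,
// which is the only case in which Branch can compute a valid node address.
bool IsEmbeddedInElementNode(uint32_t fFlags) {
  return (fFlags & (fTPBegin | fTPEnd)) != 0;
}

// Walks a list of position flags and counts how often Branch would be called
// on a position that is not embedded in an element node. With bGuarded set,
// the check proposed in the pseudo-code is applied before the call.
int CountInvalidBranchCalls(const std::vector<uint32_t>& aFlags, bool bGuarded) {
  int nInvalid = 0;
  for (uint32_t fFlags : aFlags) {
    if ((fFlags & fTPUnknownFlag100) == 0) continue;  // not a candidate
    if (bGuarded && !IsEmbeddedInElementNode(fFlags)) continue;  // proposed check
    // This is the point at which the original code calls Branch.
    if (!IsEmbeddedInElementNode(fFlags)) nInvalid += 1;
  }
  return nInvalid;
}
```

With the flag list { fTPBegin, fTPText | fTPUnknownFlag100, fTPEnd | fTPUnknownFlag100, fTPEnd }, the unguarded walk reports one invalid call (the text position) and the guarded walk reports none.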
DOM Tree
If you replace the <q> tag with an <a> tag in the repro, or insert a <script> tag before the <svg> tag, the repro does not trigger an access violation. At that point it is possible to use document.documentElement.outerHTML as well as recursively walk document.documentElement.childNodes to get an idea of what the DOM tree looks like around the time of the crash.
document.documentElement.outerHTML:
<html>
  <head>
  </head>
  <body>
    <svg xmlns="http://www.w3.org/2000/svg">
      <path marker-start="url("#")">
        <title>
          <q>
            <button>        // no closing tag.
            <script>        // script is a sibling of button
              #text         // snipped
            </script>
          </q>
        </title>            // Things get really weird here:
        </title>
      </path>               // all svg close tags are doubled!?
      </path>
    </svg>                  // Not sure what this means.
    </svg>
  </body>
</html>
Walking document.documentElement.childNodes:
<html>
  <head>
  <body>
    <svg>                   // I did not look at attributes
      <path>                // ^^^ same here
        <title>
          <q>
            <button>
              <script>      // script is a child of button
                #text       // snipped
Exploit
I did not find any code path that could lead to exploitation. However, I did not step through the code thoroughly to find out if and how I might control execution flow further up the stack. Also, it appears trivial to have MSIE survive the initial crash by massaging the heap. Other methods might be affected by a similar issue, and further DOM manipulation might trigger a more interesting code path.
Time-line
July 2014: This vulnerability was found through fuzzing.
September 2014: This vulnerability was submitted to ZDI.
September 2014: This vulnerability appears to have been fixed.
October 2014: This vulnerability was rejected by ZDI.
November 2016: Details of this issue are released.
-->