Source: https://code.google.com/p/google-security-research/issues/detail?id=224&can=1&q=label%3AProduct-Flash%20modified-after%3A2015%2F8%2F17&sort=id
There’s an error in the PCRE engine version used in Flash that allows the execution of arbitrary PCRE bytecode, with potential for memory corruption and RCE.
This issue is a duplicate of http://bugs.exim.org/show_bug.cgi?id=1546 originally reported to PCRE upstream by mikispag; I rediscovered the issue fuzzing Flash so have filed this bug report to track disclosure deadline for Adobe.
The issue occurs in the handling of zero-length assertions; ie assertions where the object of the assertion is prepended with the OP_BRAZERO operator.
Simplest testcase that will crash in an ASAN build is the following:
(?(?<a>)?)
This is pretty much a nonsense expression, and I'm not sure why it compiles successfully; but it corresponds to the statement that 'assert that named group 'a' optionally matches'; which is tautologically true regardless of 'a'.
Regardless, we emit the following bytecode:
0000 5d001293 BRA[18]
0003 5f000c95 COND [12]
0006 66 102 BRAZERO
0007 5e0005000194 CBRA [5, 1]
000c 54000584 KET[5]
000f 54000c84 KET[12]
0012 54001284 KET[18]
0015 00 0 END
When this is executed, we reach the following code:
/* The condition is an assertion. Call match() to evaluate it - setting
the final argument match_condassert causes it to stop at the end of an
assertion. */
else
{
RMATCH(eptr, ecode + 1 + LINK_SIZE, offset_top, md, ims, NULL,
match_condassert, RM3);
if (rrc == MATCH_MATCH)
{
condition = TRUE;
ecode += 1 + LINK_SIZE + GET(ecode, LINK_SIZE + 2);
while (*ecode == OP_ALT) ecode += GET(ecode, 1);<---- ecode is out of bounds at this point.
If we look at the execution trace for this expression, we can see where this code goes wrong:
exec 0x600e0000dfe4 93 [0x60040000dfd0 41]
exec 0x600e0000dfe7 95 [0x60040000dfd0 41]
exec 0x600e0000dfea 102 [0x60040000dfd0 41] <--- RMATCH recursive match
exec 0x600e0000dfeb 94 [0x60040000dfd0 41]
exec 0x600e0000dff0 84 [0x60040000dfd0 41]
exec 0x600e0000dff3 84 [0x60040000dfd0 41]
exec 0x600e0000dff6 84 [0x60040000dfd0 41]
exec 0x600e0000dff9 0 [0x60040000dfd0 41] <--- recursive match returns
before 0x600e0000dfe7 24067 <--- ecode == 0x...dfe7
after 0x600e00013dea
If we look at the start base for our regex, it was based at dfe4; so dfe7 is the OP_COND, as expected. Looking at the next block of code, we're clearly expecting the assertion to be followed by a group; likely OP_CBRA or another opcode that has a 16-bit length field following the opcode byte.
ecode += 1 + LINK_SIZE + GET(ecode, LINK_SIZE + 2);
In this case, the insertion of the OP_BRAZERO has resulted in the expected OP_CBRA being shifted forward by a byte to 0x...dfeb; and this GET results in the value of 0x5e00 + 1 + LINK_SIZE being added to the ecode pointer, instead of the correct 0x0005 + 1 + LINK_SIZE, resulting in bytecode execution hopping outside of the allocated heap buffer.
See attached for a crash PoC for the latest Chrome/Flash on x64 linux.
https://gitlab.com/exploit-database/exploitdb-bin-sploits/-/raw/main/bin-sploits/37839.zip