This vulnerability relies on several minor oversights in the handling of shading patterns in pdfium. I'll try to detail all of the issues that could be fixed to harden the code against similar issues.

The DrawXShading functions in cpdf_renderstatus.cpp rely on a helper function to compute the number of output components resulting from applying multiple shading functions. Note that all of these functions appear to be vulnerable; the rest of this report discusses the specifics of triggering a heap-overflow using DrawRadialShading.

uint32_t CountOutputs(
    const std::vector<std::unique_ptr<CPDF_Function>>& funcs) {
  uint32_t total = 0;
  for (const auto& func : funcs) {
    if (func)
      total += func->CountOutputs();  // <-- Issue #1: integer overflow here
  }
  return total;
}

The lack of integer overflow checking would not be an issue if the parser enforced the limitations that the pdf specification places on these functions (namely that the /Function entry in a radial shading pattern should be either a single 1-in, n-out function or an array of n 1-in, 1-out functions), as these preconditions would preclude any overflow from occurring. However, we can see in the loading code for CPDF_ShadingPattern that there is no such validation.

bool CPDF_ShadingPattern::Load() {
  if (m_ShadingType != kInvalidShading)
    return true;

  CPDF_Dictionary* pShadingDict =
      m_pShadingObj ? m_pShadingObj->GetDict() : nullptr;
  if (!pShadingDict)
    return false;

  m_pFunctions.clear();
  CPDF_Object* pFunc = pShadingDict->GetDirectObjectFor("Function");
  if (pFunc) {
    // Issue #2: we never validate that the signatures of the parsed Function
    // object match the expected signatures for the shading type that we're
    // parsing.
    if (CPDF_Array* pArray = pFunc->AsArray()) {
      m_pFunctions.resize(std::min<size_t>(pArray->GetCount(), 4));
      for (size_t i = 0; i < m_pFunctions.size(); ++i)
        m_pFunctions[i] = CPDF_Function::Load(pArray->GetDirectObjectAt(i));
    } else {
      m_pFunctions.push_back(CPDF_Function::Load(pFunc));
    }
  }

  CPDF_Object* pCSObj = pShadingDict->GetDirectObjectFor("ColorSpace");
  if (!pCSObj)
    return false;

  CPDF_DocPageData* pDocPageData = document()->GetPageData();
  m_pCS = pDocPageData->GetColorSpace(pCSObj, nullptr);
  if (m_pCS)
    m_pCountedCS = pDocPageData->FindColorSpacePtr(m_pCS->GetArray());

  m_ShadingType = ToShadingType(pShadingDict->GetIntegerFor("ShadingType"));

  // We expect to have a stream if our shading type is a mesh.
  if (IsMeshShading() && !ToStream(m_pShadingObj.Get()))
    return false;

  return true;
}

Assuming that we can create function objects with very large output sizes, we can then reach the following code (in cpdf_renderstatus.cpp) when rendering something using the pattern:

void DrawRadialShading(const RetainPtr<CFX_DIBitmap>& pBitmap,
                       CFX_Matrix* pObject2Bitmap,
                       CPDF_Dictionary* pDict,
                       const std::vector<std::unique_ptr<CPDF_Function>>& funcs,
                       CPDF_ColorSpace* pCS,
                       int alpha) {
  // ... snip ...

  uint32_t total_results =
      std::max(CountOutputs(funcs), pCS->CountComponents());

  // NB: CountOutputs overflows here; result_array will be a stack buffer (if
  // we return a resulting size less than 16) or a heap buffer if the size is
  // larger.
  CFX_FixedBufGrow<float, 16> result_array(total_results);
  float* pResults = result_array;
  memset(pResults, 0, total_results * sizeof(float));

  uint32_t rgb_array[SHADING_STEPS];
  for (int i = 0; i < SHADING_STEPS; i++) {
    float input = (t_max - t_min) * i / SHADING_STEPS + t_min;
    int offset = 0;
    for (const auto& func : funcs) {
      if (func) {
        int nresults;
        // Here we've desynchronised the size of the memory pointed to by
        // pResults with the actual output size of the functions, so this
        // can write outside the allocated buffer.
        if (func->Call(&input, 1, pResults + offset, &nresults))
          offset += nresults;
      }
    }
    float R = 0.0f;
    float G = 0.0f;
    float B = 0.0f;
    pCS->GetRGB(pResults, &R, &G, &B);
    rgb_array[i] =
        FXARGB_TODIB(FXARGB_MAKE(alpha, FXSYS_round(R * 255),
                                 FXSYS_round(G * 255), FXSYS_round(B * 255)));
  }
  // ... snip ...

Now we need to revisit our earlier assumption, that we can create function objects with large output sizes. The following code handles parsing of function objects:

bool CPDF_Function::Init(CPDF_Object* pObj) {
  CPDF_Stream* pStream = pObj->AsStream();
  CPDF_Dictionary* pDict = pStream ? pStream->GetDict() : pObj->AsDictionary();

  CPDF_Array* pDomains = pDict->GetArrayFor("Domain");
  if (!pDomains)
    return false;

  m_nInputs = pDomains->GetCount() / 2;
  if (m_nInputs == 0)
    return false;

  m_pDomains = FX_Alloc2D(float, m_nInputs, 2);
  for (uint32_t i = 0; i < m_nInputs * 2; i++) {
    m_pDomains[i] = pDomains->GetFloatAt(i);
  }

  CPDF_Array* pRanges = pDict->GetArrayFor("Range");
  m_nOutputs = 0;
  if (pRanges) {
    m_nOutputs = pRanges->GetCount() / 2;
    m_pRanges = FX_Alloc2D(float, m_nOutputs, 2);  // <-- avoid this call
    for (uint32_t i = 0; i < m_nOutputs * 2; i++)
      m_pRanges[i] = pRanges->GetFloatAt(i);
  }

  uint32_t old_outputs = m_nOutputs;
  if (!v_Init(pObj))
    return false;

  if (m_pRanges && m_nOutputs > old_outputs) {
    m_pRanges = FX_Realloc(float, m_pRanges, m_nOutputs * 2);  // <-- avoid this call
    if (m_pRanges) {
      memset(m_pRanges + (old_outputs * 2), 0,
             sizeof(float) * (m_nOutputs - old_outputs) * 2);
    }
  }
  return true;
}

We can only have 4 functions, so we need m_nOutputs to be pretty large. Ideally we also don't want our pdf file to contain arrays of size 0x100000000 // 4 (that is, 0x40000000 elements) either, since this would mean multiple-gigabyte pdfs. Note also that any call to the FX_ allocation functions will fail on very large values, so ideally we want to take the path where no /Range is present: m_pRanges is never allocated and old_outputs == 0, which avoids both the FX_Alloc2D call for m_pRanges and the final FX_Realloc call, and allows v_Init to set an arbitrarily large m_nOutputs.
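The wrap-around behind Issue #1 can be reproduced outside pdfium. What follows is a minimal sketch rather than pdfium code: UncheckedTotal and CheckedTotal are hypothetical names, and the per-function output counts are passed in as plain integers instead of being read from CPDF_Function objects. The first function repeats the unchecked addition from the CountOutputs helper quoted above; the second shows one possible way to detect the overflow instead of wrapping.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Same arithmetic as the CountOutputs helper quoted in the report:
// unsigned 32-bit addition silently wraps modulo 2^32.
uint32_t UncheckedTotal(const std::vector<uint32_t>& outputs_per_function) {
  uint32_t total = 0;
  for (uint32_t n : outputs_per_function)
    total += n;
  return total;
}

// Hypothetical hardened variant: returns false instead of wrapping.
// *total_out is only written when the sum fits in a uint32_t.
bool CheckedTotal(const std::vector<uint32_t>& outputs_per_function,
                  uint32_t* total_out) {
  assert(total_out != nullptr);
  uint32_t total = 0;
  for (uint32_t n : outputs_per_function) {
    if (n > std::numeric_limits<uint32_t>::max() - total)
      return false;  // adding n would exceed UINT32_MAX
    total += n;
  }
  *total_out = total;
  return true;
}
```

With four entries of 0x40000000 each, UncheckedTotal returns 0, so the std::max in DrawRadialShading would fall back to the colour space's component count, whereas CheckedTotal reports failure and leaves the caller free to reject the pattern.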
It turns out that there is a function subtype that allows this: the exponential interpolation function type, implemented in cpdf_expintfunc.cpp:

bool CPDF_ExpIntFunc::v_Init(CPDF_Object* pObj) {
  CPDF_Dictionary* pDict = pObj->GetDict();
  if (!pDict)
    return false;

  CPDF_Array* pArray0 = pDict->GetArrayFor("C0");
  if (m_nOutputs == 0) {
    m_nOutputs = 1;
    if (pArray0) {
      fprintf(stderr, "C0 %zu\n", pArray0->GetCount());
      m_nOutputs = pArray0->GetCount();
    }
  }

  CPDF_Array* pArray1 = pDict->GetArrayFor("C1");
  m_pBeginValues = FX_Alloc2D(float, m_nOutputs, 2);
  m_pEndValues = FX_Alloc2D(float, m_nOutputs, 2);
  for (uint32_t i = 0; i < m_nOutputs; i++) {
    m_pBeginValues[i] = pArray0 ? pArray0->GetFloatAt(i) : 0.0f;
    m_pEndValues[i] = pArray1 ? pArray1->GetFloatAt(i) : 1.0f;
  }

  m_Exponent = pDict->GetFloatFor("N");
  m_nOrigOutputs = m_nOutputs;
  if (m_nOutputs && m_nInputs > INT_MAX / m_nOutputs)  // <-- can't be *too* large
    return false;

  m_nOutputs *= m_nInputs;  // <-- but it can be pretty large

  // Issue #3: This is probably not the place, but it probably makes sense to
  // bound m_nInputs and m_nOutputs to some large-but-not-that-large value in
  // CPDF_Function::Init
  return true;
}

So, by providing a function object without a /Range entry, but with large /C0 and /Domain arrays, we can construct a function object with about INT_MAX outputs.

7 0 obj
<<
  /FunctionType 2
  /Domain [ 0.0 1.0 ... repeat many times ... 0.0 1.0 ]
  /C0 [ 0.0 ... repeat many times ... 0.0 ]
  /N 1
>>
endobj

At this point it looks like we have quite an annoying exploitation primitive; we can write a huge amount of data out of bounds, but that data will be calculated as an interpolation between its input coordinates, and it will be a really, really big memory corruption.
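To make the arithmetic in v_Init concrete, the sketch below models only the output-count calculation for a function dictionary with no /Range entry and a /C0 array that is present. ModelExpIntOutputs and its parameter names are hypothetical, and nothing here touches pdfium itself; it simply replays the divisions, the INT_MAX guard, and the final multiplication from the code quoted above.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// domain_entries: number of floats in the /Domain array.
// c0_entries:     number of floats in the /C0 array (assumed present).
// Returns false where Init()/v_Init() would reject the function; otherwise
// stores the resulting m_nOutputs value in *outputs.
bool ModelExpIntOutputs(uint32_t domain_entries,
                        uint32_t c0_entries,
                        uint32_t* outputs) {
  assert(outputs != nullptr);
  uint32_t n_inputs = domain_entries / 2;  // Init(): m_nInputs
  if (n_inputs == 0)
    return false;                          // Init() rejects zero inputs

  uint32_t n_outputs = c0_entries;         // v_Init(): no /Range, so from /C0
  if (n_outputs && n_inputs > static_cast<uint32_t>(INT_MAX) / n_outputs)
    return false;                          // the "can't be *too* large" guard

  *outputs = n_outputs * n_inputs;         // cannot overflow after the guard
  return true;
}
```

Under this model, a /Domain of 65536 floats (32768 inputs) and a /C0 of 32768 floats already yield 0x40000000 outputs, so the arrays in the file stay in the tens of thousands of elements while the reported output count is a quarter of 2^32.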
It turns out that the point mentioned earlier at Issue #2 about validating the signatures of the functions is again relevant here. If we look at the callsite in DrawRadialShading, we can see that all of the functions are called with a single input parameter, and if we look at the start of CPDF_Function::Call:

bool CPDF_Function::Call(float* inputs,
                         uint32_t ninputs,
                         float* results,
                         int* nresults) const {
  if (m_nInputs != ninputs)
    return false;

we can see that any attempt to call a function with the wrong number of input parameters will simply fail. Functions with more than one input therefore contribute to the (overflowing) output count without ever writing anything, letting us control precisely the size and contents of our overflow.

The attached poc will crash under ASAN, and without ASAN during the free of the corrupted heap block.

Proof of Concept:
https://gitlab.com/exploit-database/exploitdb-bin-sploits/-/raw/main/bin-sploits/44082.zip
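The signature check suggested under Issue #2 could look roughly like the sketch below. It is an illustration only: FunctionSignature, SignaturesMatchRadialShading and ExampleUsage are hypothetical names, n is taken to be the number of components of the shading's colour space, and an empty function list is rejected on the assumption that the /Function entry is mandatory for this shading type.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the two counts a parsed function exposes.
struct FunctionSignature {
  uint32_t n_inputs;
  uint32_t n_outputs;
};

// Accepts either one 1-in, n-out function or exactly n 1-in, 1-out functions.
bool SignaturesMatchRadialShading(const std::vector<FunctionSignature>& funcs,
                                  uint32_t n_components) {
  if (funcs.empty() || n_components == 0)
    return false;
  for (const FunctionSignature& f : funcs) {
    if (f.n_inputs != 1)
      return false;  // every shading function takes the single parameter t
  }
  if (funcs.size() == 1)
    return funcs[0].n_outputs == n_components;
  if (funcs.size() != n_components)
    return false;
  for (const FunctionSignature& f : funcs) {
    if (f.n_outputs != 1)
      return false;
  }
  return true;
}

// Usage example with a three-component colour space.
void ExampleUsage() {
  std::vector<FunctionSignature> single = {{1u, 3u}};
  std::vector<FunctionSignature> hostile = {{32768u, 0x40000000u}, {1u, 64u}};
  assert(SignaturesMatchRadialShading(single, 3u));
  assert(!SignaturesMatchRadialShading(hostile, 3u));
}
```

A check of this shape, applied when the pattern is loaded, would reject both the many-input functions used to inflate the count and the single-input function whose output count exceeds the colour space, so the sum in CountOutputs could never exceed n.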