Reputation: 31
So I got this one out of a string-sanitizing/xss-filter class buried somewhere deep inside a module of the system I'm working with.
The client called in and said "my form isn't posting when I put <> signs in it". Tried it, worked. He then send me the specific text, that was to be submitted and presto! I've got the same error.
Rocking my XDebug, I isolated the problem to occur exactly at the aforementioned string-sanitizing class when it's trying to remove typical JavaScript event-handlers from the text by using preg_replace
.
The code goes something like $str = preg_replace($prettyLongRegex, $replaceMent, $str);
Naturally, I expect $str
to just stay as it was if nothing matches. What happens though, is that the whole expression returns NULL
, effectively making my $str
an empty string.
Boom, the text is gone, the form doesn't validate, the error is being printed…
While fiddling around with the parameters, I realized that another text i.e. $str
or less handlers within the text work out just fine. It is this exact constellation which is leading to my problem.
My question: Why does it return NULL?
Here is a simple script which shows the problem by printing NULL
:
<?php
$str = <<< EOF
Aim
To compare the clinical evolution of >70% asymtomatic in-stent restenosis in superficial femoral artery. (SFA) treated with endovascular procedure or conservative treatment.
Methods
Historical cohort study was performed in patients with femoral artery stent with an asymptomatic in-stent restenosis of 70-99%. Two groups: Conservative managment vs endovascular treatment.
In both groups we compared: Limb salvage and critical ischemia or limited claudication free survival rates ( <250m) , from initial stent treatment until the appearance of critical ischemia o major amputation ( Log-Rank, Kaplan Meier).
Primary permeability and assisted primary permeability were analized.
Results
From January 2010-December 2015, twenty three >70% in-stent restenosis were diagnosed in 20 patients with an average age of 78 years old (SD 9,6). Average follow-up was of 30,1 months. 12 patients (52,2%) received medical (Aspirin and Statins) and endovascular treatment. 11 patients (47,8%) received only medical treatment without an endovascular procedure.
No difference were seen in limb salvage.
Critical ischemia or limited claudication free survival rates were higher in the conservative treatment group vs endovascular group. (P=0,031)
Primary permeability was 14,2 months ( IC 8,2-20.2). Assisted primary permeability, 65,5 months ( IC 57,4-77,5).
Conclusions
In our experience treatment of severe asymtomatic lesions of in-stent stensosis in SFA does not reduce the risk of amputation, critical ischaemia o limitant claudication. In asymptomatic patients, conservative treatment could be a good option, without increasing the risk in amputation and critical ischema.
EOF;
$str = preg_replace('#(<[^>]+.*?)(onabort|onactivate|onafterprint|onafterupdate|onbeforeactivate|onbeforecopy|onbeforecut|onbeforedeactivate|onbeforeeditfocus|onbeforepaste|onbeforeprint|
onbeforeunload|onbeforeupdate|onblur|onbounce|oncanplay|oncanplaythrough|oncellchange|onchange|
onclick|oncontextmenu|oncontrolselect|oncopy|oncuechange|oncut|ondataavailable|ondatasetchanged|
ondatasetcomplete|ondblclick|ondeactivate|ondrag|ondragend|ondragenter|ondragleave|ondragover|
ondragstart|ondrop|ondurationchange|onemptied|onended|onerror|onerrorupdate|onfilterchange|
onfinish|onfocus|onfocusin|onfocusout|onformchange|onforminput|onhashchange|onhelp|oninput|oninvalid,|onkeydown|
onkeypress|onkeyup|onlayoutcomplete|onload|onloadeddata|onloadedmetadata|onloadstart|
onlosecapture|onmessage|onmousedown|onmouseenter|onmouseleave|onmousemove|onmouseout|
onmouseover|onmouseup|onmousewheel|onmove|onmoveend|onmovestart|onoffline|ononline|
onpagehide|onpageshow|onpaste|onpause|onplay|onplaying|onpopstate|onprogress|
onpropertychange|onratechange|onreadystatechange|onredo|onreset|onresize|onresizeend|
onresizestart|onrowenter|onrowexit|onrowsdelete|onrowsinserted|onscroll|onseeked|onseeking|
onselect|onselectionchange|onselectstart|onshow|onstalled|onstart|onstop|onstorage|onsubmit|
onsuspend|ontimeupdate|onundo|onunload|onvolumechange|onwaiting)[^>]*>#iU',"\\1>",$str);
var_dump($str);
My system is OSX 10.11.4 and
PHP 5.6.17 (cli) (built: Jan 8 2016 10:27:48)
Copyright (c) 1997-2015 The PHP Group
Zend Engine v2.6.0, Copyright (c) 1998-2015 Zend Technologies
with Zend OPcache v7.0.6-dev, Copyright (c) 1999-2015, by Zend Technologies
with Xdebug v2.3.3, Copyright (c) 2002-2015, by Derick Rethans
Upvotes: 1
Views: 683
Reputation: 626758
See the preg_replace
reference:
If matches are found, the new subject will be returned, otherwise subject will be returned unchanged or NULL if an error occurred.
The error is catastrophic backtracking - see your regex demo.
To fix this, you need to either remove the [^>]+
at the start or keep [^>]+
and remove .*?
(depending on what you need to match). The point is that [^>]+
and .*?
can match at the same location and since the patterns are followed with a huge alternation group (where all alternatives start with the same on
substring letting all of them match at the same location), this creates a huge amount of possible variations that the regex engine has to check before admitting there is a failure - leading to catastrophic backtracking.
You can also greatly speed up the pattern if you put on
outside the (...)
:
(<[^>]+)on(abort|activate|afterprint|afterupdate|beforeactivate|beforecopy|beforecut|beforedeactivate|beforeeditfocus|beforepaste|beforeprint|
onbeforeunload|beforeupdate|blur|bounce|canplay|canplaythrough|cellchange|change|
onclick|contextmenu|controlselect|copy|cuechange|cut|dataavailable|datasetchanged|
ondatasetcomplete|dblclick|deactivate|drag|dragend|dragenter|dragleave|dragover|
ondragstart|drop|durationchange|emptied|ended|error|errorupdate|filterchange|
onfinish|focus|focusin|focusout|formchange|forminput|hashchange|help|input|invalid,|keydown|
onkeypress|keyup|layoutcomplete|load|loadeddata|loadedmetadata|loadstart|
onlosecapture|message|mousedown|mouseenter|mouseleave|mousemove|mouseout|
onmouseover|mouseup|mousewheel|move|moveend|movestart|offline|online|
onpagehide|pageshow|paste|pause|play|playing|popstate|progress|
onpropertychange|ratechange|readystatechange|redo|reset|resize|resizeend|
onresizestart|rowenter|rowexit|rowsdelete|rowsinserted|scroll|seeked|seeking|
onselect|selectionchange|selectstart|show|stalled|start|stop|storage|submit|
onsuspend|timeupdate|undo|unload|volumechange|waiting)[^>]*>
See the regex demo
NOTE: If you need to match up to the first on...
, turn (<[^>]+)
into a lazy subpattern - (<[^>]+?)
, and perhaps you also would like to add a word boundary before on
: (<[^>]+?)\bon
.
Upvotes: 6