Reputation: 6662
What is the best way to escape a variable passed to XPath?
$test = simplexml_load_file('test.xml');
$var = $_GET['var']; // injection heaven
$result = $test->xpath('/catalog/items/item[title="'.$var.'"]');
Normally I use PDO binding or something similar, but those all require a database connection.
Is it enough to just use addslashes and htmlentities? Or is there a better way to do this?
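To make the risk concrete, here is a minimal sketch of what the unquoted concatenation allows. The inline XML is a hypothetical stand-in for test.xml, and build_unsafe_query() is a name invented for this illustration; only the XPath shape comes from the code above.

```php
<?php
// Hypothetical stand-in for test.xml; the element names follow the XPath in the question.
$xml = <<<'XML'
<catalog>
  <items>
    <item><title>First</title></item>
    <item><title>Second</title></item>
  </items>
</catalog>
XML;

$test = simplexml_load_string($xml);

// Builds the expression the same way the question does, with no quoting step.
function build_unsafe_query(string $var): string
{
    return '/catalog/items/item[title="' . $var . '"]';
}

$honest  = build_unsafe_query('First');
$hostile = build_unsafe_query('" or "1"="1');

// $hostile is now: /catalog/items/item[title="" or "1"="1"]
// The injected 'or' clause is always true, so every item matches.
$honestCount  = count($test->xpath($honest));
$hostileCount = count($test->xpath($hostile));
```

With the honest value one item comes back; with the hostile value both do, because the input closed the literal and added its own condition.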
Upvotes: 2
Views: 1014
Reputation: 181
This answer supplements hanshenrik's answer: I liked the general solution, but found the example function hard to read and its output not optimal. It does its job perfectly fine nonetheless.
About XPath quoting
XPath 1.0 allows any character inside a string literal except the quote character that delimits it. The allowed delimiters are " and ', so quoting a literal that contains at most one kind of quote is trivial. To quote a string that contains both, you have to split it into separately quoted parts and concatenate them with XPath's concat():
He's telling you "Hello world!".
would need to be escaped like
concat("He's telling", ' you "Hello world!".')
It is of course irrelevant where between the ' and the " you split the literal.
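A quick way to convince yourself of this is to let an XPath 1.0 engine evaluate two differently split expressions. This is only a sketch: the one-element document is a throwaway, and the variable names are made up for the example.

```php
<?php
// A throwaway document; the expressions below do not read any nodes from it.
$doc = new DOMDocument();
$doc->loadXML('<root/>');
$xpath = new DOMXPath($doc);

$original = 'He\'s telling you "Hello world!".';

// Two different split points between the ' and the " of the original text.
$splitEarly = 'concat("He\'s telling", \' you "Hello world!".\')';
$splitLate  = 'concat("He\'s telling you ", \'"Hello world!".\')';

// evaluate() returns a PHP string when the expression yields an XPath string.
$early = $xpath->evaluate($splitEarly);
$late  = $xpath->evaluate($splitLate);
```

Both $early and $late should come back identical to $original, whichever split point is used.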
Differences of Implementations
hanshenrik's implementation creates the quoted literal by extracting all parts that aren't double quotes and then inserting quoted double quotes. But that can produce undesirable results:
"""x'x"x""xx
would be escaped by their function like
concat('"', '"', '"', "x'x", '"', "x", '"', '"', "xx")
and the example from above:
concat("He's telling you ", '"', "Hello world!", '"', ".")
This implementation, on the other hand, minimizes the number of partial literals by alternating the quote character and quoting as much as possible each time:
for the first example:
concat("He's telling you ", '"Hello world!".')
and for the second example:
concat('"""x', "'x", '"x""xx')
Implementation
/**
 * Creates a properly quoted XPath 1.0 string literal. It prefers double quotes over
 * single quotes. If both kinds of quotes are used in the literal then it will create a
 * compound expression with concat(), using as few partial strings as possible.
 *
 * Based on {@link https://stackoverflow.com/a/54436185/6229450 hanshenrik's StackOverflow answer}.
 *
 * @param string $literal unquoted literal to use in xpath expression
 * @return string quoted xpath literal for xpath 1.0
 */
public static function quoteXPathLiteral(string $literal): string
{
    $firstDoubleQuote = strpos($literal, '"');
    if ($firstDoubleQuote === false) {
        return '"' . $literal . '"';
    }
    $firstSingleQuote = strpos($literal, '\'');
    if ($firstSingleQuote === false) {
        return '\'' . $literal . '\'';
    }
    // delimit the first part with whichever quote character appears later in the literal
    $currentQuote = $firstDoubleQuote > $firstSingleQuote ? '"' : '\'';
    $quoted = [];
    $lastCut = 0;
    // cut into largest possible parts that contain exactly one kind of quote
    while (($nextCut = strpos($literal, $currentQuote, $lastCut)) !== false) {
        $quotablePart = substr($literal, $lastCut, $nextCut - $lastCut);
        $quoted[] = $currentQuote . $quotablePart . $currentQuote;
        $currentQuote = $currentQuote === '"' ? '\'' : '"'; // toggle quote
        $lastCut = $nextCut;
    }
    $quoted[] = $currentQuote . substr($literal, $lastCut) . $currentQuote;
    return 'concat(' . implode(', ', $quoted) . ')';
}
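If you want to check a quoting function like this one, a round trip through an XPath 1.0 engine is a simple test: quote the value, evaluate the result as an expression, and compare it with the input. The helper below is a sketch with a made-up name (xpath_quote_round_trips), and the naive quoter exists only to show a failing case.

```php
<?php
/**
 * Returns true when $quote($value), evaluated as an XPath 1.0 expression,
 * yields exactly $value again. $quote is whichever quoting function is under test.
 */
function xpath_quote_round_trips(callable $quote, string $value): bool
{
    $doc = new DOMDocument();
    $doc->loadXML('<root/>');
    $xpath = new DOMXPath($doc);

    // @ silences the warning raised for a syntactically invalid expression;
    // a failed evaluation does not produce a string equal to $value.
    $result = @$xpath->evaluate($quote($value));

    return $result === $value;
}

// Deliberately naive quoter: always wraps in double quotes.
$naive = function (string $value): string {
    return '"' . $value . '"';
};

$plainOk  = xpath_quote_round_trips($naive, "He's here"); // no double quote in the value
$brokenOk = xpath_quote_round_trips($naive, 'say "hi"');  // double quote ends the literal early
```

Any callable can be passed as the first argument, for example a [ClassName::class, 'quoteXPathLiteral'] pair, where ClassName stands for whatever class you put the method in.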
Upvotes: 2
Reputation: 17598
According to the XPath 1.0 spec, the syntax for literals is as follows:
[29] Literal ::= '"' [^"]* '"'
| "'" [^']* "'"
This means that in a single-quoted string anything other than a single quote is allowed, and in a double-quoted string anything other than a double quote is allowed. The grammar defines no escape sequence, so a backslash is just an ordinary character.
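As a rough illustration, production [29] can be transcribed into a regular expression. The function name is_xpath1_literal() is invented for this sketch; it only checks the shape of a single literal and is not a full XPath parser.

```php
<?php
// Mirrors production [29]: a double-quoted run with no ", or a single-quoted run with no '.
// The D modifier stops $ from matching just before a trailing newline.
function is_xpath1_literal(string $candidate): bool
{
    return preg_match('/^(?:"[^"]*"|\'[^\']*\')$/D', $candidate) === 1;
}

// An apostrophe inside double quotes is fine.
$plain = is_xpath1_literal('"He\'s here"');

// addslashes() turns  say "hi"  into  say \"hi\"  but the " characters are still there,
// so wrapping the result in double quotes does not give a single valid literal.
$backslashed = is_xpath1_literal('"' . addslashes('say "hi"') . '"');
```

This is also why addslashes does not help here: the backslash it adds has no special meaning in an XPath 1.0 literal.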
Upvotes: 2
Reputation: 2797
The above answers are for XPath 1.0, which is the only version PHP supports. For completeness, I'll note that starting with XPath 2.0, string literals can contain quotes by doubling them:
[74] StringLiteral ::= ('"' (EscapeQuot | [^"])* '"') | ("'" (EscapeApos | [^'])* "'")
[75] EscapeQuot ::= '""'
[76] EscapeApos ::= "''"
For example, to search for the title Some "quoted" title, you would use the following XPath:
/catalog/items/item[title="Some ""quoted"" title"]
This could be implemented with a simple string escape (but I won't give an example, since you're using PHP and as mentioned PHP does not support XPath 2.0).
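For readers who build expressions in PHP but hand them to an external XPath 2.0 processor, the doubling rule comes down to a single string replacement. This is a sketch only; xpath2_quote() is a hypothetical name, and its output must not be given to SimpleXML or DOMXPath, which only understand XPath 1.0.

```php
<?php
// Builds an XPath 2.0 string literal by doubling embedded double quotes (EscapeQuot).
// The result is NOT a valid XPath 1.0 literal when the value contains a double quote.
function xpath2_quote(string $value): string
{
    return '"' . str_replace('"', '""', $value) . '"';
}

$expression = '/catalog/items/item[title=' . xpath2_quote('Some "quoted" title') . ']';
```

Here $expression ends up as the same string as the XPath 2.0 example above.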
Upvotes: 1
Reputation: 21463
You can't really make a general XPath escape function, but you can make an XPath quote function, which can be used like this:
$result = $test->xpath('/catalog/items/item[title='.xpath_quote($var).']');
implementation:
// based on https://stackoverflow.com/a/1352556/1067003
function xpath_quote(string $value): string
{
    if (false === strpos($value, '"')) {
        return '"' . $value . '"';
    }
    if (false === strpos($value, '\'')) {
        return '\'' . $value . '\'';
    }
    // if the value contains both single and double quotes, construct an
    // expression that concatenates all non-double-quote substrings with
    // the quotes, e.g.:
    //
    //    concat("'foo'", '"', "bar")
    $sb = 'concat(';
    $substrings = explode('"', $value);
    for ($i = 0; $i < count($substrings); ++$i) {
        $needComma = ($i > 0);
        if ($substrings[$i] !== '') {
            if ($i > 0) {
                $sb .= ', ';
            }
            $sb .= '"' . $substrings[$i] . '"';
            $needComma = true;
        }
        if ($i < (count($substrings) - 1)) {
            if ($needComma) {
                $sb .= ', ';
            }
            $sb .= "'\"'";
        }
    }
    $sb .= ')';
    return $sb;
}
It is based on the C# XPath quote function from https://stackoverflow.com/a/1352556/1067003
Is it enough to just use addslashes and htmlentities? Or is there a better way to do this?
I would sleep better at night using a proper XPath quote function rather than addslashes/htmlentities, but I don't really know whether those are technically sufficient.
Upvotes: 3