Reputation: 4054
I'm working on a small querying module (in JS) for HTML, and I want to provide a generic query(selector)
function that accepts either a CSS selector or an XPath expression as its string argument.
Regardless of how each kind of selection is done, my problem here is how to identify whether a given string is an xpath or a css selector. We can assume that the function would be something like this:
function query(selector){
  // Declared with const so it does not leak as an implicit global
  const selectorKind = identifySelectorKind(selector); // I want to know how to code this particular function
  if(selectorKind==="css") return queryCss(selector);
  if(selectorKind==="xPath") return queryXPath(selector); // Assume both functions exist and work
}
My first approach (given my limited knowledge of XPath queries) was to identify the query kind by checking whether the first character is / (here I am assuming all relevant XPath queries begin with /). So identifySelectorKind would go a bit like this:
function identifySelectorKind(selector){
if (selector[0] === "/") return "xPath";
else return "css";
}
Note that I don't need to validate either the CSS or the XPath selectors; I only need an unambiguous way to tell them apart. Would this logic be enough (in other words, do all XPath selectors begin with / and does no CSS selector begin that way)? If not, is there a better way, or are there considerations I should know about?
Upvotes: 0
Views: 757
Reputation: 278
Searching only for / won't be enough, for sure!
Example CSS selector (which would be a false positive):
nav [itemtype="https://schema.org/BreadcrumbList"]
I'm also writing a utility function that uses either querySelector or XPath, and I need to differentiate the two.
The problem here is that both syntaxes can contain arbitrary strings:
xpath: //*[contains(text(),"string")]
css: *[some-attr="string"]
...so whatever character you use to discriminate, it can always appear in both syntaxes. (An XPath expression inside a CSS string is valid, and so is a CSS selector inside an XPath string):
xpath: //*[contains(text(),"a:hover:not(xpath)")]
css: *[xpath-attr="fuuu/xpath/also//here/*"]
The quick and dirty solution I found is to first cut out all the quoted strings, and then test for XPath-only characters (currently / or @).
const isXpath = str =>
  /[\/@]/.test(                   // find / or @ in
    str.split(/['"`]/)            // cut on any quote
       .filter((s, i) => !(i % 2)) // keep every other piece (the parts outside quotes)
       .join('')                  // string without quoted parts
  );
isXpath( 'nav [itemtype="https://schema.org/BreadcrumbList"] [itemtype="https://schema.org/ListItem"]' )
//> false
// The characters are actually searched in "nav [itemtype=] [itemtype=]"
/!\ Note that this is not perfect, and some cases will remain ambiguous: the examples given in this discussion, such as * or div, will fall back to CSS (isXpath = false). You may want to improve the quoted-string removal (what about escaped quotes?) and then the set of XPath characters...
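On the escaped-quotes point, here is a minimal sketch of one possible refinement. The names stripQuoted and isXpathStrict are hypothetical, not part of any library. It assumes CSS-style backslash escapes inside quoted strings; XPath 1.0 string literals have no escape syntax, so an XPath literal that ends with a backslash would still confuse it.

```javascript
// Hypothetical helper: remove complete "..." and '...' strings,
// treating a backslash followed by any character as an escaped character.
const stripQuoted = (str) =>
  str.replace(/"(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*'/g, '');

// Same idea as isXpath above: look for / or @ outside quoted strings.
const isXpathStrict = (str) => /[\/@]/.test(stripQuoted(str));

// A CSS attribute value containing escaped double quotes and a slash.
const trickyCss = 'a[title="say \\"hi/there\\""]';
const xpathExpr = '//*[contains(text(),"a:hover:not(xpath)")]';

console.log(isXpathStrict(trickyCss)); // false
console.log(isXpathStrict(xpathExpr)); // true
```

With the split-on-quote version, the escaped quotes in trickyCss shift the odd/even pairing, so the / inside the string gets counted and the selector is reported as XPath. This is still a heuristic: * and div remain ambiguous and fall back to CSS.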
Upvotes: 1
Reputation: 82986
You can't necessarily. For example, * is both a valid XPath expression and a valid CSS selector, but it matches a different set of elements in each.
Upvotes: 2
Reputation: 44107
If you're absolutely sure your XPath selectors will always begin with /, then yes, it's fine. Note that an XPath selector doesn't have to begin with a /, but if yours always select from the root, then it's fine.
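To illustrate the caveat, here is a small sketch that runs the first-character check from the question against a few XPath expressions that do not start with /. The sample list is my own choice for illustration; all of them are valid XPath 1.0.

```javascript
// First-character check, as proposed in the question.
function identifySelectorKind(selector) {
  return selector[0] === "/" ? "xPath" : "css";
}

// Valid XPath expressions that do not begin with "/" (illustrative samples).
const relativeXPaths = [
  './/a',          // descendants of the context node
  '(//a)[1]',      // parenthesized expression with a predicate
  'id("main")//p', // starts with a function call
  'div/span',      // relative location path
  '@href',         // attribute of the context node
];

for (const expr of relativeXPaths) {
  console.log(expr, '->', identifySelectorKind(expr)); // each one is reported as "css"
}
```

So the check is only safe if every XPath your callers pass is an absolute path.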
Upvotes: 0