Reputation: 369
I want to select only the elements whose id matches the pattern "answer-[0-9]*".
I'm using this regex in select: "div[id~=answer-[0-9]*]"
The matching elements are:
<div class="post-text" id="answer-45881">
and
<div class="hidden modal modal-flag" id="answer-flag-modal45881">
What must I change to get only the first one?
Upvotes: 2
Views: 6034
Reputation: 124275
Based on the example from the official tutorial:
[attr~=regex]: elements with attribute values that match the regular expression;
e.g. img[src~=(?i)\.(png|jpe?g)]
it looks like jsoup simply checks whether the attribute value contains some part that the regex can match (like .png or .jpg in this example), not whether the entire attribute value is matched by the regex.
To check whether the regex matches the entire string, you need to add anchors for the start of the string (^) and the end of the string ($).
Also, instead of * you should probably use + if you want to make the number part mandatory.
So try div[id~=^answer-[0-9]+$]
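To illustrate the difference the anchors make, here is a minimal sketch that uses only java.util.regex rather than jsoup itself. It assumes jsoup applies the pattern with "find"-style semantics (a match anywhere in the attribute value), which is what the behavior in the question suggests. The class name AnchoredIdMatch and the helper containsMatch are made up for this example.

```java
import java.util.regex.Pattern;

class AnchoredIdMatch {
    // Assumption: mimics a "contains" check, true if the regex matches
    // anywhere inside the value (Matcher.find), not the whole value.
    static boolean containsMatch(String regex, String value) {
        return Pattern.compile(regex).matcher(value).find();
    }

    public static void main(String[] args) {
        String wanted = "answer-45881";
        String unwanted = "answer-flag-modal45881";

        // Unanchored with *: "answer-" followed by zero digits occurs in both ids.
        System.out.println(containsMatch("answer-[0-9]*", wanted));     // true
        System.out.println(containsMatch("answer-[0-9]*", unwanted));   // true

        // Anchored with +: the whole id must be "answer-" plus at least one digit.
        System.out.println(containsMatch("^answer-[0-9]+$", wanted));   // true
        System.out.println(containsMatch("^answer-[0-9]+$", unwanted)); // false
    }
}
```

With the anchors in place, only the first div from the question would be selected under this assumption.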
Upvotes: 4
Reputation: 70732
The * operator means "zero or more" times, so it will still match the second example. You need to use the + operator instead, meaning "one or more" times. So, your syntax would be:
div[id~=answer-[0-9]+]
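A quick way to check the effect of * versus + is a small java.util.regex sketch, again assuming the selector behaves like Matcher.find. The class StarVersusPlus, the helper found, and the id "old-answer-45881-draft" are hypothetical and only serve the illustration.

```java
import java.util.regex.Pattern;

class StarVersusPlus {
    // Assumption: a "contains"-style check, like Matcher.find.
    static boolean found(String regex, String id) {
        return Pattern.compile(regex).matcher(id).find();
    }

    public static void main(String[] args) {
        String modalId = "answer-flag-modal45881";

        // With *, "answer-" followed by zero digits is enough, so the modal id matches.
        System.out.println(found("answer-[0-9]*", modalId)); // true

        // With +, a digit must directly follow "answer-", so the modal id is rejected.
        System.out.println(found("answer-[0-9]+", modalId)); // false

        // Without anchors, a longer (hypothetical) id containing the pattern still matches.
        System.out.println(found("answer-[0-9]+", "old-answer-45881-draft")); // true
    }
}
```

So + alone is enough for the two ids in the question, while the unanchored pattern would still accept ids that merely contain "answer-" followed by digits.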
Upvotes: 2
Reputation: 264
It looks like it checks whether the id contains this pattern, not whether the whole id matches it.
"div[id~=answer-[0-9]*$]"
should work then.
Upvotes: 1