rally_point

Reputation: 89

What's the correct XPath substring expression for selecting the date in a phrase?

I need to use XPath to select the date from the following string:

44kb - Mr John Doe - 1/1/13

I don't believe you can pick out a particular occurrence of the '-' with something like

substring-after($string, '-'[3])

How would I do this? Is there a way to grab the substring from the space before the first '/' to the end of the date?

Thanks in advance

Upvotes: 2

Views: 212

Answers (2)

Dimitre Novatchev

Reputation: 243449

If there are just two dashes, as in the provided example, one can simply use this XPath 1.0 expression:

substring-after(substring-after('44kb - Mr John Doe - 1/1/13', '- '), '- ')

If the string is known to end with the date, and the date is known to be 6 characters long, one can use:

substring('44kb - Mr John Doe - 1/1/13', string-length('44kb - Mr John Doe - 1/1/13') -5)

Alternatively:

translate(substring('44kb - Mr John Doe - 1/1/13', 
                    string-length('44kb - Mr John Doe - 1/1/13') -7),
         '- ', '')

Here the length of the date is not known in advance (only that it is at most 8 characters), so we take the last 8 characters and delete any dashes or spaces from them.
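To check the arithmetic outside an XPath processor, here is a rough Python stand-in for the last expression. It is not XPath: `last_chars_without` is a hypothetical helper name, and the sketch assumes the date is at most 8 characters long and contains no dashes or spaces of its own.

```python
def last_chars_without(text: str, count: int, unwanted: str) -> str:
    """Keep the last `count` characters of `text`, then delete every
    character listed in `unwanted` (like translate(..., unwanted, ''))."""
    tail = text[-count:] if count > 0 else ""
    # str.maketrans with a third argument maps those characters to "deleted".
    return tail.translate(str.maketrans("", "", unwanted))


phrase = "44kb - Mr John Doe - 1/1/13"
date = last_chars_without(phrase, 8, "- ")
print(date)  # 1/1/13
```

The last 8 characters of the sample string are `- 1/1/13`; stripping the dash and the space leaves only the date.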

Upvotes: 1

Jens Erat

Reputation: 38682

fn:substring-after(...) returns only the text after the first occurrence of the separator, so you will have to apply it twice.

substring-after(substring-after('44kb - Mr John Doe - 1/1/13', ' - '), ' - ')
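If it helps to see why nesting the call works, the behaviour can be imitated in Python. This is only a sketch of the semantics, not XPath itself; `substring_after` is a hypothetical helper written for illustration.

```python
def substring_after(text: str, sep: str) -> str:
    """Imitate XPath 1.0 substring-after: return what follows the FIRST
    occurrence of `sep`, or the empty string if `sep` does not occur."""
    if sep == "":
        # XPath returns the whole string for an empty separator.
        return text
    _, found, tail = text.partition(sep)
    return tail if found else ""


phrase = "44kb - Mr John Doe - 1/1/13"
after_first = substring_after(phrase, " - ")       # "Mr John Doe - 1/1/13"
after_second = substring_after(after_first, " - ")  # "1/1/13"
print(after_second)
```

Each call drops everything up to and including one separator, so two calls are needed to get past both of them.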

If your XPath processor supports it (XPath 2.0 or later), you can also use fn:tokenize(...) to split the string into all of its parts and then use a positional predicate to fetch the third one.

tokenize("44kb - Mr John Doe - 1/1/13", ' - ')[3]

If the number of parts can vary, but the date is always the last one, you can also use

tokenize("44kb - Mr John Doe - 1/1/13", ' - ')[last()]

which always matches the last part.
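For comparison, roughly the same split can be sketched in Python with `re.split`, which also takes a regular expression as the separator. This is an analogy rather than XPath, and the two differ in edge cases (for example, an empty input string). Keep in mind that XPath positions start at 1 while Python indexes start at 0.

```python
import re

phrase = "44kb - Mr John Doe - 1/1/13"

# Split on the literal " - " separator, like tokenize(..., ' - ').
parts = re.split(r" - ", phrase)

third = parts[2]   # corresponds to [3] in XPath
last = parts[-1]   # corresponds to [last()] in XPath

print(parts)  # ['44kb', 'Mr John Doe', '1/1/13']
print(last)   # 1/1/13
```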

Upvotes: 2
