chrisbg

Reputation: 5

How to extract parts of text with XPath?

I want to extract the words (text) from the following XML example:

<description>
[Партиден номер]: 2UW01AA [Номер на модела]: HP 14.1 Business Sleeve [Line]: Business [Screen size]: 14.1&quot; [Material]: Polyester [Color]: Black [Dimensions]: [more]
</description>

so that the result looks like this:

Партиден номер 2UW01AA
Номер на модела HP 14.1 Business Sleeve
Line Business
Screen size 14.1&quot;
Material Polyester
Color Black

Which XPath expression can I use to get this result?

Upvotes: 1

Views: 193

Answers (3)

E.Wiest

Reputation: 5915

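Another way to do it with XPath 2.0. The expression keeps the text between the first "[" and the first ": [" (the one just before [more]), then uses translate() to turn each remaining "[" into a line feed and to drop "]" and ":":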

translate(substring-before(substring-after(//description, "["), ": ["), "[]:", codepoints-to-string(10))

Output:

Партиден номер 2UW01AA 
Номер на модела HP 14.1 Business Sleeve 
Line Business 
Screen size 14.1" 
Material Polyester 
Color Black 
Dimensions
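
For readers without an XPath 2.0 processor at hand, the same three steps can be mirrored in plain Python as a rough sketch. The helper names (substring_after, substring_before, extract_pairs) and the XML constant are made up for this illustration; only the standard library is used, and the code approximates the XPath functions rather than running the expression itself.

```python
import xml.etree.ElementTree as ET

# Sample document from the question (hypothetical constant name).
XML = """<description>
[Партиден номер]: 2UW01AA [Номер на модела]: HP 14.1 Business Sleeve [Line]: Business [Screen size]: 14.1&quot; [Material]: Polyester [Color]: Black [Dimensions]: [more]
</description>"""


def substring_after(text: str, sep: str) -> str:
    """Approximates XPath substring-after: empty string when sep is absent."""
    _head, found, tail = text.partition(sep)
    return tail if found else ""


def substring_before(text: str, sep: str) -> str:
    """Approximates XPath substring-before: empty string when sep is absent."""
    head, found, _tail = text.partition(sep)
    return head if found else ""


def extract_pairs(xml_text: str) -> str:
    # The XML parser decodes &quot; into a plain double quote.
    text = ET.fromstring(xml_text).text or ""
    # Keep what lies between the first "[" and the first ": [".
    trimmed = substring_before(substring_after(text, "["), ": [")
    # Like translate(): "[" becomes a line feed, "]" and ":" are dropped.
    table = str.maketrans({"[": "\n", "]": None, ":": None})
    return trimmed.translate(table)


if __name__ == "__main__":
    print(extract_pairs(XML))
```

Note that this approach drops every colon, so a value that itself contains ":" would come out altered.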

Upvotes: 1

Mads Hansen

Reputation: 66781

You could use the fn:replace() function (available from XPath 2.0) with a regex capture group:

replace(/description, "\[(.*?)\]:", "&#10;$1")
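
To see what this pattern does to the sample text, here is a small Python sketch that applies the same regex with the standard re module. It is only an approximation of fn:replace(); the function name labels_to_lines and the XML constant are hypothetical, and Python writes the capture group as \1 instead of $1.

```python
import re
import xml.etree.ElementTree as ET

# Sample document from the question (hypothetical constant name).
XML = """<description>
[Партиден номер]: 2UW01AA [Номер на модела]: HP 14.1 Business Sleeve [Line]: Business [Screen size]: 14.1&quot; [Material]: Polyester [Color]: Black [Dimensions]: [more]
</description>"""


def labels_to_lines(xml_text: str) -> str:
    # The XML parser decodes &quot; into a plain double quote.
    text = ET.fromstring(xml_text).text or ""
    # "[label]:" becomes a line feed followed by the label.
    # "[more]" is not followed by a colon, so it is left untouched.
    return re.sub(r"\[(.*?)\]:", r"\n\1", text)


if __name__ == "__main__":
    print(labels_to_lines(XML).strip())
```

With this pattern the last line reads "Dimensions [more]", because only bracketed labels followed by a colon are rewritten.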

Upvotes: 1

Pete Kirkham

Reputation: 49331

XPath will give you the description element; you can then use the replace function to remove the square brackets or replace them with line feeds.

Something like this, though the regexes will need to be more complicated if you need to handle square brackets in values such as [more]:

replace(replace(normalize-space(description), '\[', '&#xa;'), '\]:','')
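
The caveat about [more] is easy to see by mirroring the two nested replacements in Python. This is a minimal sketch using only the standard library; brackets_to_lines and XML are hypothetical names, and " ".join(text.split()) is used as a stand-in for normalize-space(), which it only approximates.

```python
import re
import xml.etree.ElementTree as ET

# Sample document from the question (hypothetical constant name).
XML = """<description>
[Партиден номер]: 2UW01AA [Номер на модела]: HP 14.1 Business Sleeve [Line]: Business [Screen size]: 14.1&quot; [Material]: Polyester [Color]: Black [Dimensions]: [more]
</description>"""


def brackets_to_lines(xml_text: str) -> str:
    text = ET.fromstring(xml_text).text or ""
    # Stand-in for normalize-space(): trim and collapse whitespace runs.
    normalized = " ".join(text.split())
    # Inner replace: every "[" becomes a line feed.
    with_breaks = re.sub(r"\[", "\n", normalized)
    # Outer replace: every "]:" is removed.
    return re.sub(r"\]:", "", with_breaks)


if __name__ == "__main__":
    print(brackets_to_lines(XML))
```

On the sample input the result ends with a stray "more]" line, since the closing bracket of [more] is not followed by a colon and therefore survives the second replacement.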

Upvotes: 0
