Enissay

Reputation: 4953

Remove weird space from xml node (xPath/xQuery)

I'm using Web-Harvest to scrape a website and generate an XML file with the data.

I'm getting ugly nodes like <name> </name>. Using normalize-space() didn't help, so I opened the file in a hex view and found that the content corresponds to 'c2a0'. I looked around for a solution, but nothing I found helped...

To sum up, what I want is to remove that weird space (using XQuery or XPath 1/2) so that I get an empty node <name/>.

PS: the encoding used is 'iso-8859-1'.

Upvotes: 0

Views: 1096

Answers (1)

BeniBela

Reputation: 16917

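You can use translate() to remove specific characters. The bytes c2 a0 are the UTF-8 encoding of the character U+00A0 (no-break space), and hexadecimal 0xA0 is decimal 160, so in XPath 2.0 or XQuery you can use codepoints-to-string(160) to get a string containing just that space. normalize-space() leaves it alone because it only treats space, tab, carriage return and line feed as whitespace.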

Together:

translate(your node text, codepoints-to-string(160), "")
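
If you want to double-check the byte arithmetic outside of Web-Harvest, here is a minimal Python sketch (Python rather than XPath, only so it is easy to run). It confirms that c2 a0 decodes to code point 160 and applies the same "delete that character" idea to a small sample document. The sample XML and the helper name strip_nbsp are made up for illustration; they are not from your file.

```python
import xml.etree.ElementTree as ET

# U+00A0 (no-break space); the same character as codepoints-to-string(160).
NBSP = chr(160)

# The bytes c2 a0 from the hex view are the UTF-8 encoding of U+00A0.
assert b"\xc2\xa0".decode("utf-8") == NBSP
# In ISO-8859-1 the same character would be the single byte a0.
assert NBSP.encode("iso-8859-1") == b"\xa0"


def strip_nbsp(text):
    """Python analogue of translate(text, codepoints-to-string(160), "")."""
    return text.replace(NBSP, "")


# Hypothetical sample input: one node holding only U+00A0, one with real text.
doc = ET.fromstring("<items><name>\u00a0</name><name>Foo\u00a0Bar</name></items>")

for node in doc.iter("name"):
    # An empty result is stored as None so the node serializes as an empty element.
    node.text = strip_nbsp(node.text or "") or None

result = ET.tostring(doc, encoding="unicode")
print(result)
```

The first name node ends up with no text at all, which is the empty <name/> you are after; the second keeps its letters and only loses the no-break space.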

Upvotes: 1
