shantanuo

Reputation: 32336

Unicode characters do not return the correct results

This command works as expected and returns 1 node.

# cat myfile.txt
<feed>
<entry>
<author>
<name>Amar joshi</name>
</author>
</entry>
</feed>

# xpath -e "/feed/entry[author/name='Amar joshi']" myfile.txt
Found 1 nodes in myfile.txt:

But the same query against Unicode text does not.

<feed>
<entry>
<author>
<name>संतोष गोरे</name>
</author>
</entry>
</feed>

xpath -e "/feed/entry[author/name='संतोष गोरे']"  myfile.txt

The file and the command are nearly identical, so the Unicode text should not cause a problem. I have checked the expression with the utility that I found here:

http://xpather.com/

Upvotes: 1

Views: 108

Answers (1)

nwellnhof

Reputation: 33638

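This is probably a bug in the Perl module XML::XPath, which the xpath utility is part of. It seems that command-line arguments aren't properly decoded from UTF-8, so the name in the expression would not compare equal to the decoded text in the document. The -CA switch tells Perl to treat the elements of @ARGV as UTF-8, so it might work to run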

PERL5OPT=-CA xpath -e "/feed/entry[author/name='संतोष गोरे']"  myfile.txt
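To illustrate why an undecoded argument would fail to match, here is a minimal sketch in Python (not Perl, and not the actual XML::XPath code). It assumes the suspected behaviour: the argument arrives as UTF-8 bytes and each byte is treated as one character, while the XML text is decoded properly. The variable names are made up for the illustration.

```python
# Illustration only: mimics comparing an undecoded UTF-8 argument
# against text that an XML parser has already decoded.
name = "संतोष गोरे"                      # text as decoded from the XML file

raw_bytes = name.encode("utf-8")          # bytes the shell passes on the command line

# Assumed faulty path: every byte becomes one character (no UTF-8 decoding).
undecoded = raw_bytes.decode("latin-1")

# What -CA requests for @ARGV: decode the bytes as UTF-8 first.
decoded = raw_bytes.decode("utf-8")

print(undecoded == name)   # the byte-per-character string does not match
print(decoded == name)     # the properly decoded string matches
```

An ASCII-only name such as 'Amar joshi' is identical under both interpretations, which would explain why the first command works and the second does not.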

Upvotes: 2
