depsai

Reputation: 415

Regular expression in XML parsing

This is the content.

 <ext-link ext-link-type="uri" xlink:href="http://<xref&#x00A0;rid="x0026;AN=15230473">http://web.ebscohost.coms/ehost/detail&#x0026;#x003F;sid=d1f06770-cd74-4496-ae7b-7689ed05c6c4%40sessionmgr10&#x0026;#x0026;vid=1&#x0026;#x0026;hid=23&#x0026;#x0026;bdata=JnNpdGU9ZWhvc3QtbGl2ZQ%3d%3d&#x0026;#x0023;db=ufh&#x0026;#x0026;AN=15230473</xref>" link-type="url">

I want to capture what is inside xlink:href="http://<xref&#x00A0;rid="x0026;AN=15230473">http://web.ebscohost.coms/ehost/detail&#x0026;#x003F;sid=d1f06770-cd74-4496-ae7b-7689ed05c6c4%40sessionmgr10&#x0026;#x0026;vid=1&#x0026;#x0026;hid=23&#x0026;#x0026;bdata=JnNpdGU9ZWhvc3QtbGl2ZQ%3d%3d&#x0026;#x0023;db=ufh&#x0026;#x0026;AN=15230473</xref>"

including the double quotes.

I tried this, but I can't get the part I need:

<ext-link(?: [^>]+)? xlink:href="([^"]+)"[^><]*>

Upvotes: 0

Views: 101

Answers (3)

Avinash Raj

Reputation: 174844

Use \S+ to match one or more non-space characters.

<ext-link[^>]+? xlink:href="(\S+)"
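A minimal Python sketch of the difference (Python is only an illustration choice, since the question names no language, and the sample line is a shortened, hypothetical stand-in for the one in the question). The pattern from the question uses `[^"]+`, which stops at the first quote inside the value (after `rid=`), whereas `\S+` runs to the next space and backtracks to the last quote before it. This assumes the attribute value contains no literal whitespace.

```python
import re

# Shortened, hypothetical stand-in for the line in the question.
line = ('<ext-link ext-link-type="uri" '
        'xlink:href="http://<xref&#x00A0;rid="x0026;AN=15230473">'
        'http://example.com/detail&#x0026;AN=15230473</xref>" link-type="url">')

# Pattern from the question: [^"]+ cannot cross the quote inside the value.
question_pattern = re.compile(r'<ext-link(?: [^>]+)? xlink:href="([^"]+)"[^><]*>')

# Pattern from this answer: \S+ consumes everything up to the next space,
# then backtracks to the last double quote before that space.
answer_pattern = re.compile(r'<ext-link[^>]+? xlink:href="(\S+)"')

truncated = question_pattern.search(line).group(1)
full = answer_pattern.search(line).group(1)

print(truncated)  # the value cut off at the inner quote
print(full)       # the whole value, without the surrounding quotes
```

If the surrounding quotes are needed as well, moving them inside the group, as in `xlink:href=("\S+")`, should capture them too.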


Upvotes: 1

vks

Reputation: 67988

xlink:href=("(?:(?!<\/xref>).)*<\/xref>")

Try this and grab the first capture group. See the demo:

http://regex101.com/r/zU7dA5/6
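As a rough illustration, the same pattern can be tried in Python (an assumption for the example; the sample line is a shortened, hypothetical version of the one in the question). The `(?:(?!<\/xref>).)*` part consumes one character at a time for as long as `</xref>` does not start at the current position, so the group ends at the first `</xref>"` and keeps both double quotes.

```python
import re

# Shortened, hypothetical stand-in for the line in the question.
line = ('<ext-link ext-link-type="uri" '
        'xlink:href="http://<xref&#x00A0;rid="x0026;AN=15230473">'
        'http://example.com/detail&#x0026;AN=15230473</xref>" link-type="url">')

# Tempered dot: advance while "</xref>" does not begin here,
# then require the closing </xref> followed by the final quote.
pattern = re.compile(r'xlink:href=("(?:(?!<\/xref>).)*<\/xref>")')

quoted_value = pattern.search(line).group(1)
print(quoted_value)  # the attribute value, including both double quotes
```

This relies on the value always ending in `</xref>` and sitting on a single line, because `.` does not match a newline by default.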

Upvotes: 0

Arjun Mathew Dan

Reputation: 5308

perl -pe 's/^.*xlink:href=\"//; s/\">$//' file

Example:

sdlcb@Goofy-Gen:~/AMD/SO$ cat file
<ext-link ext-link-type="uri" xlink:href="http://<xref&#x00A0;rid="x0026;AN=15230473">http://web.ebscohost.coms/ehost/detail&#x0026;#x003F;sid=d1f06770-cd74-4496-ae7b-7689ed05c6c4%40sessionmgr10&#x0026;#x0026;vid=1&#x0026;#x0026;hid=23&#x0026;#x0026;bdata=JnNpdGU9ZWhvc3QtbGl2ZQ%3d%3d&#x0026;#x0023;db=ufh&#x0026;#x0026;AN=15230473</xref>">


sdlcb@Goofy-Gen:~/AMD/SO$ perl -pe 's/^.*xlink:href=\"//; s/\">$//' file
http://<xref&#x00A0;rid="x0026;AN=15230473">http://web.ebscohost.coms/ehost/detail&#x0026;#x003F;sid=d1f06770-cd74-4496-ae7b-7689ed05c6c4%40sessionmgr10&#x0026;#x0026;vid=1&#x0026;#x0026;hid=23&#x0026;#x0026;bdata=JnNpdGU9ZWhvc3QtbGl2ZQ%3d%3d&#x0026;#x0023;db=ufh&#x0026;#x0026;AN=15230473</xref>
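For comparison, here is a sketch of the same two substitutions in Python (the function name `strip_to_href` and the shortened sample values are hypothetical). It also shows a limitation: the second substitution only removes a `">` at the very end of the line, so a line with another attribute after xlink:href, like the one in the question, keeps that trailing text.

```python
import re

def strip_to_href(line: str) -> str:
    """Mimic the two Perl substitutions on a single line."""
    # Drop everything up to and including the opening quote of xlink:href.
    line = re.sub(r'^.*xlink:href="', '', line)
    # Drop a closing '">' only when it sits at the end of the line.
    return re.sub(r'">$', '', line)

# Shortened, hypothetical attribute value.
value = 'http://<xref&#x00A0;rid="x0026;AN=15230473">http://example.com/detail</xref>'

ends_with_href = '<ext-link ext-link-type="uri" xlink:href="' + value + '">'
has_trailing_attr = '<ext-link ext-link-type="uri" xlink:href="' + value + '" link-type="url">'

clean = strip_to_href(ends_with_href)
leftover = strip_to_href(has_trailing_attr)

print(clean)     # just the value
print(leftover)  # the value followed by the remains of the trailing attribute
```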

Upvotes: 0
