neo5_50

Reputation: 476

Trying to get a very simple regex to work on an XML file

This is my snippet of XML (the actual full file is 6964 lines):

<?xml version="1.0" encoding="UTF-8"?>
<listings xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://www.gstatic.com/localfeed/local_feed.xsd">
<language>en</language>
<id>43927</id>
<cell1>Andover House</cell1>
<cell2>28-30 Camperdown</cell2>
<cell3>Great Yarmouth</cell3>
<cell4>NR30 3JB</cell4>
<cell5>GB</cell5>
<cell6>52.6003767</cell6>
<cell7>1.7339649</cell7>
<cell8>+44 1493843490</cell8>
<category>British</cell9>
<cell10>http://contentadmin.livebookings.com/dynamaster/image_archive/original/f24c60a52e7ac0874be57e51bce30726.jpg</cell10>
<cell11>http://www.bookatable.co.uk/andover-house-great-yarmouth-norfolk</cell11>

For each category tag in the above snippet, I would simply like to prepend this text to its content: Restaurants - (with one space after the hyphen)

So the final result will be:

<category>Restaurants - British</category>

I am very new to Regex and find it very difficult, so this is what I've tried so far: https://regex101.com/r/yY5jB6/2

It looks like it is working in regex101, but when I bring it into a text editor like Sublime Text 2 (on Mac) or Notepad++ (on Windows) and use find/replace with the regex option enabled, it says it can't find anything. Please help! Thanks!

Upvotes: 1

Views: 54

Answers (2)

Rossiar

Reputation: 2564

Notepad++ accepts \1 as the backreference in the replacement field. If $1 is not being substituted, change your replacement from $1Restaurants - to \1Restaurants - and it should work. (sourced from this question)
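Backreference syntax in the replacement string depends on the engine, and this is easy to check outside an editor. Here is a minimal sketch in Python, which is not one of the tools from the question and is used only as an illustration: `re.sub` expands `\1` but treats `$1` as literal text. The sample line is a well-formed variant of the category line, and the search pattern is an assumption, since the regex101 pattern is not reproduced here.

```python
import re

# Hypothetical, well-formed sample line (the question's snippet closes with </cell9>).
line = "<category>British</category>"

# Assumed pattern: capture the opening tag so it can be re-emitted.
pattern = r"(<category>)"

# Python's re module expands \1 in the replacement...
with_backslash = re.sub(pattern, r"\1Restaurants - ", line)

# ...but leaves $1 in the output as literal text.
with_dollar = re.sub(pattern, "$1Restaurants - ", line)

print(with_backslash)
print(with_dollar)
```

If a replacement shows up verbatim in the output like the second result, the engine is probably expecting the other backreference spelling.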

Upvotes: 1

h3n

Reputation: 898

If you search for

<category>([^<]*)<\/.*>

and replace it with

<category>Restaurants - $1</category>

it will even work with your strange input, which contains a mismatched </cell9> closing tag.
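The same search and replace can be sketched as a script, which may be handier than an editor for a file of several thousand lines. This is only an illustration in Python, assuming the file has been read into a string. The function name `prefix_categories` and the `sample` text are made up for the example, and `\g<1>` is Python's spelling of `$1`.

```python
import re

# Pattern from the answer above; in Python the "/" needs no escaping.
# "." does not match newlines by default, so ".*>" stays on one line.
CATEGORY = re.compile(r"<category>([^<]*)</.*>")

def prefix_categories(xml_text: str, prefix: str = "Restaurants - ") -> str:
    """Prefix the text of every <category> element and rewrite its closing tag."""
    # \g<1> is Python's unambiguous form of the editor's $1 backreference.
    return CATEGORY.sub(rf"<category>{prefix}\g<1></category>", xml_text)

# Two lines taken from the question's snippet, including the mismatched close tag.
sample = "<cell8>+44 1493843490</cell8>\n<category>British</cell9>\n"
result = prefix_categories(sample)
print(result)
```

Because the pattern rewrites whatever closing tag follows the category text, it also repairs the mismatched tag. Note that it relies on each category element sitting on its own line.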

Upvotes: 0
