Oliver Ruehl

Reputation: 137

Regex / Replace the same string in a text file with predefined values from a dictionary

I have a huge text file with contents similar to this:

<!-- $var = aaa -->
<!-- $var = aaa -->
<!-- $var = aaa -->
<!-- $var = aaa -->
<!-- $var = aaa -->
.
<!-- $var = bbb -->
<!-- $var = bbb -->
<!-- $var = bbb -->
<!-- $var = bbb -->
<!-- $var = bbb -->

I want each $var to be replaced like this:

<!-- $address = aaa -->
<!-- $city    = aaa -->
<!-- $zip     = aaa -->
<!-- $phone   = aaa -->
<!-- $geo     = aaa -->
.
.
<!-- $address = bbb -->
<!-- $city    = bbb -->
<!-- $zip     = bbb -->
<!-- $phone   = bbb -->
<!-- $geo     = bbb -->

The sequence is always the same. I have researched this for about 3 hours, but I can't get past this mental hurdle. My idea was to do it with a regex, but it seems I need a script to handle it.

Can you give me a hint about which direction to take, and is this possible with regex at all? I'm a beginner, so please be gentle :)

Kind regards Oliver

Upvotes: 0

Views: 190

Answers (3)

Xophmeister

Reputation: 9211

You could write a regular expression to do this in one fell swoop, but it would be far easier to use something like sed, which works on the file line by line.

#!/bin/sh
sed '
  s/\$var/\$address/  # replace $var with $address
  N                   # next line
  s/\$var/\$city   /  # replace $var with $city
  N                   # next line
  s/\$var/\$zip    /  # replace $var with $zip
  N                   # next line
  s/\$var/\$phone  /  # replace $var with $phone
  N                   # next line
  s/\$var/\$geo    /  # replace $var with $geo
' "$1"

You can then run this script against your file by passing the file name as its first argument; the result is written to standard output.

Upvotes: 1

Buh Buh

Reputation: 7546

This should do it. I have tested this using Programmer's Notepad. If you are using something different, you may need to tweak it.

Find:
(<!-- \$var = (\w+) -->\r\n){5}

Replace:
<!-- $address = \2 -->\r\n<!-- $city    = \2 -->\r\n<!-- $zip     = \2 -->\r\n<!-- $phone   = \2 -->\r\n<!-- $geo     = \2 -->\r\n

The key to understanding this is the \2. It references the second group captured by the regex. A group is anything enclosed in parentheses ().

\0 would refer to the entire match.
\1 would refer to the first set of brackets. (<!-- \$var = (\w+) -->\r\n)
\2 would refer to the second set of brackets. (\w+) This is your aaa or bbb
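
For readers without Programmer's Notepad, the same find-and-replace idea can be sketched in Python. This is only an illustration under a few assumptions: the names NAMES, BLOCK, replace_block and rename_vars are made up for this example, the values consist of word characters only, and every group has exactly five $var lines. Note that a repeated group keeps the value from its last repetition, so the five lines are assumed to share the same value.

```python
import re

# Variable names from the question, padded so the "=" signs line up.
NAMES = ["address", "city   ", "zip    ", "phone  ", "geo    "]

# One run of five "$var" lines. \r?\n accepts Windows and Unix line endings.
# Group 1 is one whole line (repeated), group 2 is the value (aaa, bbb, ...).
BLOCK = re.compile(r"(<!-- \$var = (\w+) -->\r?\n){5}")


def replace_block(match):
    """Build the five renamed lines from the value captured in group 2."""
    value = match.group(2)
    # Output lines always end in "\n", so \r\n input is normalized.
    return "".join(f"<!-- ${name} = {value} -->\n" for name in NAMES)


def rename_vars(text):
    """Replace every run of five $var lines in text."""
    return BLOCK.sub(replace_block, text)
```

Calling rename_vars on the whole file contents returns the rewritten text; lines that are not part of a five-line run, such as the "." separators, are left alone.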

Upvotes: 1

tomsv

Reputation: 7277

You need to handle one row at a time, applying a different replacement depending on which row it is. For example (I do not know which language you need, so treat this as pseudocode; it can be optimized if needed):

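// C#-style sketch; needs "using System.Text.RegularExpressions;"
var replacements = new[] { "address", "city", "zip", "phone", "geo" };
// row is the zero-based index of the current line
var replacement = replacements[row % 5];
// verbatim string so that \$ reaches the regex engine unchanged
var r = new Regex(@"^(<!-- \$)var(.*)$");
var newline = r.Replace(oldline, "${1}" + replacement + "${2}");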
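
The same line-by-line idea could look roughly like this in Python. It is a sketch, not a definitive implementation: NAMES, VAR and rename_lines are names invented for the example. It also deviates from the pseudocode in one respect: it counts only the lines that actually contain $var instead of using the raw row number, so separator lines such as "." do not shift the cycle. The names are padded to equal width to reproduce the alignment shown in the question.

```python
import re

# Replacement names in the fixed order given in the question.
NAMES = ["address", "city", "zip", "phone", "geo"]

# Matches the start of a "$var" comment line up to and including the "=".
VAR = re.compile(r"^(<!-- \$)var(\s*=)")


def rename_lines(lines):
    """Yield each line, renaming $var by cycling through NAMES."""
    width = max(len(name) for name in NAMES)
    seen = 0  # number of $var lines handled so far
    for line in lines:
        if VAR.match(line):
            # Pad the name so the "=" signs stay aligned.
            name = NAMES[seen % len(NAMES)].ljust(width)
            line = VAR.sub(lambda m: m.group(1) + name + m.group(2), line, count=1)
            seen += 1
        yield line
```

Because rename_lines is a generator that accepts any iterable of lines, it can be fed an open file object directly, which avoids loading a huge file into memory at once.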

Upvotes: 1
