Regex, backreferences and alternations

Question

I'm trying to modify some text using regex. This is the original text:

  "Insert Swab to Start Analysis"

And this is the desired text:

  "Insert Swab to Start Analysis"
  "Insert Swab to Start Analysis"
  "Insert Swab to Start Analysis"
  "Insert Swab to Start Analysis"
  "Insert Swab to Start Analysis"
  "Insert Swab to Start Analysis"

As you can see, there are two changes: the tags are modified and the source text is copied into the target languages.

I managed to do this using two different regexes.

First regex (copy source text into target languages):

Search: ((.+?)()
  \1"es">\3
  \1"fr">\3
  \1"de">\3
  \1"pt">\3
  \1"du">\3
Replace: \1"en">\2\3
  \1"es">\2\3
  \1"fr">\2\3
  \1"de">\2\3
  \1"pt">\2\3
  \1"du">\2\3

Second regex (change tags):

Search: (.*?)(]*>)
Replace: <\1\>\2

I'm quite happy with the result but I'm wondering if all this can be done using a single regex and not two. The second regex I used is quite elegant but it does not copy the source text into the different target languages. I suspect it needs a little trick to work properly. Suggestions?

PS: I'm just using Notepad++ to do all this.

PPS: It's a big XML file with many entries, not only the one I'm showing you here.

Wiktor Stribiżew · Accepted Answer

If the string is always formatted the same way, you can amend the first regex to do the whole job:

Find What: ((.+?)()\R  \1es">\3\R  \1fr">\3\R  \1de">\3\R  \1pt">\3\R  \1du">\3
Replace With: \2 \2 \2 \2 \2 \2

See the regex demo

Details

(


en"> - literal text en">
(.+?) - Group 2: any 1 or more chars other than line break chars, as few as possible
() - Group 3: literal text 
\R - any line break sequence
   - two spaces
\1 - the text captured in Group 1
es"> - literal text es">
\3 - the text captured in Group 3
\R  \1fr">\3\R  \1de">\3\R  \1pt">\3\R  \1du">\3 - the same construct (line break, two spaces, Group 1, language code, Group 3) repeated for the remaining languages.
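
For anyone who wants to try the same idea outside Notepad++, here is a rough Python sketch of the technique described above: capture the opening tag, the source text and the closing tag once, then use backreferences to require that each following target line repeats the same tag parts. The tag name `seg` and the attribute `lang` are hypothetical placeholders, since the real markup is not visible in the post, and Python's `re` has no `\R`, so `\r?\n` stands in for it.

```python
import re

# Hypothetical markup: "seg" and "lang" are placeholders, not the real tags.
TARGETS = ["es", "fr", "de", "pt", "du"]  # language codes used in the question

# Group 1: opening tag up to the language code, Group 2: source text,
# Group 3: closing tag. \1 and \3 force every target line to reuse them.
PATTERN = re.compile(
    r'(<seg lang=")en">(.+?)(</seg>)'
    + "".join(rf'\r?\n  \1{lang}">\3' for lang in TARGETS)
)

def copy_source(xml: str) -> str:
    """Copy the English text into every empty target entry."""
    def repl(m: re.Match) -> str:
        head, text, tail = m.group(1), m.group(2), m.group(3)
        lines = [f'{head}en">{text}{tail}']
        lines += [f'  {head}{lang}">{text}{tail}' for lang in TARGETS]
        return "\n".join(lines)  # line breaks are normalised to \n
    return PATTERN.sub(repl, xml)

sample = '<seg lang="en">Insert Swab to Start Analysis</seg>\n' + "".join(
    f'  <seg lang="{lang}"></seg>\n' for lang in TARGETS
)
result = copy_source(sample)
print(result)
```

As with the Notepad++ version, this only works when every entry follows exactly the same layout (same order of languages, two-space indentation, empty targets); anything else is left untouched.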
