scotthorvath
Reputation: 21

Regular Expression, only replace first occurrence of HTML tag

I've got several files that contain two <body> tags (either on purpose or by accident). I want to find only the first occurrence of the <body> tag and append additional HTML code after it, leaving the second occurrence untouched. I'm using TextWrangler. The regex I'm using now replaces both occurrences rather than just the first.

Text:

<body someattribute=...>
existing content
<body onUnload=...>

RegEx I'm using:

Find: (\<body.*\>)

Replace with: 

\n\1
appended HTML code

Current result:

<body someattribute=...>
appended HTML code
existing content
<body onUnload=...>
appended HTML code

So it's adding my appended code twice. I just want it to happen to the first <body...> only.

Upvotes: 1

Views: 2726

Answers (1)

tekim
Reputation: 181

Regex:

(?s)(<body.*?>)(.*)

Replace:

\1\nappended content\n\2

Explanation:

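  • (?s) makes the . character match newlines as well. Without it, . matches any character only up to the next newline.
  • (<body.*?>) finds the first "body" tag and captures it as group 1 (\1). The lazy .*? stops at the first > it reaches, so the match ends at the close of that tag.
  • (.*) finds everything after the first "body" tag, through the end of the file, and captures it as group 2 (\2). Because this consumes the rest of the text, including the second "body" tag, the pattern can match only once.
  • The replacement swaps everything that was found for group 1 + new line + appended content + new line + group 2.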

Tested in Notepad++
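
For anyone scripting this instead of using an editor, here is a minimal Python sketch of the same idea. It is an illustration only, not something tested in TextWrangler or Notepad++. The sample text and its attribute values are placeholders, and the second variant (limiting the replacement count) is an alternative that is not part of the approach above; it also assumes no > appears inside an attribute value.

```python
import re

# Sample text modeled on the question; the attribute values are placeholders.
text = (
    '<body someattribute="x">\n'
    'existing content\n'
    '<body onUnload="y">\n'
)

# Same pattern as above: lazy match of the first tag, then the rest of the
# input captured in group 2, so the pattern can only match once.
pattern = re.compile(r'(?s)(<body.*?>)(.*)')
result = pattern.sub(r'\1\nappended content\n\2', text)

# Alternative (an assumption, not from the answer): match only the tag and
# cap the number of replacements at one with count=1.
result_count = re.sub(
    r'<body[^>]*>', r'\g<0>\nappended content', text, count=1
)

print(result)
print(result_count)
```

Both variants insert the appended content after the first tag only. The first one leaves an extra blank line after it, because group 2 begins with the newline that followed the tag.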

Upvotes: 3
