tim

Reputation: 927

Sed replace every nth occurrence

I am trying to use sed to replace every other occurrence of an HTML element in a file so I can make alternating color rows.

Here is what I have tried, but it doesn't work:

sed 's/<tr valign=top>/<tr valign=top bgcolor='#E0E0E0'>/2' untitled.html

Upvotes: 15

Views: 23131

Answers (4)

You can use a Python script to fix the HTML:

from bs4 import BeautifulSoup

html_doc = """
<table>
   <tr><td>Row1 / col1</td><td>col2</td><td>col3</td></tr>
   <tr><td>Row2 / col1</td><td>col2</td><td>col3</td></tr>
   <tr><td>Row3 / col1</td><td>col2</td><td>col3</td></tr>
   <tr><td>Row4 / col1</td><td>col2</td><td>col3</td></tr>
   <tr><td>Row5 / col1</td><td>col2</td><td>col3</td></tr>
</table>
 """

soup = BeautifulSoup(html_doc, 'html.parser')

# Alternate the background color of the first cell in each row.
index = 0
for tr in soup.find_all('tr'):
    if tr.find('td'):
        if index % 2:
            tr.find('td').attrs['style'] = 'background-color: #ff0000;'
        else:
            tr.find('td').attrs['style'] = 'background-color: #00ff00;'
    index += 1

print(soup)

Upvotes: 0

Offenso

Reputation: 277

According to http://www.linuxquestions.org/questions/programming-9/replace-2nd-occurrence-of-a-string-in-a-file-sed-or-awk-800171/

Try this.

sed  '0,/<tr/! s/<tr/<TR bgcolor='#E0E0E0'/' file.txt

The exclamation mark negates the address range that runs from the beginning of the file to the first line matching "<tr", so the substitution operates on all the following lines. Note that I believe the 0,/regex/ address form is a GNU sed extension only.

If you need to operate on only the second occurrence, and ignore any subsequent matches, you can use a nested expression.

sed  '0,/<tr/! {0,/<tr/ s/<tr/<TR bgcolor='#E0E0E0'/}' file.txt

Here, the bracketed expression operates on the lines selected by the first part, but in this case it stops after changing the first matching "<tr" it sees, which is the second one in the file.
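To make the two behaviours described above easier to compare (everything after the first match, versus only the second match), here is a minimal Python sketch of the same end results. It does not model sed's internals; the function names and the sample rows are hypothetical and only illustrate the idea, assuming one `<tr` per line.

```python
def replace_after_first(lines, old, new):
    """Replace `old` on every line that follows the first line containing it."""
    out, seen_first = [], False
    for line in lines:
        if seen_first:
            line = line.replace(old, new, 1)  # first hit per line, like s/// without g
        elif old in line:
            seen_first = True  # the first matching line itself is left alone
        out.append(line)
    return out


def replace_second_only(lines, old, new):
    """Replace `old` only on the second line that contains it."""
    out, matches = [], 0
    for line in lines:
        if old in line:
            matches += 1
            if matches == 2:
                line = line.replace(old, new, 1)
        out.append(line)
    return out


# Hypothetical sample input: four rows, one per line.
rows = [f"<tr><td>Row{i}</td></tr>" for i in range(1, 5)]
new_tag = "<TR bgcolor='#E0E0E0'"

after_first = replace_after_first(rows, "<tr", new_tag)
second_only = replace_second_only(rows, "<tr", new_tag)
print("\n".join(after_first))
print("\n".join(second_only))
```

Note that neither variant alternates: the first shades every row except the first, and the second shades exactly one row.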

PS: I've found the sed FAQ to be very helpful in cases like this.

Upvotes: 0

Cheeso

Reputation: 192467

This works for me:

sed -e "s/<tr/<TR bgcolor='#E0E0E0'/g;n" simpletable.htm

sample input:

<table>
  <tr><td>Row1 / col1</td><td>col2</td><td>col3</td></tr>
  <tr><td>Row2 / col1</td><td>col2</td><td>col3</td></tr>
  <tr><td>Row3 / col1</td><td>col2</td><td>col3</td></tr>
  <tr><td>Row4 / col1</td><td>col2</td><td>col3</td></tr>
  <tr><td>Row5 / col1</td><td>col2</td><td>col3</td></tr>
</table>

sample output:

<table>
  <TR bgcolor='#E0E0E0'><td>Row1 / col1</td><td>col2</td><td>col3</td></tr>
  <tr><td>Row2 / col1</td><td>col2</td><td>col3</td></tr>
  <TR bgcolor='#E0E0E0'><td>Row3 / col1</td><td>col2</td><td>col3</td></tr>
  <tr><td>Row4 / col1</td><td>col2</td><td>col3</td></tr>
  <TR bgcolor='#E0E0E0'><td>Row5 / col1</td><td>col2</td><td>col3</td></tr>
</table>

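The key is to use the n command in sed, which prints the current line and advances to the next one. The substitution is therefore applied to odd-numbered lines of the file (1, 3, 5, ...) while even-numbered lines pass through untouched, so which rows get shaded depends on the line the first row falls on. This works only if each TR occupies its own line. It will break with nested tables, or if there are multiple TRs on a single line.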
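As a rough illustration of that line-by-line behaviour, here is a small Python sketch that substitutes on every other line, starting with the first. The function name and sample data are hypothetical, and it only mimics the effect of the sed script rather than sed itself.

```python
def shade_odd_lines(lines):
    """Substitute on lines 1, 3, 5, ... and pass lines 2, 4, 6, ... through unchanged."""
    out = []
    for number, line in enumerate(lines, start=1):
        if number % 2 == 1:
            # Same effect as s/<tr/<TR bgcolor='#E0E0E0'/g on this line.
            line = line.replace("<tr", "<TR bgcolor='#E0E0E0'")
        out.append(line)
    return out


# Hypothetical input: one row per line, starting on line 1.
rows = [f"<tr><td>Row{i}</td></tr>" for i in range(1, 5)]
shaded = shade_odd_lines(rows)
print("\n".join(shaded))

# Two rows sharing one line are both rewritten, which is the failure mode noted above.
crowded = shade_odd_lines(["<tr><td>a</td></tr><tr><td>b</td></tr>"])
print(crowded[0])
```

The closing `</tr>` tags are unaffected because they contain `</tr`, not `<tr`.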

Upvotes: 5

brandizzi

Reputation: 27050

I'd solve it with awk:

awk '/<tr valign=top>/&&v++%2{sub(/<tr valign=top>/, "<tr valign=top bgcolor='#E0E0E0'>")}{print}' untitled.html 

First, it verifies if the line contains <tr valign=top>

/<tr valign=top>/&&v++%2

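and whether this is an odd-numbered match when counting from zero, that is, the 2nd, 4th, 6th, ... occurrence in the file (v is incremented only on lines that match):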

v++%2

If so, it replaces the <tr valign=top> in the line

{sub(/<tr valign=top>/, "<tr valign=top bgcolor='#E0E0E0'>")}

Since all lines are to be printed, there is a final block with no condition, which is executed for every line and prints the current line:

{print}
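For readers less familiar with awk, the same counting logic can be sketched in Python. This is only an equivalent under the assumption of one `<tr valign=top>` per line; the function name, constants, and sample rows are hypothetical.

```python
PLAIN = "<tr valign=top>"
SHADED = "<tr valign=top bgcolor='#E0E0E0'>"


def shade_alternate_rows(lines):
    """Shade the 2nd, 4th, ... line that contains PLAIN."""
    seen = 0  # plays the role of awk's v: matches counted so far
    out = []
    for line in lines:
        if PLAIN in line:
            if seen % 2:  # odd count so far, so this is an even-numbered match
                line = line.replace(PLAIN, SHADED, 1)  # like sub(): first hit only
            seen += 1  # incremented only on matching lines, like v++ after &&
        out.append(line)  # every line is emitted, like the bare {print}
    return out


# Hypothetical sample input with a non-row line before and after the rows.
sample = (
    ["<table>"]
    + [f"<tr valign=top><td>Row{i}</td></tr>" for i in range(1, 5)]
    + ["</table>"]
)
result = shade_alternate_rows(sample)
print("\n".join(result))
```

Because the counter advances only on matching lines, non-row lines such as `<table>` do not shift which rows are shaded.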

Upvotes: 11
