Reputation: 927
I am trying to use sed to replace every other occurrence of an HTML element in a file so I can make alternating color rows.
Here is what I have tried, but it doesn't work:
sed 's/<tr valign=top>/<tr valign=top bgcolor='#E0E0E0'>/2' untitled.html
Upvotes: 15
Views: 23131
Reputation: 4243
You can use a Python script (with BeautifulSoup) to fix the HTML:
from bs4 import BeautifulSoup
html_doc = """
<table>
<tr><td>Row1 / col1</td><td>col2</td><td>col3</td></tr>
<tr><td>Row2 / col1</td><td>col2</td><td>col3</td></tr>
<tr><td>Row3 / col1</td><td>col2</td><td>col3</td></tr>
<tr><td>Row4 / col1</td><td>col2</td><td>col3</td></tr>
<tr><td>Row5 / col1</td><td>col2</td><td>col3</td></tr>
</table>
"""
soup = BeautifulSoup(html_doc, 'html.parser')
index = 0
for tr in soup.find_all('tr'):
    if tr.find('td'):
        # Alternate the background color of the first cell in each data row
        if index % 2:
            tr.find('td').attrs['style'] = 'background-color: #ff0000;'
        else:
            tr.find('td').attrs['style'] = 'background-color: #00ff00;'
        index += 1
print(soup)
Upvotes: 0
Reputation: 277
According to http://www.linuxquestions.org/questions/programming-9/replace-2nd-occurrence-of-a-string-in-a-file-sed-or-awk-800171/
Try this.
sed "0,/<tr/! s/<tr/<TR bgcolor='#E0E0E0'/" file.txt
The exclamation mark negates the address range that runs from the beginning of the file to the first line matching <tr, so the substitution operates on all the following lines. Note that this changes every later <tr line, not every other one, and that I believe the 0,/regex/ address is a GNU sed feature only.
If you need to operate on only the second occurrence, and ignore any subsequent matches, you can use a nested expression.
sed "0,/<tr/! {0,/<tr/ s/<tr/<TR bgcolor='#E0E0E0'/}" file.txt
Here, the bracketed expression only sees the lines that the first part lets through, and its own range ends at the first matching <tr line it meets, so only that one line (the second match overall) is changed.
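If it helps to see the "second match only" idea outside sed, here is a minimal Python sketch of it. The helper name replace_nth_matching_line and the sample rows are made up for illustration, and it assumes (like the sed commands) that matches are counted per line, not per occurrence within a line.

```python
def replace_nth_matching_line(lines, needle, replacement, n=2):
    """Hypothetical helper: rewrite only the n-th line that contains `needle`."""
    seen = 0  # number of matching lines met so far
    out = []
    for line in lines:
        if needle in line:
            seen += 1
            if seen == n:
                # Change just the first occurrence on that line, like s/// without g
                line = line.replace(needle, replacement, 1)
        out.append(line)
    return out


sample = [
    "<table>",
    "<tr><td>a</td></tr>",
    "<tr><td>b</td></tr>",
    "<tr><td>c</td></tr>",
    "</table>",
]
result = replace_nth_matching_line(sample, "<tr", "<TR bgcolor='#E0E0E0'")
print("\n".join(result))
```

Only the row containing "b" should come out changed; the first and third rows are left alone.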
P.S. I've found the sed FAQ to be very helpful in cases like this.
Upvotes: 0
Reputation: 192467
This works for me:
sed -e "s/<tr/<TR bgcolor='#E0E0E0'/g;n" simpletable.htm
sample input:
<table>
<tr><td>Row1 / col1</td><td>col2</td><td>col3</td></tr>
<tr><td>Row2 / col1</td><td>col2</td><td>col3</td></tr>
<tr><td>Row3 / col1</td><td>col2</td><td>col3</td></tr>
<tr><td>Row4 / col1</td><td>col2</td><td>col3</td></tr>
<tr><td>Row5 / col1</td><td>col2</td><td>col3</td></tr>
</table>
sample output:
<table>
<tr><td>Row1 / col1</td><td>col2</td><td>col3</td></tr>
<TR bgcolor='#E0E0E0'><td>Row2 / col1</td><td>col2</td><td>col3</td></tr>
<tr><td>Row3 / col1</td><td>col2</td><td>col3</td></tr>
<TR bgcolor='#E0E0E0'><td>Row4 / col1</td><td>col2</td><td>col3</td></tr>
<tr><td>Row5 / col1</td><td>col2</td><td>col3</td></tr>
</table>
The key is the n command in sed, which prints the current line and reads the next one without applying the substitution to it. The substitution therefore runs on lines 1, 3, 5, ... of the file; with <table> on line 1, those are the lines holding Row2 and Row4.
This works only if each TR occupies its own line, with no other lines in between.
It will break with nested tables, or if there are multiple TRs on a single line.
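To make the line-skipping behavior concrete, here is a rough Python sketch of what a substitute-then-n cycle does. The function name substitute_every_other_line and the three sample rows are assumptions for illustration, not part of sed.

```python
import re


def substitute_every_other_line(lines, pattern, replacement):
    """Hypothetical helper: substitute on lines 1, 3, 5, ... and pass
    lines 2, 4, 6, ... through untouched."""
    out = []
    it = iter(lines)
    for line in it:
        # The s///g step: runs on the line that starts each cycle
        out.append(re.sub(pattern, replacement, line))
        # The n step: fetch the next line and emit it unchanged
        skipped = next(it, None)
        if skipped is not None:
            out.append(skipped)
    return out


sample = [
    "<tr><td>a</td></tr>",
    "<tr><td>b</td></tr>",
    "<tr><td>c</td></tr>",
]
result = substitute_every_other_line(sample, r"<tr", "<TR bgcolor='#E0E0E0'")
print("\n".join(result))
```

Because the choice depends purely on line position, adding or removing a single line above the rows flips which rows get shaded.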
Upvotes: 5
Reputation: 27050
I'd solve it with awk:
awk '/<tr valign=top>/&&v++%2{sub(/<tr valign=top>/, "<tr valign=top bgcolor='\''#E0E0E0'\''>")}{print}' untitled.html
First, it checks whether the line contains <tr valign=top>
/<tr valign=top>/&&v++%2
and whether this is the 2nd, 4th, 6th, ... such line (v counts the matches seen so far, so v++%2 is non-zero on every other match):
v++%2
If so, it replaces the <tr valign=top> in that line:
{sub(/<tr valign=top>/, "<tr valign=top bgcolor='#E0E0E0'>")}
Since every line must be printed, a final block with no condition runs for all lines and prints the current one:
{print}
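For comparison, the same counting logic can be sketched in Python. This is only an illustration: the function name shade_alternate_rows and the sample rows are made up here, and it assumes, like the awk one-liner, at most one <tr valign=top> per line.

```python
def shade_alternate_rows(lines,
                         tag="<tr valign=top>",
                         shaded="<tr valign=top bgcolor='#E0E0E0'>"):
    """Hypothetical helper mirroring the awk logic: count the lines that
    contain `tag` and rewrite the 2nd, 4th, 6th, ... of them."""
    seen = 0  # plays the role of awk's v
    out = []
    for line in lines:
        if tag in line:
            if seen % 2:  # an odd number of earlier matches
                line = line.replace(tag, shaded, 1)  # like sub(): first occurrence only
            seen += 1  # counted only for matching lines, like v++
        out.append(line)  # like {print}: every line is emitted
    return out


rows = [
    "<table>",
    "<tr valign=top><td>1</td></tr>",
    "<tr valign=top><td>2</td></tr>",
    "<tr valign=top><td>3</td></tr>",
    "</table>",
]
shaded_rows = shade_alternate_rows(rows)
print("\n".join(shaded_rows))
```

Unlike a position-based approach, the counter only advances on matching lines, so blank lines or other markup between rows do not disturb the alternation.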
Upvotes: 11