Reputation: 1594
Using Cygwin64 here.
Here's an extract of my file. Notice the product_id is not unique.
<tr>
<td product_id="LRZCQPLRQW">LRZCQPLRQW</td>
<td>Crate</td>
</tr>
<tr>
<td product_id="LRZCQPLRQW">LRZCQPLRQW</td>
<td>Crate</td>
</tr>
I want to make the product_id unique by concatenating the row number after QW.
The following awk script does what I need, but it also prints the original row below the new row. If I remove {print $0}, then I only get the product_id rows.
awk '/LRZ/ {x=NR; print substr($0,1,33) x substr($0,34,12) x substr($0,46);} {print $0}' my_file.html
CURRENT RESULTS
<tr>
<td product_id="LRZCQPLRQW2">LRZCQPLRQW2</td>
<td product_id="LRZCQPLRQW">LRZCQPLRQW</td>
<td>Crate</td>
</tr>
<tr>
<td product_id="LRZCQPLRQW6">LRZCQPLRQW6</td>
<td product_id="LRZCQPLRQW">LRZCQPLRQW</td>
<td>Crate</td>
</tr>
DESIRED RESULTS
<tr>
<td product_id="LRZCQPLRQW2">LRZCQPLRQW2</td>
<td>Crate</td>
</tr>
<tr>
<td product_id="LRZCQPLRQW6">LRZCQPLRQW6</td>
<td>Crate</td>
</tr>
Upvotes: 0
Views: 82
Reputation: 203712
I've no idea why the answers so far are so complicated. Isn't this all you need?
$ awk '{gsub(/LRZ[^"<]+/,"&"NR)}1' file
<tr>
<td product_id="LRZCQPLRQW2">LRZCQPLRQW2</td>
<td>Crate</td>
</tr>
<tr>
<td product_id="LRZCQPLRQW6">LRZCQPLRQW6</td>
<td>Crate</td>
</tr>
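For anyone unfamiliar with the pieces: the regex LRZ[^"<]+ matches the ID up to the closing quote or the next tag, and & in the replacement stands for whatever was matched, so "&"NR appends the line number to every occurrence on the line. A minimal sketch that recreates the first row of the sample in a temporary file (the temp file is only for illustration):

```shell
# Recreate the first row of the question's sample in a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<tr>
<td product_id="LRZCQPLRQW">LRZCQPLRQW</td>
<td>Crate</td>
</tr>
EOF

# gsub replaces every match on the line; "&" NR is the matched text
# followed by the current line number. The trailing 1 prints each line.
result=$(awk '{ gsub(/LRZ[^"<]+/, "&" NR) } 1' "$tmp")
printf '%s\n' "$result"
rm -f "$tmp"
```

Because gsub (rather than sub) is used, both the attribute value and the cell text are changed in one call.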
Upvotes: 4
Reputation: 133545
You could also try the following awk, which avoids hardcoding any character positions: it simply appends the line number to the text inside the double quotes and to the text between > and <.
awk '/product_id/{sub(/\".[^"]*/,"&"NR);sub(/>.[^<]*/,"&"NR);} 1' Input_file
EDIT: Adding the output, as per the OP's request.
awk '/product_id/{sub(/\".[^"]*/,"&"NR);sub(/>.[^<]*/,"&"NR);} 1' Input_file
<tr>
<td product_id="LRZCQPLRQW2">LRZCQPLRQW2</td>
<td>Crate</td>
</tr>
<tr>
<td product_id="LRZCQPLRQW6">LRZCQPLRQW6</td>
<td>Crate</td>
</tr>
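To see what each sub call contributes, here is a small sketch that runs the same two substitutions on a single line taken from the question. It is fed on its own, so NR is 1 here rather than 2; the variable names are just for illustration.

```shell
# One line from the question's sample; as the only input line, its NR is 1.
line='<td product_id="LRZCQPLRQW">LRZCQPLRQW</td>'

# First sub only: appends NR to the text that follows the first double quote.
step1=$(printf '%s\n' "$line" | awk '{ sub(/\".[^"]*/, "&" NR) } 1')

# Both subs: the second appends NR to the text that follows the first >.
step2=$(printf '%s\n' "$line" | awk '{ sub(/\".[^"]*/, "&" NR); sub(/>.[^<]*/, "&" NR) } 1')

printf '%s\n%s\n' "$step1" "$step2"
```

Note that sub only replaces the first match, which is why two separate calls are needed for the attribute value and the cell text.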
Upvotes: 1
Reputation: 10865
The next statement keeps awk from executing any further actions for the current line and moves straight on to the next line of input:
$ awk '/LRZ/ {print substr($0,1,33) NR substr($0,34,12) NR substr($0,46); next} {print $0}' file
<tr>
<td product_id="LRZCQPLRQW2">LRZCQPLRQW2</td>
<td>Crate</td>
</tr>
<tr>
<td product_id="LRZCQPLRQW6">LRZCQPLRQW6</td>
<td>Crate</td>
</tr>
Or if you prefer, you can simply negate the pattern for when you want to print the original line as is:
$ awk '/LRZ/ {print substr($0,1,33) NR substr($0,34,12) NR substr($0,46)}
$0 !~ /LRZ/ {print $0}' file
<tr>
<td product_id="LRZCQPLRQW2">LRZCQPLRQW2</td>
<td>Crate</td>
</tr>
<tr>
<td product_id="LRZCQPLRQW6">LRZCQPLRQW6</td>
<td>Crate</td>
</tr>
Often this would be written more idiomatically as:
$ awk '/LRZ/ {print substr($0,1,33) NR substr($0,34,12) NR substr($0,46); next}1' file
using the next statement and the always-true pattern 1, whose default action is to print the current line.
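As a minimal sketch of the difference, here is the same rule run with and without next. The three-line input is made up for illustration and is not from the question.

```shell
# Hypothetical three-line input; only the middle line matches /LRZ/.
input=$(printf 'a\nLRZ\nb')

# Without next: the matching line is printed twice
# (once modified, then again unmodified by the 1 rule).
without_next=$(printf '%s\n' "$input" | awk '/LRZ/ { print $0 NR } 1')

# With next: awk skips the remaining rules for the matching line.
with_next=$(printf '%s\n' "$input" | awk '/LRZ/ { print $0 NR; next } 1')

printf '%s\n---\n%s\n' "$without_next" "$with_next"
```

The first result contains both LRZ2 and LRZ, which is the duplicate-row symptom from the question; the second contains only LRZ2.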
Upvotes: 1
Reputation: 881703
Simply put a next as the final command in your LRZ processing section; this will immediately move on to the next line of input:
/LRZ/{x=NR;print substr($0,1,33) x substr($0,34,12) x substr($0,46);next}{print $0}
Upvotes: 1