zundarz

Reputation: 1594

How to exclude original $0 in this awk script?

Using Cygwin64 here.

Here's an extract of my file. Notice the product_id is not unique.

    <tr>
    <td product_id="LRZCQPLRQW">LRZCQPLRQW</td>
    <td>Crate</td>
    </tr>
    <tr>
    <td product_id="LRZCQPLRQW">LRZCQPLRQW</td>
    <td>Crate</td>
    </tr>

I want to make the product_id unique by concatenating the row number after QW.

The following awk script does what I need, but it also prints the original row below the new row. If I exclude {print $0}, then I only get the product_id rows.

awk '/LRZ/ {x=NR; print substr($0,1,33) x substr($0,34,12) x substr($0,46);} {print $0}' my_file.html

CURRENT RESULTS

    <tr>
    <td product_id="LRZCQPLRQW2">LRZCQPLRQW2</td>
    <td product_id="LRZCQPLRQW">LRZCQPLRQW</td>
    <td>Crate</td>
    </tr>
    <tr>
    <td product_id="LRZCQPLRQW6">LRZCQPLRQW6</td>
    <td product_id="LRZCQPLRQW">LRZCQPLRQW</td>
    <td>Crate</td>
    </tr>

DESIRED RESULTS

    <tr>
    <td product_id="LRZCQPLRQW2">LRZCQPLRQW2</td>
    <td>Crate</td>
    </tr>
    <tr>
    <td product_id="LRZCQPLRQW6">LRZCQPLRQW6</td>
    <td>Crate</td>
    </tr>

Upvotes: 0

Views: 82

Answers (4)

Ed Morton

Reputation: 203712

I've no idea why the answers so far are so complicated. Isn't this all you need?

$ awk '{gsub(/LRZ[^"<]+/,"&"NR)}1' file
    <tr>
    <td product_id="LRZCQPLRQW2">LRZCQPLRQW2</td>
    <td>Crate</td>
    </tr>
    <tr>
    <td product_id="LRZCQPLRQW6">LRZCQPLRQW6</td>
    <td>Crate</td>
    </tr>
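
For readers unfamiliar with the idiom, here is a rough sketch of how that one-liner breaks down. In the replacement string of `gsub`, `&` stands for the matched text, so `"&" NR` appends the current record number to every match, and the trailing `1` is an always-true pattern whose default action prints the line. The function name `number_ids` and the shortened sample id `LRZAB` are made up for illustration.

```shell
# Hypothetical wrapper around the one-liner above, for illustration only.
number_ids() {
    awk '{
        # LRZ[^"<]+ matches an id up to the next double quote or tag opening.
        # In the replacement, & is the matched text; NR is appended to it.
        gsub(/LRZ[^"<]+/, "&" NR)
    }
    1   # always-true pattern with no action: print the current line
    ' "$@"
}

# Made-up two-line sample; only line 2 contains an id.
printf '%s\n' '<tr>' '<td product_id="LRZAB">LRZAB</td>' | number_ids
```

With this sample, the first line should pass through unchanged and both occurrences of the id on the second line should get `2` appended.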

Upvotes: 4

RavinderSingh13

Reputation: 133545

You could also try the following awk, which does not hardcode any character positions. It simply appends the line number to the text inside the double quotes and to the text between > and <.

awk '/product_id/{sub(/\".[^"]*/,"&"NR);sub(/>.[^<]*/,"&"NR);} 1'  Input_file

EDIT: Adding output as per OP's request here.

awk '/product_id/{sub(/\".[^"]*/,"&"NR);sub(/>.[^<]*/,"&"NR);} 1' Input_file
<tr>
<td product_id="LRZCQPLRQW2">LRZCQPLRQW2</td>
<td>Crate</td>
</tr>
<tr>
<td product_id="LRZCQPLRQW6">LRZCQPLRQW6</td>
<td>Crate</td>
</tr>
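
As a minimal sketch of why two `sub` calls are used: `sub` replaces only the first match on a line, so one call handles the quoted attribute value and the other handles the cell text. The function name `append_nr` and the sample id `ABC` are hypothetical, and the regexes are a slightly simplified variant of the ones above (an assumption, not the answer's exact patterns).

```shell
# Hypothetical helper; simplified variant of the patterns in the answer.
append_nr() {
    awk '/product_id/ {
        sub(/"[^"]+/, "&" NR)   # opening quote plus the attribute value
        sub(/>[^<]+/, "&" NR)   # first closing bracket plus the cell text
    }
    1' "$@"
}

# Made-up sample: only the first line carries a product_id attribute.
printf '%s\n' '<td product_id="ABC">ABC</td>' '<td>Crate</td>' | append_nr
```

Lines without `product_id` never enter the action block, so they are printed as is by the trailing `1`.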

Upvotes: 1

jas

Reputation: 10865

The next statement stops awk from running any further rules on the current line and moves straight on to the next line of input:

 $ awk '/LRZ/ {print substr($0,1,33) NR substr($0,34,12) NR substr($0,46); next} {print $0}' file
   <tr>
   <td product_id="LRZCQPLRQW2">LRZCQPLRQW2</td>
   <td>Crate</td>
   </tr>
   <tr>
   <td product_id="LRZCQPLRQW6">LRZCQPLRQW6</td>
   <td>Crate</td>
   </tr>

Or if you prefer, you can simply negate the pattern for when you want to print the original line as is:

$ awk '/LRZ/ {print substr($0,1,33) NR substr($0,34,12) NR substr($0,46)}
      $0 !~ /LRZ/ {print $0}' file
   <tr>
   <td product_id="LRZCQPLRQW2">LRZCQPLRQW2</td>
   <td>Crate</td>
   </tr>
   <tr>
   <td product_id="LRZCQPLRQW6">LRZCQPLRQW6</td>
   <td>Crate</td>
   </tr>

Often this would be written more idiomatically as:

$ awk '/LRZ/ {print substr($0,1,33) NR substr($0,34,12) NR substr($0,46); next}1' file

using the next statement and the always-true pattern 1 whose default action is to print the original line.
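
To see the effect of `next` in isolation, here is a small side-by-side sketch on made-up input (the `sample` function and its three lines are hypothetical and not taken from the question's file):

```shell
# Hypothetical three-line input; only the middle line matches /LRZ/.
sample() { printf '%s\n' 'keep' 'LRZ row' 'keep too'; }

# Without next: the matching line is handled by both rules,
# so the modified copy and the original are both printed (4 lines).
sample | awk '/LRZ/ {print $0 NR} 1'

# With next: awk skips the remaining rules for the matching line,
# so only the modified copy is printed (3 lines).
sample | awk '/LRZ/ {print $0 NR; next} 1'
```

The first command reproduces the duplicate-row symptom from the question; the second shows the same program with the single `next` added.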

Upvotes: 1

paxdiablo

Reputation: 881703

Simply put a next as the final command in your LRZ processing section. This makes awk move immediately to the next line, skipping the {print $0} rule:

awk '/LRZ/{x=NR;print substr($0,1,33) x substr($0,34,12) x substr($0,46);next}{print $0}' my_file.html

Upvotes: 1
