Atomiklan

Reputation: 5434

Extracting text from the 1st and 4th column of an HTML table row with "grep"

Can anyone see what I'm doing wrong in the grep statement? I think I'm just missing an escape character.

for i in "${!PORTARR[@]}"; do
  grep \<td\>"${!PORTARR[i]}"\<\/td\> tmp/portlist >> databases/ports.db
done

*** UPDATE ***

Well, unfortunately, that's not going to work. Here is what I am ultimately trying to do.

From this string:

<tr><td>4</td><td>TCP</td><td>UDP</td><td>Unassigned</td><td>Official</td></tr>

I need to get this:

4,Unassigned

Upvotes: 1

Views: 287

Answers (1)

ruakh

Reputation: 183301

I suspect that you meant to write ${PORTARR[i]} (the ith element of PORTARR) instead of ${!PORTARR[i]} (the value of the variable named by the ith element of PORTARR); so:

for i in "${!PORTARR[@]}"; do
  grep \<td\>"${PORTARR[i]}"\<\/td\> tmp/portlist >> databases/ports.db
done
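
To make the difference between these expansions concrete, here is a minimal sketch. The array contents and the function name demo_expansions are made up for illustration; only the expansions themselves come from the question.

```shell
# Hypothetical sample data, only for illustration.
PORTARR=(4 80)

demo_expansions() {
  local i
  # "${!PORTARR[@]}" expands to the array's indices (here 0 and 1).
  for i in "${!PORTARR[@]}"; do
    # "${PORTARR[i]}" is the element stored at index i.
    printf '%s=%s\n' "$i" "${PORTARR[i]}"
  done
}

demo_expansions
```

By contrast, "${!PORTARR[i]}" would take the element (say 4) as the name of another variable and expand that variable, which is almost certainly not what the grep pattern needs.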

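But I'd recommend a few other tweaks as well: iterate over the elements directly rather than the indices, quote the whole pattern instead of escaping each character, and redirect the output of the entire loop once (note that > overwrites databases/ports.db, whereas your >> appends to it):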

for elem in "${PORTARR[@]}"; do
  grep "<td>$elem</td>" tmp/portlist
done > databases/ports.db

Update: For your updated question, I think you're better off using a real programming language, like Perl:

perl -we ' my @ports = @ARGV;
           @ARGV = ();
           my %ports = map +($_ => undef), @ports;
           while(<>) {
               my @fields = m/<td>([^<]*)<\/td>/g;
               if(exists $ports{$fields[0]}) {
                   $ports{$fields[0]} = $fields[3];
               }
           }
           foreach my $port (@ports) {
               if(defined $ports{$port}) {
                   print "$port,$ports{$port}\n";
               }
           }
         ' "${PORTARR[@]}" < tmp/portlist > databases/ports.db

(Disclaimer: not tested.)
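
If you would rather stay in the shell, something along these lines might also work. It is only a sketch: the function name extract_ports is hypothetical, and it assumes every table row sits on a single line shaped exactly like the sample in the question, with no nested markup inside the cells.

```shell
# Hypothetical helper: print "port,description" for each requested port.
# Usage: extract_ports FILE PORT...
extract_ports() {
  local file=$1
  shift
  local port
  for port in "$@"; do
    # Splitting each line on "<td>" or "</td>" puts the 1st cell in $2
    # and the 4th cell in $8 (the odd-numbered fields in between are empty).
    awk -F'</?td>' -v p="$port" '$2 == p { print $2 "," $8 }' "$file"
  done
}
```

With the files from the question it could be called as extract_ports tmp/portlist "${PORTARR[@]}" > databases/ports.db. Unlike the Perl version, this rereads the file once per port, which is fine for a short list but slower for a long one.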

Upvotes: 3
