Peaceful_Warrior

Reputation: 59

AWK using file to remove csv rows

I have the following csv:

old.csv

irrelevant,irrelevant,[email protected],irrelevant
irrelevant,irrelevant,[email protected],irrelevant
irrelevant,irrelevant,[email protected],irrelevant
irrelevant,irrelevant,[email protected],irrelevant
irrelevant,irrelevant,[email protected],irrelevant
irrelevant,irrelevant,[email protected],irrelevant
irrelevant,irrelevant,[email protected],irrelevant

I need to remove the rows whose email address appears in this file:

remove.txt

[email protected]
[email protected]
[email protected]
[email protected]

And I need the output to be this:

new.csv

irrelevant,irrelevant,[email protected],irrelevant
irrelevant,irrelevant,[email protected],irrelevant
irrelevant,irrelevant,[email protected],irrelevant

I've tried this, but it doesn't work. Can anyone help?

awk -F, 'BEGIN{IGNORECASE = 1};NR==FNR{remove[$1]++;next}!($1 in remove)' remove.txt old.csv > new.csv 

Upvotes: 0

Views: 102

Answers (2)

Ed Morton

Reputation: 203512

  1. IGNORECASE is gawk-specific, and you may not be using gawk.
  2. You're testing the wrong field: the email is in $3, not $1.
  3. Incrementing the array element does nothing useful; referencing it is enough to create the key.

Try this:

awk -F, 'NR==FNR{remove[tolower($1)];next}!(tolower($3) in remove)' remove.txt old.csv > new.csv 
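To see what that command does on concrete data, here is a small sketch. The addresses (alice@example.com and so on) and the temporary directory are made up for illustration, since the real addresses are not shown in the question; only the file names and the awk command come from the thread.

```shell
# Hypothetical sample data; these addresses are invented for the demo.
dir=$(mktemp -d)

cat > "$dir/old.csv" <<'EOF'
x,x,Alice@example.com,x
x,x,bob@example.com,x
x,x,carol@example.com,x
x,x,DAVE@example.com,x
EOF

cat > "$dir/remove.txt" <<'EOF'
alice@example.com
Dave@Example.com
EOF

# First file (NR==FNR): store each lowercased address as an array key.
# Second file: print a row only if its lowercased 3rd field is not a key.
awk -F, 'NR==FNR{remove[tolower($1)];next}!(tolower($3) in remove)' \
    "$dir/remove.txt" "$dir/old.csv" > "$dir/new.csv"

cat "$dir/new.csv"
```

With this sample input only the bob and carol rows should remain, because both sides of the comparison are lowercased before the lookup.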

Upvotes: 1

sat

Reputation: 14949

With grep:

grep -v -i -f remove.txt old.csv

Here,

  • -f - Obtain patterns from FILE, one per line.
  • -i - Ignore case
  • -v - Invert the matching
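One caveat with the flags above: with -f alone, grep treats each line of the pattern file as a regular expression, so a "." in an address matches any character. Adding the standard -F flag makes grep treat the patterns as fixed strings instead. A rough sketch of the difference, using invented addresses and a temporary directory (both are assumptions for the demo):

```shell
# Hypothetical sample data; these addresses are invented for the demo.
gdir=$(mktemp -d)

cat > "$gdir/old.csv" <<'EOF'
x,x,a.b@example.com,x
x,x,aXb@example.com,x
x,x,carol@example.com,x
EOF

cat > "$gdir/remove.txt" <<'EOF'
A.B@example.com
EOF

# Patterns as regexes: the "." also matches the "X", so two rows are dropped.
regex_out=$(grep -v -i -f "$gdir/remove.txt" "$gdir/old.csv")

# Patterns as fixed strings (-F): only the literal address is dropped.
fixed_out=$(grep -v -i -F -f "$gdir/remove.txt" "$gdir/old.csv")

echo "regex patterns:"
echo "$regex_out"
echo "fixed-string patterns:"
echo "$fixed_out"
```

Even with -F, grep still matches the address anywhere in the row rather than only in the third field, so the awk versions are the safer choice when other columns could contain an address.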

With awk (gawk only, because of IGNORECASE; each address is used as a regex here):

awk -F, 'BEGIN{IGNORECASE=1} NR==FNR{a[$1];next} {for(var in a) if($3 ~ var) next; print}' remove.txt old.csv

Another awk:

awk -F, 'NR==FNR{a[tolower($1)]++;next} !(tolower($3) in a){print}' remove.txt old.csv

Your command won't work, because

IGNORECASE=1

only affects regex matching such as if (x ~ /ab/), not array index lookups of the form

index in array
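A quick way to check this yourself is the sketch below. The address and the array name seen are made up for illustration. It should behave the same in gawk and in other awks, since outside gawk IGNORECASE is just an ordinary variable.

```shell
out=$(awk 'BEGIN {
    IGNORECASE = 1                 # special in gawk only; a plain variable elsewhere
    seen["bob@example.com"]        # referencing the element creates the key

    # Array lookups compare keys exactly, whatever IGNORECASE is set to.
    print (("BOB@example.com" in seen) ? "found" : "not found")

    # Lowercasing the key before the lookup makes the test case-insensitive.
    print ((tolower("BOB@example.com") in seen) ? "found" : "not found")
}')

echo "$out"
```

The first lookup reports "not found" and the second "found", which is why the tolower() versions work where the original command does not.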

Upvotes: 2
