Reputation: 344
I have a CSV file that contains two columns. The first column is a list of all subscribers and the second column is a list of subscribers who need to be excluded from a mailing:
all,exclusions
[email protected],[email protected]
[email protected],[email protected]
[email protected]
[email protected]
[email protected]
I need to end up with all subscribers from the first column who are not listed in the second column. The desired output is something like this:
[email protected]
[email protected]
[email protected]
So far all I have is this:
awk -F, '(NR>1) {if($1!=$2) {print}}' subs.csv
This of course will only filter out the rows when there are matching values in both columns on the same row. Thanks for any help.
Upvotes: 3
Views: 171
Reputation: 1126
With two arrays. The first field, $1, is the list of all subscribers and is used as the index of an array called a. The second field, $2, is the list of subscribers who need to be excluded and is used as the index of an array called b. In the END block, for (i in a) if (!(i in b)) print i prints the subscribers from the first column who are not listed in the second column:
awk -v FS=',' '
NR > 1 {a[$1];b[$2]}
END{for (i in a) if (!(i in b)) print i}
' file
[email protected]
[email protected]
[email protected]
Or using the continue statement, which skips to the next iteration of the loop:
awk -v FS=',' '
NR > 1 {a[$1];b[$2]}
END{for (i in a) if (i in b) continue;else print i}
' file
[email protected]
[email protected]
[email protected]
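As a self-contained illustration of the two-array idea above, here is a minimal sketch. The addresses are made up for the example, and filter_subs is a hypothetical helper name, not something from the question. The sort at the end is only there because the order of for (i in a) is unspecified in awk.

```shell
# a collects every subscriber (column 1), b collects every exclusion (column 2).
# Rows without a second column add an empty key to b, which is harmless here.
# sort makes the output order deterministic.
filter_subs() {
  awk -F, 'NR > 1 { a[$1]; b[$2] }
           END { for (i in a) if (!(i in b)) print i }' "$@" | sort
}

# Hypothetical sample data, fed on stdin (no file argument given).
filter_subs <<'EOF'
all,exclusions
alice@example.com,carol@example.com
bob@example.com,erin@example.com
carol@example.com
dave@example.com
erin@example.com
EOF
```

With this sample it should print alice@example.com, bob@example.com and dave@example.com, one per line. Passing a file name, as in filter_subs subs.csv, works the same way.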
Upvotes: 1
Reputation: 88819
With an array. I assume that there are no duplicates in the first column.
awk -F ',' 'NR>1{
array[$1]++; array[$2]--
}
END{
for(i in array){ if(array[i]==1){ print i } }
}' file
As one line:
awk -F ',' 'NR>1{ array[$1]++; array[$2]-- } END{for(i in array){ if(array[i]==1){ print i } } }' file
Output:
[email protected]
[email protected]
[email protected]
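To see why the no-duplicates assumption matters, here is a small sketch with made-up addresses (count_filter is a hypothetical helper name). A subscriber listed twice in the first column ends up with a count of 2, so the ==1 test drops them even though they are not excluded.

```shell
# Counter approach: +1 per appearance in column 1, -1 per appearance in column 2.
# Only keys whose final count is exactly 1 are printed.
count_filter() {
  awk -F, 'NR > 1 { array[$1]++; array[$2]-- }
           END { for (i in array) if (array[i] == 1) print i }' "$@" | sort
}

# Hypothetical data: bob appears twice in column 1 and is never excluded.
count_filter <<'EOF'
all,exclusions
alice@example.com,carol@example.com
bob@example.com
bob@example.com
carol@example.com
EOF
```

With this input only alice@example.com is printed: carol ends at 0 because she is excluded, and bob ends at 2 because of the duplicate.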
Upvotes: 2
Reputation: 19191
For completeness, this version removes excluded entries even when values are repeated.
Data
$ cat file
all,exclusions
[email protected],[email protected]
[email protected],[email protected]
[email protected]
[email protected]
[email protected],[email protected]
[email protected],[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
$ awk -F ',' 'NR>1 && NF==1{ all[$1]++ }
NR>1 && NF==2{ all[$1]++; excl[$2]++ }
END{ for(i in excl){ all[i]=0 };
for(i in all){ if(all[i]>=1){ print i } } }' file
[email protected]
[email protected]
[email protected]
Upvotes: 2