Reputation: 1
I have a pipe-delimited file (sample below) and I need to delete records that have a null value in any of fields 2 (email), 4 (mailing-id) or 6 (comm_id). In this sample, rows 2, 3 and 4 should be deleted. The output should be saved to another file. If awk is the best option, please let me know a way to achieve this.
id|email|date|mailing-id|seg_id|comm_id|oyb_id|method
|[email protected]|2010-06-23 11:47:00|0|1234|INCLO|1000002|unknown
||2010-06-23 11:47:00|0|3984|INCLO|1000002|unknown
|[email protected]|2010-06-23 11:47:00|0||INCLO|1000002|unknown
|[email protected]|2010-06-23 11:47:00|0||INCLO|1000002|unknown
|[email protected]|2010-06-23 11:47:00|0|454|INCLO|1000002|unknown
Upvotes: 0
Views: 4076
Reputation: 327
Steve is right: fields 2 and 5 are the ones that are empty in the sample given. The email is missing in row two, and the seg_id is missing in rows three and four.
This is a slightly simplified version of Steve's solution:
awk -F "|" ' $2!="" && $5!=""' file.txt > results.txt
If columns 2, 4 and 6 are the important ones, the solution would be:
awk -F "|" ' $2!="" && $4!="" && $6!=""' file.txt > results.txt
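Since the thread is unsure which columns matter, the same filter can be written so the list of required fields is passed in as a variable. This is only a sketch: the variable name cols, the file names sample.txt and filtered.txt, and the placeholder e-mail addresses are assumptions made for illustration and are not from the original post.

```shell
# Build a small sample file; the e-mail addresses are placeholders.
cat > sample.txt <<'EOF'
id|email|date|mailing-id|seg_id|comm_id|oyb_id|method
|a@example.com|2010-06-23 11:47:00|0|1234|INCLO|1000002|unknown
||2010-06-23 11:47:00|0|3984|INCLO|1000002|unknown
|b@example.com|2010-06-23 11:47:00|0||INCLO|1000002|unknown
|c@example.com|2010-06-23 11:47:00|0||INCLO|1000002|unknown
|d@example.com|2010-06-23 11:47:00|0|454|INCLO|1000002|unknown
EOF

# cols (hypothetical name) is a comma-separated list of field numbers
# that must be non-empty for a record to be kept.
awk -F '|' -v cols='2,5' '
BEGIN { n = split(cols, need, ",") }
{
    for (i = 1; i <= n; i++)
        if ($(need[i]) == "") next   # drop the record at the first empty required field
    print                            # every required field has a value
}' sample.txt > filtered.txt
```

Changing cols to '2,4,6' checks the columns named in the question instead. The header line passes through unchanged because its own fields are never empty.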
Upvotes: 1
Reputation: 58430
This might work for you (it relies on GNU sed, because \n in the replacement text is not portable):
sed 'h;s/[^|]*/\n&/2;s/[^|]*/\n&/4;s/[^|]*/\n&/6;/\n|/d;x' file.txt > results.txt
Upvotes: 0
Reputation: 54402
Here is an awk solution that may help. However, to remove rows 2, 3 and 4, it is necessary to check for null values in fields 2 and 5 only (i.e. not fields 2, 4 and 6 as you have stated). Am I understanding things correctly? Here is the awk command to do what you want:
awk -F "|" '{ if ($2 == "" || $5 == "") next; print $0 }' file.txt > results.txt
cat results.txt:
id|email|date|mailing-id|seg_id|comm_id|oyb_id|method
|[email protected]|2010-06-23 11:47:00|0|1234|INCLO|1000002|unknown
|[email protected]|2010-06-23 11:47:00|0|454|INCLO|1000002|unknown
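To settle which fields are really empty before choosing the columns to test, a short awk report can list the empty fields of each data row by header name. This is a rough sketch rather than part of the answer above; the file names sample.txt and empty_report.txt and the placeholder e-mail addresses are assumptions.

```shell
# Build a small sample file; the e-mail addresses are placeholders.
cat > sample.txt <<'EOF'
id|email|date|mailing-id|seg_id|comm_id|oyb_id|method
|a@example.com|2010-06-23 11:47:00|0|1234|INCLO|1000002|unknown
||2010-06-23 11:47:00|0|3984|INCLO|1000002|unknown
|b@example.com|2010-06-23 11:47:00|0||INCLO|1000002|unknown
|c@example.com|2010-06-23 11:47:00|0||INCLO|1000002|unknown
|d@example.com|2010-06-23 11:47:00|0|454|INCLO|1000002|unknown
EOF

# For every data row, print the header names of the fields that are empty.
awk -F '|' '
NR == 1 { for (i = 1; i <= NF; i++) name[i] = $i; next }   # remember column names
{
    empty = ""; sep = ""
    for (i = 1; i <= NF; i++)
        if ($i == "") { empty = empty sep name[i]; sep = "," }
    print "row " (NR - 1) ": " (empty == "" ? "none" : empty)
}' sample.txt > empty_report.txt
```

Run against the sample as pasted, the report shows email empty on row 2 and seg_id empty on rows 3 and 4, which matches checking fields 2 and 5. It also shows id empty on every row, because each data line starts with a pipe.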
HTH
Upvotes: 1