Matt Rogers

Reputation: 109

Bash: Remove lines from one CSV that don't meet conditions, then remove lines from a second CSV that aren't in the first

I hope you're all having a jolly nice day. I have a question I could do with some help on. I have two CSV files, both of which contain 250k records. One looks like this:

ProductCode,StockLevel,Active
4101385360060,0,1
6019002320030,0,0
3328200078030,0,0
2411000347030,93,1

And the other like this:

ProductURL,Product Code,Product Name,More Fields
"http://www.domain.com",4101385360060,Timberland 6 In Buck Boot WHEAT NUBUCK

What I want to do with the first file is remove all records which don't meet these conditions:

<ProductCode: any value>,<StockLevel != 0>,<Active == 1>

Then, with the second file, remove all rows whose product code is NOT in the filtered first file.

Any help - hugely appreciated. I suspect that awk is my friend.

Cheers,

Matt.

Upvotes: 1

Views: 175

Answers (3)

Ed Morton

Reputation: 203209

It sounds like this is what you want:

$ cat tst.awk
BEGIN { FS="," }                               # comma-separated input
FNR==1 { out=FILENAME"_out"; valid[$1] }       # each file: name its output file and keep its header line
NR==FNR && ($2!=0) && ($3==1) { valid[$1] }    # file1 only: remember codes with StockLevel!=0 and Active==1
$1 in valid { print > out }                    # both files: print rows whose first field is a remembered code

$ awk -f tst.awk file1 file2

$ cat file1_out
ProductCode,StockLevel,Active
2411000347030,93,1

$ cat file2_out
Product Code,Product Name

If not, edit your question to clarify your requirements and provide more truly representative sample input and the associated expected output.

Upvotes: 2

Juan Diego Godoy Robles

Reputation: 14945

The OP commented that the id is not the first field of the second file, so one way could be:

# file1: keep rows with StockLevel!=0 and Active==1, save them, and emit a ",code," grep pattern for each
awk -F, 'NR>1 && $2 && $3==1' file1 | tee -a file1_filtered |
  while IFS=, read -r id dummy; do
    echo ",$id,"
  done > pattern_file
grep -f pattern_file file2 > file2_filtered   # file2: keep only rows matching one of those patterns
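
Since the generated patterns are just comma-delimited codes (with the sample data, pattern_file would hold the single line ,2411000347030,), grep's fixed-string mode also works and avoids any regex interpretation of the patterns:

grep -F -f pattern_file file2 > file2_filtered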

Upvotes: 1

anubhava

Reputation: 784948

You can use this awk command:

awk -F, 'FNR==NR && $2!=0 && $3==1{a[$1]; next} $1 in a' file1.csv file2.csv

However, I suggest you provide more sample data with the expected output in your question.
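
Note that in the sample shown in the question the product code appears to be the second field of file2.csv rather than the first. Under that assumption, a minimal variant of the same idea keys the lookup on $2 of the second file and keeps its header row:

awk -F, 'FNR==NR{if($2!=0 && $3==1) a[$1]; next} FNR==1 || ($2 in a)' file1.csv file2.csv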

Upvotes: 2
