Reputation: 90
I am very new to Linux and awk, and I couldn't find an answer to the following question.
I want to use awk on a file that is structured like this:
Date ID Size
2016-11-09 688 47
2016-11-09 688 56
2016-11-09 31640 55
Now I want to sum the Size of all lines that share the same Date and ID, and export the result to a .csv file. The file should look like this:
Date,ID,Size
2016-11-09,688,103
2016-11-09,31640,55
I really need your help, because I could not figure out how to do it on my own, thank you.
Upvotes: 0
Views: 101
Reputation: 204228
If your input is really sorted by date and ID as in your sample then you should use this:
$ cat tst.awk
BEGIN { OFS="," }
NR==1 { $1=$1; print; next }
{ curr = $1 OFS $2 }
(curr != prev) && (NR > 2) { print prev, sum; sum=0 }
{ prev = curr; sum += $3 }
END { print prev, sum }
$ awk -f tst.awk file
Date,ID,Size
2016-11-09,688,103
2016-11-09,31640,55
rather than saving the whole file in memory. Note that this approach also produces output in the same order as the input, whereas any for .. in .. loop in an END section will print the output in arbitrary (hash) order.
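For comparison, if the input is not guaranteed to be sorted, here is a rough sketch of the array-based alternative. It is only an illustration, not a replacement for the script above: it keeps one sum per Date/ID pair in memory, and it records the order in which each pair first appears so that the END section can loop over a numeric index instead of using for .. in ... The file names file and out.csv, the array names sum and order, and the shuffled sample data are assumptions made for this example.

```shell
# Sample input with the two "688" rows deliberately separated (unsorted).
# "file" and "out.csv" are hypothetical names used only for this sketch.
printf '%s\n' 'Date ID Size' \
              '2016-11-09 688 47' \
              '2016-11-09 31640 55' \
              '2016-11-09 688 56' > file

awk '
BEGIN   { OFS = "," }
NR == 1 { $1 = $1; print; next }             # rebuild the header with commas
{
    key = $1 OFS $2                          # Date,ID identifies a group
    if (!(key in sum)) order[++n] = key      # remember first-seen order
    sum[key] += $3                           # accumulate Size per group
}
END     { for (i = 1; i <= n; i++) print order[i], sum[order[i]] }
' file > out.csv

cat out.csv
```

With that sample input, out.csv should contain the header followed by 2016-11-09,688,103 and 2016-11-09,31640,55. The same `> out.csv` redirection can be added to the `awk -f tst.awk file` command to get the .csv file the question asks for.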
Upvotes: 2