Reputation: 361
I have a CSV log file that records the properties HA;HB;HC;HD;HE. The following file records 6 entries (separated by the above header).
I would like to extract the 3rd property (HC) of every entry.
HA;HB;HC;HD;HE
a1;b1;14;d;e
HA;HB;HC;HD;HE
a2;b2;28;d;e
HA;HB;HC;HD;HE
a31;b31;44;d;e
a32;b32;07;d;e
HA;HB;HC;HD;HE
a4;b4;0;d;e
HA;HB;HC;HD;HE
a51;b51;32;d;e
a52;b52;0;d;e
a53;b53;5;d;e
HA;HB;HC;HD;HE
a6;b6;10;d;e
Whenever an entry has n lines recorded under its header, I want the sum of the HC values of those n lines.
The expected output for the above file:
14
28
51
0
37
10
I know I can write a program for this, but is there an easy way to get this with a combination of awk and/or sed commands?
Upvotes: 0
Views: 100
Reputation: 203254
$ awk -F';' '$3=="HC"{if (NR>1) print s; s=0; next} {s+=$3} END{print s}' file
14
28
51
0
37
10
Upvotes: 0
Reputation: 2882
awk solution:
$ awk -F';' '$3=="HC" && p{
    print sum         # print current total
    sum=p=0           # reinitialize sum and p
    next
}
$3!="HC"{
    sum=sum+($3+0)    # force $3 to a number, then add it to sum
    p=1               # remember that at least one data line was seen
}
# print last sum
END{print sum}' input.txt
output:
14
28
51
0
37
10
one-liner:
$ awk -F";" '$3=="HC" && p{print sum;sum=p=0;next} $3!="HC"{sum=sum+($3+0);p=1} END{print sum}' input.txt
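As a quick check of the `$3+0` coercion used above, here is a minimal sketch. The function name `coerce_demo` is made up for illustration; it feeds one sample line to awk and prints the third field first as text, then as a number.

```shell
# coerce_demo: hypothetical helper, not part of the answer above.
coerce_demo() {
  # $3 on its own is printed as the original string; $3 + 0 is numeric.
  printf '%s\n' 'a32;b32;07;d;e' | awk -F';' '{ print $3, $3 + 0 }'
}

coerce_demo   # prints: 07 7
```

Arithmetic operators in awk already treat their operands as numbers, so `sum += $3` should give the same totals; the explicit `+0` mainly documents the intent.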
Upvotes: 1
Reputation: 133458
Could you please try the following and let me know if it helps?
awk -F";" '
/^H/ && $3!="HC"{
    flag="";
    next
}
/^H/ && $3=="HC"{
    if(NR>1){
        printf("%d\n",sum)
    };
    sum=0;
    flag=1;
    next
}
flag{
    sum+=$3
}
END{
    printf("%d\n",sum)
}
' Input_file
Output will be as follows.
14
28
51
0
37
10
Upvotes: 0
Reputation: 7922
eval "true || $(cat data.csv|cut -d ";" -f3 |sed -e s/"HC"/"0; expr 0"/g |tr '\n' '@'|sed -e s/"@@"/""/g|sed -e s/"@"/" + "/g)"
Explanation:
cat the file.
cut out the third field, using a delimiter of ;
Replace the HC lines with 0; expr 0 to start building eval-worthy bash expressions that eventually yield expr 0 + 14;
Replace \n newlines temporarily with @ to circumvent possible BSD sed limitations.
Delete every @@ to keep blank lines from turning into spaces and causing expr to bomb out.
Replace each remaining @ with " + " to add the numbers together.
Prefix the result with true || so that it starts with true || 0; expr ... and avoids a guaranteed syntax error on the first line.
Which creates this:
true || 0; expr 0 + 14 + 0; expr 0 + 28 + 0; expr 0 + 44 + 07 + 0; expr 0 + 0 + 0; expr 0 + 32 + 0 + 5 + 0; expr 0 + 10
The output looks like this:
14
28
51
0
37
10
This was tested on Bash 3.2 and MacOS El Capitan.
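For comparison, the same grouping can be sketched without eval. This is only an illustration, assuming a POSIX shell with expr available; the function name `sum_hc_groups` and the temporary sample file are made up for the example and are not part of the answer above.

```shell
# sum_hc_groups FILE: print one total per HC header block (hypothetical helper).
sum_hc_groups() {
  sum=0
  seen_header=0
  while IFS=';' read -r _a _b hc _rest; do
    case $hc in
      HC)   # header line: flush the previous block, if there was one
        if [ "$seen_header" -eq 1 ]; then echo "$sum"; fi
        sum=0
        seen_header=1
        ;;
      '')   # blank or short line: nothing to add
        ;;
      *)    # data line: expr reads 07 as decimal 7
        sum=$(expr "$sum" + "$hc") || :   # expr exits 1 when the result is 0
        ;;
    esac
  done < "$1"
  if [ "$seen_header" -eq 1 ]; then echo "$sum"; fi
}

# Small sample taken from the question's data.
sample=$(mktemp)
printf '%s\n' 'HA;HB;HC;HD;HE' 'a31;b31;44;d;e' 'a32;b32;07;d;e' \
              'HA;HB;HC;HD;HE' 'a4;b4;0;d;e' > "$sample"
sum_hc_groups "$sample"   # prints 51, then 0
rm -f "$sample"
```

The loop trades the single pipeline for one expr call per data line, so it is slower on large files, but no generated text is ever executed.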
Upvotes: 0
Reputation: 16997
awk -F';' '/^H.*/{if(f)print s;s=0;f=$3=="HC"}f{s+=$3}END{if(f)print s}' infile
For given inputs:
$ cat infile
HA;HB;HC;HD;HE
a1;b1;14;d;e
HA;HB;HC;HD;HE
a2;b2;28;d;e
HA;HB;HC;HD;HE
a31;b31;44;d;e
a32;b32;07;d;e
HA;HB;HC;HD;HE
a4;b4;0;d;e
HA;HB;HC;HD;HE
a51;b51;32;d;e
a52;b52;0;d;e
a53;b53;5;d;e
HA;HB;HC;HD;HE
a6;b6;10;d;e
$ awk -F';' '/^H.*/{if(f)print s; s=0; f=$3=="HC"}f{s+=$3}END{if(f)print s}' infile
14
28
51
0
37
10
It also takes a little more care with irregular input, for example a header whose 3rd column is not HC:
$ cat infile2
HA;HB;HC;HD;HE
a1;b1;14;d;e
HA;HB;HC;HD;HE
a2;b2;28;d;e
HA;HB;HC;HD;HE
a31;b31;44;d;e
a32;b32;07;d;e
HA;HB;HC;HD;HE
a4;b4;0;d;e
HA;HB;HD;HD;HE <---- say HC is not found here
a51;b51;32;d;e
a52;b52;0;d;e
a53;b53;5;d;e
HA;HB;HC;HD;HE
a6;b6;10;d;e
# find only HC in 3rd column
$ awk -F';' '/^H.*/{if(f)print s; s=0; f=$3=="HC"}f{s+=$3}END{if(f)print s}' infile2
14
28
51
0
10
# Find HD in 3rd column
$ awk -F';' '/^H.*/{if(f)print s; s=0; f=$3=="HD"}f{s+=$3}END{if(f)print s}' infile2
37
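The two commands above differ only in the header name, so the name can be passed in with awk's -v option instead of editing the program. A minimal sketch, assuming headers start with H as in the sample data; the function name `sum_groups` and the awk variable `name` are illustrative choices, not part of the answer.

```shell
# sum_groups NAME FILE: sum column 3 of every block whose header has NAME
# in column 3 (hypothetical helper wrapping the one-liner above).
sum_groups() {
  awk -F';' -v name="$1" '
    /^H/ { if (f) print s; s = 0; f = ($3 == name); next }  # header: flush, reset, re-check
    f    { s += $3 }                                         # data row of a matching block
    END  { if (f) print s }                                  # flush the last block
  ' "$2"
}

# Shortened sample in the style of infile2.
sample=$(mktemp)
printf '%s\n' 'HA;HB;HC;HD;HE' 'a1;b1;14;d;e' \
              'HA;HB;HD;HD;HE' 'a2;b2;32;d;e' 'a3;b3;5;d;e' > "$sample"
sum_groups HC "$sample"   # prints 14
sum_groups HD "$sample"   # prints 37
rm -f "$sample"
```

The added next on header lines only skips a rule that would otherwise add the header text as 0, so the totals should match the original one-liner.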
Upvotes: 0
Reputation: 17876
I haven't tested this; try it and let me know if it works.
awk -F';' '
    $3 == "HC" {
        if (NR > 1) {
            print sum
            sum = 0
        }
        next
    }
    { sum += $3 }
    END { print sum }'
Upvotes: 1