Reputation: 25381
I'd like to count the instances of 'text' after each 'header'. I'm using grep and awk, but I'm open to any tools. My file looks like this:
header1
text1
text2
text3
header2
text1
header3
header4
text1
text2
...
The desired output would look like this:
header1 3
header2 1
header3 0
header4 2
...
My question is similar to this, but instead of counting the total occurrences, I need the occurrences between each pair of 'header' lines.
Upvotes: 3
Views: 1182
Reputation: 33397
This awk command does not store the entire file in memory:
awk '/^header/{if (head) print head,k;head=$1; k=0}!/^header/{k++}END{print head,k}' file
If you are only interested in counting the lines containing text, then change the script to this:
awk '/^header/{if (head) print head,k;head=$1; k=0}/text/{k++}END{print head,k}' file
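For readability, the first one-liner can also be spelled out over several lines. The following is only a sketch: it feeds the question's sample data through a shell variable (the elided "..." lines are left out), and the `next` statement and the `if (head)` guard in `END` are small additions of mine, not part of the command above.

```shell
# Sample input from the question, without the elided "..." lines.
input='header1
text1
text2
text3
header2
text1
header3
header4
text1
text2'

result=$(printf '%s\n' "$input" | awk '
  /^header/ {                       # a header line starts a new group
      if (head) print head, k       # flush the previous group, if there was one
      head = $1; k = 0
      next                          # do not count the header line itself
  }
  { k++ }                           # every other line belongs to the current group
  END { if (head) print head, k }   # flush the last group (skipped for empty input)
')

printf '%s\n' "$result"
```

Only the current header and its counter are kept at any time, so memory use stays constant regardless of file size.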
Upvotes: 4
Reputation: 290515
With awk:
$ awk '{if (/header/) {h=$0; a[h]=0} if (/text/) {a[h]++}} END{for (i in a) {print i" "a[i]}}' file
header1 3
header2 1
header3 0
header4 2
{if (/header/) {h=$0; a[h]=0} if (/text/) {a[h]++}} fills the array a[] with the number of "text" lines found after each "header" line.
END{for (i in a) {print i" "a[i]}} prints the result after reading the file.
Upvotes: 2
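One caveat with the array-based approach: POSIX leaves the traversal order of for (i in a) unspecified, so the headers are not guaranteed to be printed in file order. A possible order-preserving variant is sketched below. The names `order`, `count` and `n` are my own, and the sketch assumes that each header line is unique in the file.

```shell
result=$(printf '%s\n' header1 text1 text2 text3 header2 text1 header3 header4 text1 text2 |
awk '
  /header/    { h = $0; order[++n] = h; count[h] = 0; next }  # remember headers in the order seen
  /text/ && n { count[h]++ }                                   # count only after the first header
  END {
      # walk the numeric index instead of using for (i in a)
      for (i = 1; i <= n; i++) print order[i], count[order[i]]
  }
')

printf '%s\n' "$result"
```

Iterating over the numeric index 1..n reproduces the order in which the headers appeared, at the cost of one extra array.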