Reputation: 35790
Stack Overflow already has some great posts about counting occurrences of a string (e.g. "foo"), like this one: count all occurrences of string in lots of files with grep. However, I've been unable to find an answer to a slightly more involved variant.
Let's say I want to count how many instances of "foo:[*whatever*]*whatever else*" exist in a folder; I'd do:
grep -or 'foo:[(.*)]' * | wc -l
and I'd get back "55" (or whatever the count is). But what if I have a file like:
foo:bar abcd
foo:baz efgh
not relevant line
foo:bar xyz
and I want to count how many instances of foo:bar there are vs. how many of foo:baz, etc.? In other words, I'd like output that's something like:
bar 2
baz 1
I assume there's some way to chain greps, or use a different command from wc, but I have no idea what it is ... any shell scripting experts out there have any suggestions?
P.S. I realize that if I knew the set of possible sub-strings (i.e. if I knew there was only "foo:bar" and "foo:baz") this would be simpler, but unfortunately the set of "things that can come after foo:" is unknown.
Upvotes: 3
Views: 5570
Reputation: 655239
You could use sort and uniq -c:
$ grep -orhE 'foo:[^[:space:]]+' * | sort | uniq -c
2 foo:bar
1 foo:baz
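If the exact "bar 2" / "baz 1" layout from the question is wanted, the same pipeline can be extended a little. The following is only a minimal sketch: the temporary directory and sample.txt are hypothetical stand-ins for the real folder, and it assumes the part after foo: ends at the first whitespace character.

```shell
#!/bin/sh
# Hypothetical sample data mirroring the example file from the question.
tmpdir=$(mktemp -d)
printf '%s\n' 'foo:bar abcd' 'foo:baz efgh' 'not relevant line' 'foo:bar xyz' \
  > "$tmpdir/sample.txt"

# -o prints only the matched text, -r recurses, -h omits file names,
# -E enables extended regex syntax.
# [^[:space:]]+ stops at the first whitespace (an assumption about where
# the value after "foo:" ends).
result=$(grep -orhE 'foo:[^[:space:]]+' "$tmpdir" \
  | sed 's/^foo://' \
  | sort \
  | uniq -c \
  | awk '{ print $2, $1 }')   # swap columns: value first, count second

printf '%s\n' "$result"

rm -rf "$tmpdir"
```

sort has to come before uniq -c because uniq only collapses adjacent duplicate lines. Without -h, a recursive grep prefixes each match with its file name, so the same value found in different files would be counted separately.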
Upvotes: 7