Linux tools - how to count and list occurrences of regex in file

Question

I have a file with a large number of similar strings. I want to count unique occurrences of a regex, and also show what they were, e.g. for the pattern Profile: (\w*) on the file:

Profile: blah
Profile: another
Profile: trees
Profile: blah

I want to find that there are 3 occurrences, and return the results:

blah, another, trees

jkshah · Accepted Answer

Try this:

egrep -o "Profile: (\w*)" test.text | sed 's/Profile: \(\w*\)/\1/g' | sort | uniq

Output:

another
blah
trees

Description

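egrep with the -o option prints only the part of each line that matches the pattern, one match per line.

sed then keeps only the captured group, dropping the Profile: prefix.

sort followed by uniq gives the list of unique elements; uniq only collapses adjacent duplicates, so the input has to be sorted first.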
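
The stages described above can be sketched one at a time. This is a minimal sketch, assuming GNU grep and GNU sed (for \w) and a scratch file recreated from the question's sample; the names sample and unique_profiles are illustrative and not part of the original answer, and grep -E is used as the equivalent of egrep.

```shell
# Recreate the sample input from the question in a scratch file (illustrative).
sample=$(mktemp)
printf 'Profile: blah\nProfile: another\nProfile: trees\nProfile: blah\n' > "$sample"

# Stage 1: -o prints only the matched text, one match per line.
grep -oE 'Profile: \w*' "$sample"

# Stage 2: sed keeps only the captured group, dropping the "Profile: " prefix.
grep -oE 'Profile: \w*' "$sample" | sed 's/Profile: \(\w*\)/\1/'

# Stage 3: sort puts duplicates next to each other so uniq can collapse them.
unique_profiles=$(grep -oE 'Profile: \w*' "$sample" | sed 's/Profile: \(\w*\)/\1/' | sort | uniq)
printf '%s\n' "$unique_profiles"

rm -f "$sample"
```

sort -u should give the same result as sort | uniq in a single step.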

To get the number of elements in the resulting list, append wc -l to the command:

egrep -o "Profile: (\w*)" test.text | sed 's/Profile: \(\w*\)/\1/g' | sort | uniq | wc -l

Output:

3
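
To get both the count and a comma-separated list like the one in the question, one option is to store the unique values in a shell variable first. This is a sketch under the same assumptions (GNU grep and sed for \w); the scratch file and the variable names profiles, count and list are illustrative.

```shell
# Recreate the sample input from the question in a scratch file (illustrative).
sample=$(mktemp)
printf 'Profile: blah\nProfile: another\nProfile: trees\nProfile: blah\n' > "$sample"

# Unique captured values, one per line (sort -u stands in for sort | uniq).
profiles=$(grep -oE 'Profile: \w*' "$sample" | sed 's/Profile: \(\w*\)/\1/' | sort -u)

# Number of unique values; tr strips the padding some wc implementations add.
count=$(printf '%s\n' "$profiles" | wc -l | tr -d ' ')

# Join the lines with commas, then add a space after each comma.
list=$(printf '%s\n' "$profiles" | paste -sd, - | sed 's/,/, /g')

echo "$count unique: $list"

# Occurrences per value, most frequent first.
grep -oE 'Profile: \w*' "$sample" | sed 's/Profile: \(\w*\)/\1/' | sort | uniq -c | sort -rn

rm -f "$sample"
```

The list comes out in sorted order (another, blah, trees) rather than in order of first appearance. If nothing matches, count is still reported as 1 because printf emits a single empty line, so check for an empty profiles value first if that case matters.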