Reputation: 11294
Is there an inbuilt command to do this or has anyone had any luck with a script that does it?
I am looking to count the number of times a certain string (not word) appears in a file. This can include multiple occurrences per line so the count should count every occurrence not just count 1 for lines that have the string 2 or more times.
For example, with this sample file:
blah(*)wasp( *)jkdjs(*)kdfks(l*)ffks(dl
flksj(*)gjkd(*
)jfhk(*)fj (*) ks)(*gfjk(*)
If I am looking to count the occurrences of the string (*), I would expect the count to be 6, i.e. 2 from the first line, 1 from the second line and 3 from the third line. Note that the one spanning lines 2-3 does not count because an LF character separates its two halves.
Update: great responses so far! Can I ask that the script handle the conversion of (*) to \(*\), etc.? That way I could just pass any desired string as an input parameter without worrying about how it needs to be escaped to appear in the correct format.
Upvotes: 5
Views: 15492
Reputation: 1665
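You can use the basic grep command.
To count the lines that contain the word "hello" in a file:
grep -c "hello" filename
To count the lines that match a pattern (Perl-compatible syntax with -P):
grep -c -P "Your Pattern" filename
Pattern examples: hell.w, \d+, etc.
Note that -c counts matching lines, not individual occurrences, so a line that contains the string two or more times still counts as 1.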
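To make the difference between counting lines and counting occurrences concrete, here is a small sketch run against the sample file from the question. The temporary file and the variable names (sample, lines_matched, occurrences) are only for illustration, and -o is assumed to be available (it is in GNU and BSD grep).

```shell
# Recreate the three-line sample file from the question.
sample=$(mktemp)
printf '%s\n' \
  'blah(*)wasp( *)jkdjs(*)kdfks(l*)ffks(dl' \
  'flksj(*)gjkd(*' \
  ')jfhk(*)fj (*) ks)(*gfjk(*)' > "$sample"

# -c counts matching lines: a line counts once, however many hits it has.
lines_matched=$(grep -c -F -- '(*)' "$sample")

# -o prints each match on its own line, so wc -l counts occurrences.
occurrences=$(grep -o -F -- '(*)' "$sample" | wc -l)

echo "lines: $lines_matched, occurrences: $occurrences"
rm -f "$sample"
```

On this sample the two numbers differ because the first and third lines contain the string more than once.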
Upvotes: 0
Reputation: 17
I have used the command below to count the lines in a file that contain a particular string. A line with several occurrences is still counted only once:
grep search_String fileName | wc -l
Upvotes: -1
Reputation: 13942
This loops over the lines of the file, and on each line finds all occurrences of the string "(*)". Each time that string is found, $c is incremented. When there are no more lines to loop over, the value of $c is printed.
perl -ne'$c++ while /\(\*\)/g;END{print"$c\n"}' filename.txt
Update: Regarding your comment asking that this be converted into a solution that accepts the search string as an argument, you might do it like this:
perl -ne'BEGIN{$re=shift;}$c++ while /\Q$re/g;END{print"$c\n"}' 'string' filename.txt
The \Q escapes any regex metacharacters in the argument, so it is matched literally and (*) can be passed as-is. That ought to do the trick. If I felt inclined to skim through perlrun again I might see a more elegant solution, but this should work.
You could also eliminate the explicit inner while loop in favor of an implicit one by providing list context to the regexp:
perl -ne'BEGIN{$re=shift}$c+=()=/\Q$re/g;END{print"$c\n"}' 'string' filename.txt
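If Perl is not at hand, the same idea (walk the file line by line, count every hit on each line, print the total at the end) can be sketched in plain POSIX shell. This is a swapped-in technique rather than a translation of the one-liners above, and count_in_lines is a hypothetical helper name. It treats the argument as a literal string and, like the /g loop, counts non-overlapping matches.

```shell
# count_in_lines STRING FILE: print how many times STRING occurs in FILE.
# The body runs in a subshell ( ... ) so its variables stay private.
count_in_lines() (
  needle=$1 file=$2 count=0
  [ -n "$needle" ] || { echo "empty search string" >&2; exit 2; }
  # The "|| [ -n ... ]" part also handles a last line without a newline.
  while IFS= read -r line || [ -n "$line" ]; do
    rest=$line
    while :; do
      case $rest in
        *"$needle"*)
          # Quoting "$needle" makes it literal inside the pattern.
          # Drop everything up to and including the first hit.
          rest=${rest#*"$needle"}
          count=$((count + 1))
          ;;
        *) break ;;
      esac
    done
  done < "$file"
  printf '%s\n' "$count"
)
```

Usage would be count_in_lines '(*)' filename.txt. It will be much slower than perl or grep on large files, so treat it as an illustration of the loop rather than a replacement.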
Upvotes: 2
Reputation: 161614
You can use basic tools such as grep and wc:
grep -o '(\*)' input.txt | wc -l
UPDATE:
grep -o -F '(*)' input.txt | wc -l
Add the -F option to interpret PATTERNS as fixed strings, not regular expressions.
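Since the question asks for something that takes any string as a parameter, the -F form can be wrapped in a small function. This is a minimal sketch: count_fixed is a hypothetical name, and the -- is there so that a search string beginning with a dash is not taken for an option.

```shell
# count_fixed STRING FILE: print the number of occurrences of STRING in FILE.
count_fixed() {
  # -o: one output line per match; -F: STRING is literal, not a regex.
  grep -o -F -- "$1" "$2" | wc -l
}
```

Called as count_fixed '(*)' input.txt, no escaping of the argument is needed. Matches are non-overlapping, so searching for aa in aaa counts 1.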
Upvotes: 24
Reputation: 22428
text="(\*)"
grep -o "$text" file | wc -l
You can make it into a script which accepts arguments like this:
script count:
#!/bin/bash
text="$1"
file="$2"
grep -o "$text" "$file" | wc -l
Usage:
./count "(\*)" file_path
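The script above still expects the caller to escape the string by hand. One way to do the (*) to (\*) conversion inside the script is to backslash-escape the characters that are special in a basic regular expression before handing the string to grep. The sketch below assumes grep's default BRE syntax; escape_bre and count_literal are hypothetical helper names.

```shell
# escape_bre STRING: print STRING with BRE metacharacters backslash-escaped.
escape_bre() {
  # The bracket expression lists [ \ . * ^ $ ; each hit gets a leading backslash.
  printf '%s\n' "$1" | sed 's/[[\.*^$]/\\&/g'
}

# count_literal STRING FILE: count occurrences of STRING, taken literally.
count_literal() {
  grep -o -- "$(escape_bre "$1")" "$2" | wc -l
}
```

With this, count_literal '(*)' file_path needs no manual escaping. Passing -F to grep achieves the same result with less machinery, so the escaping route is mainly useful when the string has to be embedded in a larger pattern.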
Upvotes: -2
Reputation: 67900
Using perl's "Eskimo kiss" operator }{ with the -n switch to print a total at the end. Use \Q...\E to ignore any meta characters.
perl -lnwe '$a+=()=/\Q(*)/g; }{ print $a;' file.txt
Script:
use strict;
use warnings;
my $count = 0;  # start at 0 so a file with no matches prints 0
my $text = shift;
while (<>) {
    $count += () = /\Q$text/g;
}
print "$count\n";
Usage:
perl script.pl "(*)" file.txt
Upvotes: 6