clew

Reputation: 23

Unix command for counting the number of words that contain a letter combination (with repeats and letters in between)

How would you count the number of words in a text file that contain all of the letters a, b, and c? These letters may occur more than once in the word, and the word may contain other letters as well. (For example, "cabby" should be counted.)

Sample input, which should return 2:

abc abb cabby

I tried both:

grep -E "[abc]" test.txt | wc -l 

grep 'abcdef' testCount.txt | wc -l

both of which return 1 instead of 2.

Thanks in advance!

Upvotes: 2

Views: 833

Answers (3)

srinivasa rao s

Reputation: 1

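Try this. With GNU sed it puts each space-separated word on its own line, and the three greps keep only the words that contain a, b, and c:

sed 's/ /\n/g' test.txt | grep a | grep b | grep c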

$ cat test.txt

abc abb cabby

$ sed 's/ /\n/g' test.txt | grep a | grep b | grep c

abc
cabby

Hope this helps.
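To get the number the question asks for rather than the matching words, the last stage can count lines. A minimal sketch, assuming words are separated by whitespace; `tr -s` is swapped in for sed here so that tabs and repeated spaces are handled too, and the function name `count_abc_words` is made up for illustration:

```shell
# Hypothetical helper: count whitespace-separated words on stdin
# that contain all of the letters a, b, and c.
count_abc_words() {
    # tr turns every whitespace character into a newline and squeezes
    # repeats, so each word ends up on its own line.
    # Each grep keeps only the words containing one more letter;
    # the final grep -c prints how many lines are left.
    tr -s '[:space:]' '\n' | grep a | grep b | grep -c c
}

printf 'abc abb cabby\n' | count_abc_words
```

For the sample input this should print 2.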

Upvotes: 0

Thor

Reputation: 47119

I don't think you can get around using multiple invocations of grep. Thus I would go with (GNU grep):

<file grep -owE '\w+' | grep a | grep b | grep c

Output:

abc
cabby

The first grep puts each word on a line of its own. Pipe the result to wc -l to get the count.
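For what it's worth, a single invocation does seem possible when grep has PCRE support, because lookaheads can require each letter independently. A sketch, assuming GNU grep built with `-P`; the function name `count_abc_pcre` is hypothetical:

```shell
# Hypothetical helper: count words on stdin containing a, b, and c
# using one grep. Assumes GNU grep with PCRE support (-P).
count_abc_pcre() {
    # \b anchors at the start of a word; each lookahead requires one of
    # the letters somewhere in the rest of that word; \w+ then consumes
    # the word so that -o prints it on its own line for wc to count.
    grep -oP '\b(?=\w*a)(?=\w*b)(?=\w*c)\w+' | wc -l
}

printf 'abc abb cabby\n' | count_abc_pcre
```

Without `-P` the multi-grep pipeline remains the portable choice.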

Upvotes: 1

jaypal singh

Reputation: 77115

You can use awk and check the return value of the sub function, which is the number of substitutions made: 1 if a substitution succeeded, 0 otherwise.

$ echo "abc abb cabby" |
awk '{
    # sub() returns 1 if it replaced a match in the word, 0 otherwise
    for (i = 1; i <= NF; i++)
        if (sub(/a/, "", $i) > 0 && sub(/b/, "", $i) > 0 && sub(/c/, "", $i) > 0)
            count++
}
END { print count + 0 }'
2

The condition requires the return value to be greater than 0 for all three letters. The for loop iterates over every word of every line, incrementing the counter when all three letters are found in the word.
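Since sub() also deletes the matched letter from the field, a variant that only tests for a match may be preferable when the words are needed afterwards. A minimal sketch using awk's `~` match operator instead of sub(); the function name `count_abc_awk` is made up for illustration:

```shell
# Hypothetical helper: count words on stdin containing a, b, and c
# without modifying the fields.
count_abc_awk() {
    awk '{
        # $i ~ /x/ is true when field i contains the letter x
        for (i = 1; i <= NF; i++)
            if ($i ~ /a/ && $i ~ /b/ && $i ~ /c/)
                count++
    }
    # count + 0 forces a numeric 0 when no word matched
    END { print count + 0 }'
}

printf 'abc abb cabby\n' | count_abc_awk
```

For the sample input this should print 2, the same as the sub() version.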

Upvotes: 1
