Reputation: 15
I have a .txt file with a list of words (gene names, separated by newlines) and I want to count their occurrences in multiple files across multiple folders.
The folders are organized like this: MainFolder/family_ID/variants/FILE.table
One folder for each family.
I tried with grep; it does count, but it outputs one line per file:
WDFY3 0
WDFY3 0
WDFY3 1
WDFY3 0
WDFY3 0
KMT2C 1
KMT2C 0
KMT2C 0
KMT2C 0
KMT2C 0
I want it this way instead:
WDFY3 1
KMT2C 1
Here's the code I used:
while read p; do
  grep -FRchi "$p" --include \*.FILE.table | sed "s/^/$p /" >> /MyData/MainFolder/count.txt
done < /MyData/Resources/gene_list.txt
Is it possible with grep? Should I use awk/sed?
Thank you
Upvotes: 0
Views: 239
Reputation: 12877
Take the output from your script and pipe it to:
awk '{ arry[$1]+=$2 } END { for (i in arry) { print i" "arry[i] } }'
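For example, the whole loop from the question can be piped straight into that awk command instead of appending to count.txt (a sketch reusing the question's paths and grep options; adjust if your layout differs):
# Sum the per-file counts produced by the loop; prints one "GENE total" line per gene.
while read p; do
  grep -FRchi "$p" --include \*.FILE.table | sed "s/^/$p /"
done < /MyData/Resources/gene_list.txt |
awk '{ arry[$1]+=$2 } END { for (i in arry) { print i" "arry[i] } }'
If you have already generated count.txt, you can also run the awk command directly on that file (awk '...' /MyData/MainFolder/count.txt) to get the same summed result.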
Upvotes: 0
Reputation: 117298
One way is to make grep output all the matching lines, sort them, and then count them:
#!/bin/bash
genes=/MyData/Resources/gene_list.txt
grep -RhioFf "$genes" --include 'FILE.table' | sort | uniq -c
This will output the count in the first column and the gene in the second.
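If you want the gene first and the count second, as in the desired output above, you can append a small awk step to swap the columns (a minor addition, assuming the usual whitespace-separated uniq -c output):
grep -RhioFf "$genes" --include 'FILE.table' | sort | uniq -c | awk '{ print $2, $1 }'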
Upvotes: 1