Stephen Lee

Reputation: 35

Counting Entries in a File

I am trying to count entries in a text file but am having difficulty. Each line is one entry, and if the term "ADALIMUMAB" shows up in a line, that line counts as one. If the term shows up twice in the same line, it should still count only once. Here is an example of lines in the text file.

101700392$10170039$3$I$BUDESONIDE.$BUDESONIDE$1$Oral$9 MG, DAILY$$$$$$$$9$MG$$
101700392$10170039$4$C$ADALIMUMAB$ADALIMUMAB$1$$UNK$$$$$$$$$$$
102117144$10211714$1$PS$HUMIRA$ADALIMUMAB$1$Subcutaneous$$$$$N$ NOT AVAILABLE,NOT

I currently have this working:

fDRUG14Q3 = open("DRUG14Q3.txt")
data = fDRUG14Q3.read()
occurencesDRUG14Q3 = data.count("ADALIMUMAB")

But it will count line 2 in the example above as 2 entries rather than one.

Upvotes: 1

Views: 40

Answers (1)

Mark

Reputation: 92461

You can pass a generator expression to sum(). Each line evaluates to either True (1) or False (0), and summing those values gives the total count. In effect, you are counting how many lines return True for 'ADALIMUMAB' in line:

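# path is the location of the text file, e.g. "DRUG14Q3.txt"
with open(path, 'r') as f:
    # Iterating over f yields one line at a time; each test is True or False
    total = sum('ADALIMUMAB' in line for line in f)

print(total)
# 2 for the three example lines in the question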

This has the added benefit of not reading the whole file into memory first, because the file is iterated one line at a time.
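To see the difference between the two approaches without needing the real file, here is a minimal sketch that runs both counts on the three sample lines from the question. It assumes an in-memory io.StringIO object as a stand-in for the opened file, and count_matching_lines is a hypothetical helper name introduced only for illustration:

```python
import io

# The three sample lines copied from the question (the third is truncated there too).
SAMPLE = (
    "101700392$10170039$3$I$BUDESONIDE.$BUDESONIDE$1$Oral$9 MG, DAILY$$$$$$$$9$MG$$\n"
    "101700392$10170039$4$C$ADALIMUMAB$ADALIMUMAB$1$$UNK$$$$$$$$$$$\n"
    "102117144$10211714$1$PS$HUMIRA$ADALIMUMAB$1$Subcutaneous$$$$$N$ NOT AVAILABLE,NOT\n"
)


def count_matching_lines(lines, term):
    """Count the lines that contain term at least once (hypothetical helper)."""
    return sum(term in line for line in lines)


# str.count tallies every occurrence, so the second line contributes twice.
occurrences = SAMPLE.count("ADALIMUMAB")

# io.StringIO stands in for the opened file; iterating it yields one line at a time.
matching_lines = count_matching_lines(io.StringIO(SAMPLE), "ADALIMUMAB")

print(occurrences)     # 3
print(matching_lines)  # 2
```

With a real file, the same helper could be called as count_matching_lines(f, "ADALIMUMAB") inside the with block, since a file object also yields its lines one by one.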

Upvotes: 1
