Reputation: 21443
I would like to count the occurrences of missing values (NA) on every line of a text file.
File foo.txt:
1 1 1 1 1 NA # so, Missings: 1
1 1 1 NA 1 1 # so, Missings: 1
1 1 NA 1 1 NA # so, Missings: 2
But I would also like to obtain the number of elements in the first line (assuming this is the same for all lines). Counting the missing values works:
miss = []
with open("foo.txt") as f:
    for line in f:
        miss.append(line.count("NA"))
>>> miss
[1, 1, 2] # correct
The problem arises when I try to determine the number of elements. I did this with the following code:
miss = []
with open("foo.txt") as f:
    first_line = f.readline()
    elements = first_line.count(" ")  # given that values are separated by space
    for line in f:
        miss.append(line.count("NA"))
>>> (elements + 1)
6 # True, this is correct
>>> miss
[1, 2] # misses the first item because readline() has already consumed the first line
How can I read the first line once without skipping it in the subsequent loop?
Upvotes: 0
Views: 362
Reputation: 85442
Provided all lines have the same number of items, you can just count the items in the last line:
miss = []
with open("foo.txt") as f:
    for line in f:
        miss.append(line.count("NA"))
    # after the loop, `line` still holds the last line that was read
    elements = len(line.split())
A better way to count is probably:
elements = len(line.split())
because this also counts items separated with multiple spaces or tabs.
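As a small illustration of the difference, the sketch below compares both counting approaches. The sample lines are made up for this example (single spaces, a double space, and tabs); they are not taken from foo.txt.

```python
# Hypothetical sample lines: single spaces, one double space, and tabs.
samples = [
    "1 1 1 1 1 NA\n",
    "1 1  1 NA 1 1\n",
    "1\t1\tNA\t1\t1\tNA\n",
]

results = []
for line in samples:
    by_count = line.count(" ") + 1  # assumes exactly one space between items
    by_split = len(line.split())    # splits on any run of whitespace
    results.append((by_count, by_split))
    print(repr(line), by_count, by_split)
```

Only the split-based count reports six items for all three lines; the space-based count is correct only for the first.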
Upvotes: 2
Reputation: 304175
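You can also just treat the first line separately:
with open("foo.txt") as f:
    first_line = next(f)  # consume the first line from the file iterator
    elements = first_line.count(" ")  # given that values are separated by space
    miss = [first_line.count("NA")]  # start the list with the first line's count
    for line in f:  # continues with the second line
        miss.append(line.count("NA"))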
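One caveat with next(f) is that it raises StopIteration on an empty file. A minimal sketch of a guarded variant follows, assuming you would rather get an empty result in that case. The function name count_missing is made up for this example, and io.StringIO stands in for the real file so the snippet runs on its own.

```python
import io

def count_missing(f):
    """Return (elements, miss) for an open text file or file-like object."""
    first_line = next(f, None)  # default None instead of StopIteration when empty
    if first_line is None:
        return 0, []
    elements = len(first_line.split())
    miss = [first_line.count("NA")]
    for line in f:  # iteration resumes at the second line
        miss.append(line.count("NA"))
    return elements, miss

# Stand-in for open("foo.txt"), using the data from the question.
sample = io.StringIO("1 1 1 1 1 NA\n1 1 1 NA 1 1\n1 1 NA 1 1 NA\n")
print(count_missing(sample))
```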
Upvotes: 0
Reputation: 12755
Try f.seek(0). This will reset the file handle to the beginning of the file.
A complete example would then be:
miss = []
with open("foo.txt") as f:
    first_line = f.readline()
    elements = first_line.count(" ")  # given that values are separated by space
    f.seek(0)  # rewind so the loop below starts at the first line again
    for line in f:
        miss.append(line.count("NA"))
Even better would be to read every line, including the first, only once, and to determine the number of elements only once:
miss = []
elements = None
with open("foo.txt") as f:
    for line in f:
        if elements is None:  # true only for the first line
            elements = line.count(" ")  # given that values are separated by space
        miss.append(line.count("NA"))
BTW: wouldn't the number of elements be line.count(" ") + 1?
I'd recommend using len(line.split()), as this also handles tabs, double spaces, leading/trailing spaces etc.
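Putting the single-pass idea and the split() recommendation together, here is a minimal sketch. It also counts NA as a whole token rather than as a substring, which is my own assumption about what is wanted: line.count("NA") would also match values that merely contain "NA". The function name summarize and the marker parameter are hypothetical, and io.StringIO stands in for the real file.

```python
import io

def summarize(f, marker="NA"):
    """Single pass: item count of the first line and missing count per line."""
    elements = None
    miss = []
    for line in f:
        tokens = line.split()  # handles tabs, repeated and trailing whitespace
        if elements is None:   # true only for the first line
            elements = len(tokens)
        miss.append(tokens.count(marker))  # whole-token matches only
    return elements, miss

# Stand-in for open("foo.txt"), using the data from the question.
sample = io.StringIO("1 1 1 1 1 NA\n1 1 1 NA 1 1\n1 1 NA 1 1 NA\n")
elements, miss = summarize(sample)
print(elements, miss)
```

For an empty file this returns None for the element count and an empty list.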
Upvotes: 2