Reputation: 23
I have a file which contains this data:
>P136
FCF#0.73
FCF#0.66
FCF#0.86
>P129
FCF#0.72
>P142
>P144
>P134
FCF#0.70
FCF#0.82
I need to count the number of lines that follow each line starting with ">", keeping the ">" line as a reference. Headers with no lines after them should be skipped. For this example the output should be:
>P136 3
>P129 1
>P134 2
Any ideas?
Upvotes: 2
Views: 1157
Reputation: 13869
This is a simple solution that attempts to be minimalistic.
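def printcc(current, count):
    # print a header only if at least one line followed it
    if current is not None and count > 0:
        print(current.strip(), count)

with open(filename) as f:
    current = None
    count = 0
    for line in f:
        if line[0] == '>':
            # new header: report the previous one, then start over
            printcc(current, count)
            current = line
            count = 0
        else:
            count += 1
    # report the last header in the file
    printcc(current, count)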
In case you actually want all lines that contain a > character anywhere, use '>' in line as your condition. If you're targeting Python 2.x, use print current.strip(), count without the outer parentheses, because with them Python 2 prints a two-tuple.
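The two conditions only disagree on lines where > appears somewhere other than the first column. A small sketch of the difference follows; the third sample line is hypothetical and not part of the question's data.

```python
# Hypothetical sample lines; "score > 0.5" is made up for illustration.
samples = [">P136", "FCF#0.73", "score > 0.5"]

# For each line: (line, is header by first character, is header by containment)
rows = [(line, line.startswith(">"), ">" in line) for line in samples]

for line, at_start, anywhere in rows:
    print(line, at_start, anywhere)
```

Only the last line is treated differently: it counts as a header under '>' in line but as an ordinary line under the first-character test.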
Upvotes: 0
Reputation: 17506
In one line, just to show that we can:
s=""">P136
FCF#0.73
FCF#0.66
FCF#0.86
>P129
FCF#0.72
>P142
>P144
>P134
FCF#0.70
FCF#0.82
"""
First variant:
print([(i.split("\n")[0], len(i.split("\n")[1:]) - 1) for i in s.split(">") if i and len(i.split("\n")[1:]) - 1 > 0])
Using re, and skipping blocks that have no "#" lines:
import re
print([(block.split("\n")[0], sum(1 for m in re.finditer("#", block))) for block in s.split(">") if "#" in block])
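For readers who find the one-liners hard to follow, here is roughly what the first variant does, unrolled step by step. This is only a sketch on a shortened copy of the data, and it assumes, as the one-liner does, that the text ends with a newline.

```python
# Shortened sample; like the original string, it ends with a newline.
s = ">P136\nFCF#0.73\nFCF#0.66\n>P129\nFCF#0.72\n>P142\n"

# Splitting on ">" leaves an empty first element because s starts with ">".
blocks = s.split(">")

result = []
for block in blocks:
    if not block:
        continue  # skip the empty piece before the first ">"
    parts = block.split("\n")   # e.g. ['P136', 'FCF#0.73', 'FCF#0.66', '']
    name = parts[0]
    count = len(parts[1:]) - 1  # drop the trailing '' left by the final newline
    if count > 0:
        result.append((name, count))

print(result)
```

Note that the names come back without the leading ">" because split consumes it.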
Upvotes: 0
Reputation: 1121514
Use a dictionary to store the count per section. Every time a line does not start with >, increment the count of the current section:
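counts = {}
current = None
with open(filename) as fo:
    for line in fo:
        if line.startswith('>'):
            # a new section starts; begin counting at zero
            current = line.strip()
            counts[current] = 0
        elif current is not None:
            # lines before the first '>' header belong to no section
            counts[current] += 1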
Then simply loop and print the counts, skipping sections that had no lines:
for entry, count in counts.items():
    if count:
        print('{} {:2d}'.format(entry, count))
You could even just print the number every time you find a new section:
count = 0
current = None
with open(filename) as fo:
    for line in fo:
        if line.startswith('>'):
            # print the previous section before starting a new one
            if current and count:
                print('{} {:2d}'.format(current, count))
            current = line.strip()
            count = 0
        else:
            count += 1
    # print the final section
    if current and count:
        print('{} {:2d}'.format(current, count))
but you cannot then easily re-purpose the counts for other work.
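One possible middle ground, sketched here as an assumption rather than as part of the answer above, is to wrap the streaming loop in a generator so the caller decides whether to print the pairs or collect them. The name iter_counts is made up for this illustration.

```python
def iter_counts(lines):
    """Yield (header, count) for each '>' header followed by at least one line."""
    current, count = None, 0
    for line in lines:
        if line.startswith('>'):
            if current and count:
                yield current, count
            current, count = line.strip(), 0
        else:
            count += 1
    # emit the last section, if it had any lines
    if current and count:
        yield current, count

# The question's data as a list of lines; an open file object works the same way.
data = [">P136\n", "FCF#0.73\n", "FCF#0.66\n", "FCF#0.86\n", ">P129\n",
        "FCF#0.72\n", ">P142\n", ">P144\n", ">P134\n", "FCF#0.70\n", "FCF#0.82\n"]

pairs = list(iter_counts(data))
for entry, count in pairs:
    print('{} {:2d}'.format(entry, count))

# The collected pairs remain available for other work, e.g. a total.
total = sum(count for _, count in pairs)
```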
Upvotes: 1