Sphere123

Reputation: 23

Count lines after line with specific character

I have a file which contains this data:

>P136
FCF#0.73
FCF#0.66
FCF#0.86
>P129
FCF#0.72
>P142
>P144
>P134
FCF#0.70
FCF#0.82

I need to count the number of lines that follow each line containing ">", keeping the ">" line as the reference. For this example, the output should be:

>P136 3
>P129 1
>P134 2

Any ideas?

Upvotes: 2

Views: 1157

Answers (3)

Shashank

Reputation: 13869

This is a simple solution that attempts to be minimalistic.

with open(filename) as f:
    def printcc(current, count):
        if current is not None and count > 0:
            print(current.strip(), count)
    current = None
    count = 0
    for line in f:
        if line[0] == '>':
            printcc(current, count)
            current = line
            count = 0
        else:
            count += 1
    printcc(current, count)

If you actually want every line that contains a > character anywhere, use '>' in line as the condition instead. If you're targeting Python 2.x, use print current.strip(), count, because with the outer parentheses Python 2 prints a two-tuple.
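To make the difference between the two conditions concrete, here is a rough sketch of the same loop wrapped in a function. The name count_after_markers and the match_anywhere flag are made up for illustration, and the sample list is invented so that one data line contains a ">" in the middle.

```python
def count_after_markers(lines, match_anywhere=False):
    """Return (header, count) pairs for headers followed by at least one line.

    Hypothetical helper: with match_anywhere=True a line is treated as a
    header if it contains '>' anywhere, otherwise only if it starts with '>'.
    """
    results = []
    current, count = None, 0
    for line in lines:
        is_header = ('>' in line) if match_anywhere else line.startswith('>')
        if is_header:
            # close the previous section before starting a new one
            if current is not None and count > 0:
                results.append((current, count))
            current, count = line.strip(), 0
        elif current is not None:
            count += 1
    # the last section has no following header, so close it here
    if current is not None and count > 0:
        results.append((current, count))
    return results


# invented sample: the third line has '>' in the middle, not at the start
sample = [">P136\n", "FCF#0.73\n", "FCF>0.66\n", "FCF#0.86\n"]
print(count_after_markers(sample))                       # [('>P136', 3)]
print(count_after_markers(sample, match_anywhere=True))  # [('>P136', 1), ('FCF>0.66', 1)]
```

Passing a list of lines rather than an open file keeps the sketch easy to try out; an open file object works the same way because it also yields lines.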

Upvotes: 0

Sebastian Wozny

Reputation: 17506

In one line, just to show that we can:

s=""">P136
FCF#0.73
FCF#0.66
FCF#0.86
>P129
FCF#0.72
>P142
>P144
>P134
FCF#0.70
FCF#0.82
"""

First variant:

print([(i.split("\n")[0], len(i.split("\n")[1:]) - 1) for i in s.split(">") if i and len(i.split("\n")[1:]) - 1 > 0])

Using re (this counts the "#" characters, so it assumes exactly one "#" per data line):

import re
print([(block.split("\n")[0], len(re.findall("#", block))) for block in s.split(">") if "#" in block])
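If the one-liner is hard to follow, the same split-on-">" idea can be spread over a few lines. This is only a sketch: counts_by_split is a made-up name, it assumes ">" never appears inside a data line, and unlike the one-liners it puts the ">" back in front of each header.

```python
def counts_by_split(text):
    """Split on '>' and count the non-empty lines that follow each header."""
    pairs = []
    # the first chunk is whatever precedes the first '>', so skip it
    for block in text.split(">")[1:]:
        lines = block.splitlines()
        if not lines:
            continue  # a trailing '>' with nothing after it
        header = lines[0]
        rest = [line for line in lines[1:] if line.strip()]
        if rest:
            pairs.append((">" + header, len(rest)))
    return pairs


sample_text = """>P136
FCF#0.73
FCF#0.66
FCF#0.86
>P129
FCF#0.72
>P142
>P144
>P134
FCF#0.70
FCF#0.82
"""

print(counts_by_split(sample_text))
# [('>P136', 3), ('>P129', 1), ('>P134', 2)]
```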

Upvotes: 0

Martijn Pieters

Reputation: 1121514

Use a dictionary to store the count per header line, and increment the count of the current header every time a line does not start with >:

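counts = {}
current = None

with open(filename) as fo:
    for line in fo:
        if line.startswith('>'):
            # new section: remember the header and start counting at 0
            current = line.strip()
            counts[current] = 0
        elif current is not None:
            # ignore any lines that appear before the first header
            counts[current] += 1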

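Then simply loop and print the counts, skipping headers that had no lines after them (plain dicts keep insertion order from Python 3.7 on; use collections.OrderedDict on older versions):

for entry, count in counts.items():
    if count:
        print('{} {:2d}'.format(entry, count))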

You could even just print the number every time you find a new section:

count = 0
current = None

with open(filename) as fo:
    for line in fo:
        if line.startswith('>'):
            if current and count:
                print('{} {:2d}'.format(current, count))
            current = line.strip()
            count = 0
        else:
            count += 1
    # report the final section once the file is exhausted
    if current and count:
        print('{} {:2d}'.format(current, count))

but you cannot then easily re-purpose the counts for other work.
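One possible middle ground, sketched here rather than taken from the answer above, is a generator: it hands out each (header, count) pair as soon as a section ends, and the caller decides whether to print it, store it, or both. The name iter_counts is hypothetical.

```python
def iter_counts(lines):
    """Yield (header, count) for each '>' header once its section ends."""
    current, count = None, 0
    for line in lines:
        if line.startswith('>'):
            if current is not None:
                yield current, count
            current, count = line.strip(), 0
        elif current is not None:
            count += 1
    # the last section is only complete when the input runs out
    if current is not None:
        yield current, count


sample_lines = """>P136
FCF#0.73
FCF#0.66
FCF#0.86
>P129
FCF#0.72
>P142
>P144
>P134
FCF#0.70
FCF#0.82
""".splitlines(True)

# keep the non-zero counts for later use and print them as they arrive
nonzero = {}
for entry, count in iter_counts(sample_lines):
    if count:
        nonzero[entry] = count
        print(entry, count)
```

With a real file you would pass the open file object instead of sample_lines. Headers with a count of 0 are still yielded, so filtering them out is left to the caller.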

Upvotes: 1
