Mejdi Dallel

Reputation: 632

Count number of lines that contain numbers after each string in a text file

I want to count the number of lines that contain numbers after each string. If there are consecutive strings, then 0 is assigned to the first string, and so on.

For example, let's say I have this text file:

some text
120 130
1847 1853
other text
207 220
text
306 350
text with no numbers after
some other text
400 435
900 121
125 369

My output should look like this:

2
1
1
0
3

I have a directory containing several files, and I want to store the results for each file in a list, so I will end up with a list of lists.

Here's what I have tried:

nb=[]  
c = 0
for filename in sorted(os.listdir("Path_to_txt_files")):
        with open(filename ,'r') as f:
            for line in f:
                if line.strip().replace(" ", "").isdigit(): 
                    c+=1                        
                    nb.append(c)
                else:
                    c=0
                    nb.append(c)

But this gives me the wrong result. How can I do this?

Upvotes: 2

Views: 238

Answers (3)

RomanPerekhrest

Reputation: 92854

Effectively, with a generator function:

import os


def count_digit_lines(filename):
    with open(filename) as f:
        str_cnt = num_cnt = 0

        for line in f:
            line = line.strip().replace(" ", "")
            if not line:    # skip empty lines
                continue
            if not line.isdigit():   # catch non-digit line
                if num_cnt >= 1:
                    yield num_cnt
                    str_cnt = num_cnt = 0
                str_cnt += 1
            else:
                if str_cnt > 1:
                    yield 0
                    str_cnt = 0
                num_cnt += 1
        if num_cnt:    # check trailing digit lines
            yield num_cnt
        elif str_cnt:
            yield 0


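res = []
directory = "Path_to_txt_files"
for fname in sorted(os.listdir(directory)):
    # os.listdir returns bare file names, so join each with the directory
    gen = count_digit_lines(os.path.join(directory, fname))  # generator
    res.append(list(gen))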

print(res)

Sample output for a single file would be:

[[2, 1, 1, 0, 3]]

Upvotes: 1

Rakesh

Reputation: 82765

Using itertools.groupby

Example:

from itertools import groupby


result = []
with open(filename) as infile:
    for k, v in groupby(infile.readlines(), lambda x: x[0].isalpha()):
        value = list(v)
        if k and len(value) > 1:
            result.append(0)
        if not k:
            result.append(len(value))

Output:

[2, 1, 1, 0, 3]

Edit, as per the comment (one result list per file):

result = []
for filename in list_files:
    temp = []
    with open(filename) as infile:
        for k, v in groupby(infile.readlines(), lambda x: x[0].isalpha()):
            value = list(v)
            if k and len(value) > 1:
                temp.append(0)
            if not k:
                temp.append(len(value))
    result.append(temp)
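
Note that the condition `k and len(value) > 1` adds a single 0 for a run of text lines, whatever its length, and a text line at the very end of the file gets no entry. If one entry per text line is wanted, a possible variant is sketched below. It is only a sketch: the helper names `is_number_line` and `count_after_each_text` are made up for illustration, and it assumes a number line contains nothing but digits and spaces.

```python
from itertools import groupby


def is_number_line(line):
    """True when the line holds only digits and spaces, e.g. '120 130'."""
    return line.replace(" ", "").isdigit()


def count_after_each_text(lines):
    """Return one count per text line: how many number lines directly follow it."""
    cleaned = [ln.strip() for ln in lines if ln.strip()]  # drop blank lines
    result = []
    for is_num, group in groupby(cleaned, key=is_number_line):
        size = len(list(group))
        if is_num:
            if result:                 # numbers before any text line are ignored
                result[-1] = size      # they belong to the last text line seen
        else:
            result.extend([0] * size)  # every text line starts with a count of 0
    return result


print(count_after_each_text(["a", "b", "c", "1 2"]))  # [0, 0, 1]
```

Because the function accepts any iterable of lines, an open file object can be passed to it directly.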

Upvotes: 0

Sayandip Dutta

Reputation: 15872

You can do it like this:

# sample file
f = '''some text
120 130
1847 1853
other text
207 220
text
306 350
text with no numbers after
some other text
400 435
900 121
125 369'''

lines = f.split('\n')

line2write = []

for line in lines:
    if not line[0].isdigit():
        line2write.append(0)
    else:
        line2write[-1] += 1
print(line2write)

Output:

[2, 1, 1, 0, 3]

Now you can write it out however you like.
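
To run the same idea over real files in a directory rather than an inline string, something like the following might work. It is a sketch under a few assumptions: the function names `count_number_lines` and `count_in_directory` are hypothetical, blank lines are skipped (since `line[0]` would raise an IndexError on an empty line), and number lines that appear before the first text line are ignored.

```python
from pathlib import Path


def count_number_lines(path):
    """Return one count per text line: how many number lines follow it."""
    counts = []
    with open(path) as f:
        for raw in f:
            line = raw.strip()
            if not line:               # skip blank lines
                continue
            if line[0].isdigit():
                if counts:             # ignore numbers before the first text line
                    counts[-1] += 1
            else:
                counts.append(0)       # each text line starts its own counter
    return counts


def count_in_directory(directory):
    """Build a list of lists, one inner list per file, in sorted path order."""
    return [
        count_number_lines(p)
        for p in sorted(Path(directory).iterdir())
        if p.is_file()
    ]
```

Calling `count_in_directory("Path_to_txt_files")` would then give the list of lists asked for in the question, with one inner list per file.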

Upvotes: 0
