Reputation: 3

Reading a text file from a certain point (python)

I'm trying to make code that can find a specific word in a file and start reading from there until it reads the same word again. In this case the word is "story". The code counts up the lines until the word, and then it starts counting again from 0 in the second loop. I have tried to use functions and global variables, but I keep getting the same number twice and I don't know why.

file = open("testing_area.txt", "r")
line_count = 0
counting = line_count

for line in file.readlines()[counting:]:
        if line != "\n":
            line_count = line_count + 1
            if line.startswith('story'):
                #line_count += 1
                break
          
print(line_count)

for line in file.readlines()[counting:]:
        if line != "\n":
            line_count = line_count + 1
            if line.startswith('story'):
                #line_count += 1
                break

print(line_count)
file.close()

Output:

6
6

Expected output:

6
3

This is the text file:

text
text
text
text
text
story
text
text
story

Upvotes: 0

Answers (3)

CrazyChucky

Reputation: 3518

There are several issues here. The first is that, for a given file object, readlines() effectively only works once. Imagine a text file open in an editor, with a cursor that starts at the beginning. readline() (singular) reads the next line, moving the cursor down one; readlines() (plural) reads all lines from the cursor's current position to the end. Once you've called it, there are no more lines left to read, so a second call returns an empty list. You could solve this by putting something like lines = file.readlines() at the top and then looping through the resulting list. (See this section in the docs for more info.)
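A minimal sketch of that cursor behaviour, using io.StringIO as a stand-in for the open file (an assumption made only so the snippet runs without testing_area.txt); a real file object opened with open() behaves the same way here:

```python
import io

# io.StringIO stands in for an open text file so the example is self-contained.
f = io.StringIO("text\ntext\nstory\n")

first_read = f.readlines()   # reads from the cursor to the end
second_read = f.readlines()  # the cursor is already at the end, so nothing is left

print(first_read)   # ['text\n', 'text\n', 'story\n']
print(second_read)  # []

f.seek(0)                    # move the cursor back to the start
third_read = f.readlines()
print(third_read == first_read)  # True
```

Calling seek(0) is another way to read the lines again, but storing the list once is usually simpler.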

However, you neither reset line_count to 0, nor ever set counting to anything but 0, so the loops still won't do what you intend. You want something more like this:

with open("testing_area.txt") as f:
    lines = f.readlines()

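first_count = 0
first_end = len(lines)  # index just past the first 'story' line
for index, line in enumerate(lines):
    if line != "\n":
        first_count += 1
        if line.startswith('story'):
            first_end = index + 1
            break
print(first_count)

# Slice by list index rather than by first_count, which skips blank lines
second_count = 0
for line in lines[first_end:]:
    if line != "\n":
        second_count += 1
        if line.startswith('story'):
            break
print(second_count)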

(This also uses the with keyword, which automatically closes the file even if the program encounters an exception.)
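To illustrate that point, here is a small sketch that raises an exception inside a with block and then checks whether the file was closed. The temporary file and the names path, handle, and closed_after_error are only there to make the example self-contained:

```python
import os
import tempfile

# Create a throwaway file so the example does not depend on testing_area.txt.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("text\nstory\n")
    path = tmp.name

try:
    with open(path) as f:
        handle = f  # keep a reference so it can be inspected afterwards
        raise ValueError("simulated failure while reading")
except ValueError:
    pass

closed_after_error = handle.closed
print(closed_after_error)  # True
os.remove(path)
```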

That said, you don't really need two loops in the first place. You're looping through one set of lines, so as long as you reset the line number, you can do it all at once:

line_no = 0
words_found = 0

with open('testing_area.txt') as f:
    for line in f:
        if line == '\n':
            continue
        line_no += 1
        if line.startswith('story'):
            print(line_no)
            line_no = 0
            words_found += 1
            if words_found == 2:
                break

(Using if line == '\n': continue is functionally the same as putting the rest of the loop's code inside if line != '\n':, but it avoids the extra indentation. It's mostly a matter of personal preference.)
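If you want the counts back as values instead of printed, the same single-loop idea can be wrapped in a function. This is only a sketch: counts_between is a hypothetical helper name, and the sample list mirrors the text file from the question.

```python
def counts_between(lines, word="story"):
    """Return the non-blank line count up to and including each line starting with `word`."""
    counts = []
    line_no = 0
    for line in lines:
        if line == "\n":
            continue  # skip blank lines, as in the loop above
        line_no += 1
        if line.startswith(word):
            counts.append(line_no)
            line_no = 0  # start counting again after each match
    return counts


# Same contents as the text file in the question.
sample = ["text\n"] * 5 + ["story\n", "text\n", "text\n", "story\n"]
result = counts_between(sample)
print(result)  # [6, 3]
```

Because the function only iterates over its argument, an open file object can be passed in directly, for example counts_between(f) inside a with block.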

Upvotes: 1

adamkwm

Reputation: 1173

Since the question doesn't say that the word only needs to be counted twice, here is a solution that reads through the whole file and prints the count every time "story" is found.

# Using with to open file is preferred as file will be properly closed
with open("testing_area.txt") as f:
    line_count = 0
    for line in f:
        line_count += 1
        if line.startswith("story"):
            print(line_count)
            # reset the line_count if "story" found
            line_count = 0

Output:

6
3

Upvotes: -1

DarrylG

Reputation: 17156

Code can be simplified to:

with open("testing_area.txt", "r") as file:              # Context manager preferred for file open
    first, second = None, None                           # index of first and second occurrence of 'story'
    for line_count, line in enumerate(file, start = 1):  # provides line index and content
        if line.startswith('story'):                     # no need to check separately for blank lines 
            if first is None:
                first = line_count  # first is None, so this must be the first
            else:
                second = line_count  # previously found first, so this is the second
                break                # have now found first & second
       
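if first is not None and second is not None:  # avoid a TypeError when 'story' appears fewer than twice
    print(first, second - first)     # index of first occurrence and number of lines between first and second
# Output: 6 3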

Upvotes: 1
