ASH

Reputation: 20322

Trying to loop through multiple text files and append line 2 to a list

I'm trying to loop through a bunch of text files and append line #2 of each file to a list. Below is my sample code. This looks like it should be pretty close, but nothing at all is getting appended to my list.

import os
directory = 'C:\\my_path\\'
i=0
list2 = []
for filename in os.listdir(directory):
    with open(directory + filename) as infile:
        try:
            print(filename)
            i=i+1
            print(i)
            data = infile.readlines()
            for line in data:
                if (line == 2):
                    list2.append(line)
                    infile.close()
        except:
            print(filename + ' is throwing an error')
print('DONE!!')
print(list2)

Upvotes: 2

Views: 2342

Answers (5)

Corentin Pane

Reputation: 4943

When writing:

for line in data:
    if (line == 2):
        list2.append(line)
        infile.close()

The line variable is not the index of the line but the line itself as a string.

Also note that the second line will have an index of 1, not 2 because indexes start at 0 in Python.

You should at least change this loop to:

for index, line in enumerate(data):
    if (index == 1):
        list2.append(line)
        infile.close()

Also, as suggested by @bruno-desthuilliers, you do not need the readlines() method, which reads the entire file into memory. Instead, you can iterate over the file object directly, like this:

# no infile.readlines() needed
for index, line in enumerate(infile):
    if index == 1:
        list2.append(line)
        break  # stop here; closing the file mid-iteration would raise ValueError on the next read

Finally, you do not need to call infile.close(), because the file is opened in a with block, which closes it for you when the block exits.
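Putting these points together, here is a minimal sketch of reading only the second line of one file. The function name second_line is hypothetical (it is not in the question), and returning None for files with fewer than two lines is an assumption about the desired behavior.

```python
def second_line(path):
    """Return the second line of the file at `path`, or None if there is none.

    `second_line` is a hypothetical helper name used for illustration.
    """
    with open(path) as infile:  # the with block closes the file on exit
        for index, line in enumerate(infile):
            if index == 1:  # index 0 is the first line, index 1 the second
                return line
    return None  # assumption: short files yield None instead of an error
```

Inside the loop from the question this could be used as list2.append(second_line(directory + filename)), possibly after checking that the result is not None.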

Upvotes: 3

qristjan

Reputation: 183

Try this version:

import os
directory = 'C:\\my_path\\'

secondLines = []

for filename in os.listdir(directory):
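    try:
        # Iterating over the file object reads it line by line instead of loading it all into RAM
        # os.path.join avoids doubling the separator that directory already ends with
        with open(os.path.join(directory, filename)) as infile:
            for lineIndex, line in enumerate(infile):
                if lineIndex == 1:
                    secondLines.append(line)
                    break  # no need to read past the second line
    except Exception:
        print(filename + ' is throwing an error')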

print(secondLines) 

Your version, with the index fix applied:

import os
directory = 'C:\\my_path\\'
i=0
list2 = []
for filename in os.listdir(directory):
    # os.path.join builds the full path without a doubled or missing separator
    with open(os.path.join(directory, filename)) as infile:
        try:
            print(filename)
            i=i+1
            print(i)
            data = infile.readlines()  
            #To get the second line, you have to use indexes
            for line in range(len(data)):
                #if line (index) equals 1, it is the second line (0th element is first)
                if (line == 1):
                    #If the index of the line is 1, append it to the list
                    #data[line] = take the element on index 1 from list data. Indexing starts at 0
                    list2.append(data[line])
                    infile.close()
        except:
            print(filename + ' is throwing an error')
print('DONE!!')
print(list2)

Upvotes: 1

UserOnWeb

Reputation: 98

Another elegant way to do it is the following. It stops reading as soon as the second line is found, and the with statement takes care of opening and closing the file.

# The with statement closes the file automatically, so no explicit close() is needed.
with open(directory + filename) as infile:
    try:
        print(filename)
        i = i + 1
        print(i)
        skip_count = 0
        for line in infile:
            skip_count += 1
            if skip_count == 2:
                list2.append(line)
                break  # leave the loop so the rest of the file is never read
    except:
        print(filename + ' is throwing an error')

Upvotes: 0

RomanPerekhrest

Reputation: 92854

line == 2 in your code compares a string with the number 2. That comparison is never true, so it cannot tell you the position of the line being read.

Instead, just skip the 1st line and append the next one to the resulting list.

Note:

  • no need to read all lines with infile.readlines() if you only need the 2nd line!
  • no need to close the file handle when using a context manager (with ...)

import os

directory = 'C:\\my_path\\'
list2 = []
for filename in os.listdir(directory):
    with open(directory + filename) as infile:
        try:
            print(filename)
            next(infile)
            list2.append(next(infile))
        except:
            print(filename + ' is throwing an error')
print('DONE!!!')
print(list2)
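Note that next(infile) raises StopIteration when a file has fewer than two lines, which the code above reports through the except branch. If you would rather not treat short files as errors, the built-in next() accepts a default as its second argument. A small sketch, where the helper name second_line_or_default is hypothetical:

```python
def second_line_or_default(path, default=None):
    """Return line 2 of `path`, or `default` if the file is too short.

    The function name is a hypothetical one chosen for this example.
    """
    with open(path) as infile:
        next(infile, None)            # skip line 1; no error on an empty file
        return next(infile, default)  # line 2, or the default if it is missing
```

In the loop above you could then append the result only when it is not None.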

Upvotes: 1

andreashhp

Reputation: 485

When you test if line == 2, you are asking whether the line you read from infile is equal to 2 (which it never is). Instead, you need a counter to test whether you are at line 2. Or even better, just index into the list:

data = infile.readlines()
list2.append(data[1])         # the line at index 1 is the second line
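One caveat: data[1] raises IndexError when a file has fewer than two lines. Below is a rough sketch of the whole loop with that case guarded. The function name collect_second_lines is hypothetical, and skipping subdirectories and short files silently is an assumption about what you want.

```python
import os


def collect_second_lines(directory):
    """Hypothetical helper: gather line 2 of every regular file in `directory`."""
    second_lines = []
    for filename in os.listdir(directory):
        path = os.path.join(directory, filename)
        if not os.path.isfile(path):  # assumption: subdirectories are skipped
            continue
        with open(path) as infile:
            data = infile.readlines()
        if len(data) > 1:  # without this check, data[1] could raise IndexError
            second_lines.append(data[1])
    return second_lines
```

Calling collect_second_lines('C:\\my_path\\') would then take the place of the loop in the question. os.listdir does not guarantee an order, so sort the filenames first if the order matters.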

Upvotes: 2
