pedwards

Reputation: 413

How can I selectively read lines from a file into a list?

I'm trying to take a file with a format like:

# Comments
# More comments

1,foo,bar,1
1,foo,bar,2
21,foo,bar,8

end_of_file

and process it into a list like:

listing = [[1,'foo','bar',1], [1,'foo','bar',2], [21,'foo','bar',8]]

Currently, I'm running something similar to:

listing = []
with open('foo_file.cfg','r') as f:
    for line in f:
        if line[0].isdigit():
            listing.append(line)   #  I've also tried listing.append([line])

Obviously, I'm ending up with:

[['1,foo,bar,1'],['1,foo,bar,2'],['21,foo,bar,8']]

I know I can split the line by comma, build a new list, then append that list to listing. I'm definitely willing to do that if it's the proper way, but I thought there might be something cleaner. I know the csv module would read the whole file into a proper format, but I'm not sure how it would deal with selectively removing certain data, such as the comments.

Upvotes: 1

Views: 44

Answers (4)

Kasravnd

Reputation: 107287

One Pythonic approach is to use itertools.dropwhile() to skip the leading lines that meet a certain condition. Since csv.reader objects are iterators, this avoids reading the whole file once and then looping over the lines again to filter them. The not x check in the lambda also skips empty rows. Note that dropwhile() only drops rows from the start: once a row fails the condition, every later row is kept, so trailing non-data lines such as end_of_file still need separate handling.

import csv
from itertools import dropwhile

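with open('test.csv') as f:
    reader = dropwhile(lambda x: not x or x[0].startswith('#'), csv.reader(f))
    rows = list(reader)  # consume the reader while the file is still open

# print(rows)
# For a file with nothing after the data rows:
# [['1', 'foo', 'bar', '1'], ['1', 'foo', 'bar', '2'], ['21', 'foo', 'bar', '8']]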
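
For the file in the question, which also has a blank line and an end_of_file marker after the data, one option is to filter the raw lines before they reach csv.reader, since csv.reader accepts any iterable of strings. The following is only a sketch: the data_lines helper and the SAMPLE string are made up for illustration, and it assumes every data row starts with a digit, as in the question.

```python
import csv
from io import StringIO

# Sample text mirroring the file in the question.
# For a real file, use open('foo_file.cfg', newline='') instead of StringIO(SAMPLE).
SAMPLE = """# Comments
# More comments

1,foo,bar,1
1,foo,bar,2
21,foo,bar,8

end_of_file
"""


def data_lines(lines):
    """Yield only the lines whose first character is a digit (hypothetical helper)."""
    for line in lines:
        if line[:1].isdigit():
            yield line


with StringIO(SAMPLE) as f:
    # csv.reader pulls lines lazily from the generator, so the file is read once.
    rows = list(csv.reader(data_lines(f)))

print(rows)
# [['1', 'foo', 'bar', '1'], ['1', 'foo', 'bar', '2'], ['21', 'foo', 'bar', '8']]
```

The fields are still strings at this point; converting the numeric ones to int is a separate step.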

Upvotes: 1

jpp

Reputation: 164613

This is one way with the csv module, which saves you from handling some of the repetitive details yourself (comma delimiter, newlines, etc.).

from io import StringIO
import csv

mystr = StringIO("""1,foo,bar,1
1,foo,bar,2
21,foo,bar,8""")

res = []

# replace mystr with open('file.csv', 'r')
with mystr as f:
    reader = filter(None, csv.reader(f))  # ignore empty lines
    for line in reader:
        if line[0].isdigit():
            res.append([int(line[0]), line[1], line[2], int(line[3])])

print(res)

[[1, 'foo', 'bar', 1],
 [1, 'foo', 'bar', 2],
 [21, 'foo', 'bar', 8]]

Upvotes: 2

kosnik

Reputation: 2424

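If the last line is the only one you want to get rid of, you could use pandas.read_csv with either the error_bad_lines=False parameter (replaced by on_bad_lines='skip' in recent pandas versions) or skipfooter=1 (which is only supported with engine='python').

If it is necessary to loop through the lines of the file and check which ones to import, then I would just change the line that appends to listing to the following (strip() removes the trailing newline before splitting):

listing.append(line.strip().split(','))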

Upvotes: 1

Austin

Reputation: 26039

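You could do this in a similar way without any module:

lst = []
with open('foo_file.cfg') as f:
    for line in f:
        line = line.strip()  # drop the trailing newline; blank lines become ''
        # keep only data rows: skips comments, blank lines and end_of_file
        if line and line[0].isdigit():
            lst.append([int(i) if i.isdigit() else i for i in line.split(',')])

print(lst)

# [[1, 'foo', 'bar', 1], [1, 'foo', 'bar', 2], [21, 'foo', 'bar', 8]]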
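
One caveat with str.isdigit() is that it returns False for values such as '-8', so negative numbers would stay as strings. A possible variation, sketched below, tries int() and falls back to the original text. The names to_int_or_str and parse_lines are invented for this example, and the negative value in the sample is an assumption added to show the difference; it is not in the question's file.

```python
def to_int_or_str(field):
    """Return int(field) if it parses as an integer, else the stripped string."""
    field = field.strip()
    try:
        return int(field)
    except ValueError:
        return field


def parse_lines(lines):
    """Build a list of rows from lines that start with a digit (hypothetical helper)."""
    rows = []
    for line in lines:
        line = line.strip()
        # Skip blank lines, comments and markers such as end_of_file.
        if not line or not line[0].isdigit():
            continue
        rows.append([to_int_or_str(part) for part in line.split(',')])
    return rows


sample = ["# Comments\n", "\n", "1,foo,bar,1\n", "21,foo,bar,-8\n", "end_of_file\n"]
listing = parse_lines(sample)
print(listing)
# [[1, 'foo', 'bar', 1], [21, 'foo', 'bar', -8]]
```

Any iterable of lines works here, so an open file object can be passed to parse_lines directly.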

Upvotes: 1
