jwillis0720

Reputation: 4477

Accessing items based off list of indices in iterables

I have about 40 million lines of text to parse. I want to split each line into fields and then pull out several of those fields at once, using a list of indices that I generate in a method.

# ...
other_file = open('output.txt','w')
list = [1, 4, 5, 7, ...]
for line in open(input_file):
    other_file.write(line.split(',')[i for i in list])

The subscript can't take the generator expression I have shown, but I want to ask the split line for multiple entries without having to write out a loop over the index list for every line.

I apologize, I know this is a simple answer but I just can't think of it. It's so late!

Upvotes: 1

Views: 145

Answers (3)

dugres

Reputation: 13095

from operator import itemgetter
from csv import reader, writer

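fields = 1, 4, 5, 7  # zero-based indices of the columns to keep

# row_filter(row) returns a tuple of the items at those indices
row_filter = itemgetter(*fields)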

with open('inp.txt', 'r') as inp:
    with open('out.txt', 'w') as out:
        writer(out).writerows(map(row_filter, reader(inp)))
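
To show what itemgetter does on a single row, here is a minimal sketch. The sample row is made up for illustration; only the indices come from the code above.

```python
from operator import itemgetter

# Hypothetical sample row, not taken from the asker's data.
row = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

# Build the getter once, then reuse it for every row.
pick = itemgetter(1, 4, 5, 7)
selected = pick(row)  # a tuple of row[1], row[4], row[5], row[7]
print(selected)
```

One thing to watch: with a single index, itemgetter returns the bare item rather than a one-element tuple, so writerows would need the result wrapped in a list in that case.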

Upvotes: 1

San4ez

Reputation: 8241

The csv module can help you:

import csv
reader = csv.reader(open(input_file, 'r'))
writer = csv.writer(open(output_file, 'w'))
fields = (1,4,5,7,...)
for row in reader:
    writer.writerow([row[i] for i in fields])

For further improvement, open the files with context managers.
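
A rough sketch of the context-manager version, assuming Python 3. The function name select_columns and its parameters are my own, not part of the original code, and newline='' is there because the csv docs recommend it for files handed to reader and writer.

```python
import csv

def select_columns(input_path, output_path, fields):
    """Copy only the columns at the given indices from one CSV file to another.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Both files are closed automatically when the with block exits.
    with open(input_path, newline='') as src, \
         open(output_path, 'w', newline='') as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            writer.writerow([row[i] for i in fields])
```

It would be called as select_columns(input_file, output_file, fields). A row shorter than the largest index raises IndexError, same as the loop above.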

Upvotes: 4

John La Rooy

Reputation: 304177

Don't use list as a variable name. There is a builtin called list, and the assignment shadows it.

other_file = open('output.txt','w')
lst = [1,4,5,7,...]
for line in open(input_file):
    # strip the trailing newline so the last field doesn't carry it
    fields = line.rstrip("\n").split(',')
    other_file.write(",".join(fields[i] for i in lst) + "\n")

For further improvement, use context managers to open and close the files for you.
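
Something like this is what I mean, as a sketch only. The function name write_selected and its parameters are made up for illustration. Stripping the trailing newline before the split is my addition, so that the last field on each line doesn't end in "\n".

```python
def write_selected(input_path, output_path, indices):
    """Write the comma-separated fields at `indices` from each input line.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Both files are closed when the with block exits, even on an error.
    with open(input_path) as src, open(output_path, 'w') as dst:
        for line in src:
            fields = line.rstrip('\n').split(',')
            dst.write(','.join(fields[i] for i in indices) + '\n')
```

Iterating over the file object reads one line at a time, so 40 million lines never have to sit in memory at once.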

Upvotes: 3
