Reputation: 4477
I have about 40 million lines of text to parse. I want to split each line and then pull out several elements at once (slices, subscripts, whatever they are called), using a list of indices that I generate in a method.
# ...
other_file = open('output.txt','w')
list = [1, 4, 5, 7, ...]
for line in open(input_file):
    other_file.write(line.split(',')[i for i in list])
The subscript can't take the generator expression shown above, but I want to ask the split line for multiple entries without writing an explicit loop over the index list for every line.
I apologize, I know this has a simple answer, but I just can't think of it. It's so late!
Upvotes: 1
Views: 145
Reputation: 13095
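from operator import itemgetter
from csv import reader, writer
fields = 1, 4, 5, 7
# itemgetter(*fields) builds a callable that picks those positions from a row
row_filter = itemgetter(*fields)
# newline='' is what the csv docs recommend when opening files on Python 3
with open('inp.txt', 'r', newline='') as inp:
    with open('out.txt', 'w', newline='') as out:
        writer(out).writerows(map(row_filter, reader(inp)))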
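In case itemgetter is unfamiliar: called with several indices, it returns a callable that picks those positions and hands them back as a tuple, which is why it can be mapped over the rows. A small sketch, where the sample row is made up purely for illustration:

```python
from operator import itemgetter

# Hypothetical sample row, standing in for one parsed CSV line.
row = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

# Same field indices as in the answer above.
pick = itemgetter(1, 4, 5, 7)

# With more than one index, the result is a tuple of the selected items.
selected = pick(row)
print(selected)
```

Note that with a single index itemgetter returns the bare item rather than a one-element tuple, so this pattern assumes at least two fields.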
Upvotes: 1
Reputation: 8241
The csv module can help you:
import csv
reader = csv.reader(open(input_file, 'r'))
writer = csv.writer(open(output_file, 'w'))
fields = (1,4,5,7,...)
for row in reader:
    writer.writerow([row[i] for i in fields])
For further improvement, open the files with context managers.
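A rough sketch of what the context-manager version might look like. The function name copy_fields and its parameters are my own for illustration, not anything from the csv module, and the newline='' arguments assume Python 3:

```python
import csv

def copy_fields(input_path, output_path, fields):
    """Copy only the columns listed in `fields` from one CSV file to another.

    `copy_fields` is a hypothetical helper name used for this sketch.
    """
    # Both files are closed automatically when the with block exits,
    # even if a row raises an exception part-way through.
    with open(input_path, newline='') as src, \
         open(output_path, 'w', newline='') as dst:
        writer = csv.writer(dst)
        # The reader yields one row at a time, so the whole input
        # never has to fit in memory.
        for row in csv.reader(src):
            writer.writerow([row[i] for i in fields])
```

You would call it as copy_fields(input_file, output_file, fields) with whatever index tuple you generate.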
Upvotes: 4
Reputation: 304177
Don't use list as a variable name; remember there is a builtin called list.
other_file = open('output.txt','w')
lst = [1,4,5,7,...]
for line in open(input_file):
    # strip the trailing newline so the last field does not carry it along
    fields = line.rstrip('\n').split(',')
    other_file.write(",".join(fields[i] for i in lst) + "\n")
For further improvement, use context managers to open and close the files for you.
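Something along these lines would be one way to do that with plain str.split. The helper names select_fields and filter_file are made up for this sketch, and it assumes simple comma-separated lines with no quoted fields:

```python
def select_fields(line, indices, sep=','):
    """Return a new line containing only the fields at `indices`.

    Hypothetical helper for illustration; assumes no quoted separators.
    """
    # Drop the trailing newline first so it cannot end up inside the last field.
    fields = line.rstrip('\n').split(sep)
    return sep.join(fields[i] for i in indices) + '\n'

def filter_file(input_path, output_path, indices):
    """Write the selected fields of every input line to the output file."""
    # The with statement closes both files, even if an error occurs.
    with open(input_path) as src, open(output_path, 'w') as dst:
        for line in src:
            dst.write(select_fields(line, indices))
```

Iterating over the file object reads one line at a time, which matters with 40 million lines.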
Upvotes: 3