Reputation: 2763
I need to read in a .txt file, find the row where labels are, return a list (or other iterable) of those labels plus the index of the next line. In this particular program, I use it the first time to open the file and return the labels (which are consistent) and the index of the next line for the purpose of identifying what to open with np.genfromtxt. The subsequent uses are just to determine the index only.
Sometimes the technician enters an extra carriage return when typing in the test parameters, which leaves an extra blank line in the file. When that happens, I get an empty set instead of the labels. In TFM it seems that csv.reader treats that blank line as EOF, but I don't see how to tell it to keep checking.
Is there a way to make it do that? Is there a better way to accomplish what I want?
import csv

def get_labels(filename):
    index = 0
    with open(filename, 'rb') as f:
        dialect = csv.Sniffer().sniff(f.read())
        f.seek(0)
        reader = csv.reader(f, dialect)
        for row in reader:
            if 'TimeStamp (s)' not in row:
                index += 1
            else:
                return row, index + 1
Update: I'm trying to figure out the strip function, but I think this is clunky and not the way to go. Here's what I've tried so far:
def strip(filename):
    with open(otherfile, 'wb') as o:
        with open(filename, 'rb') as f:
            for line in f:
                if line == '\n':
                    continue
                else:
                    o.write(line)
    f.close()
    o.close()
    return o
Upvotes: 0
Views: 4051
Reputation: 77337
The quick way to solve the problem is a second function that strips empty lines. You can use itertools.ifilter to do the job:
import csv
import itertools

def get_labels(filename):
    index = 0
    with open(filename, 'rb') as f:
        # str.strip returns '' (falsy) for blank lines, so ifilter drops them.
        # Sniff the dialect from the first few non-blank lines only.
        sample = ''.join(x[0] for x in zip(itertools.ifilter(str.strip, f), range(4)))
        dialect = csv.Sniffer().sniff(sample)
        f.seek(0)
        reader = csv.reader(itertools.ifilter(str.strip, f), dialect)
        for row in reader:
            if 'TimeStamp (s)' not in row:
                index += 1
            else:
                return row, index + 1
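To see the filtering step in isolation, here is a minimal sketch. It assumes Python 3, where the built-in filter is lazy and takes the place of itertools.ifilter, and it parses a made-up in-memory sample rather than a real file; SAMPLE and rows_without_blank_lines are hypothetical names used only for illustration.

```python
import csv
import io

# Hypothetical sample: one parameter line, a stray blank line, then the label row.
SAMPLE = "Operator,JS\n\nTimeStamp (s),Load (N)\n0.0,1.5\n"

def rows_without_blank_lines(text):
    """Parse CSV text, dropping lines that are empty or whitespace-only."""
    lines = io.StringIO(text, newline="")
    # str.strip returns '' (falsy) for blank lines, so filter() drops them
    # before csv.reader ever sees them.
    return list(csv.reader(filter(str.strip, lines)))

rows = rows_without_blank_lines(SAMPLE)
```

The blank line never reaches csv.reader, so every row that comes back has at least one field.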
You could write your own strip function instead of using filter:
import csv

def strip_lines(iterable, maxlines=None):
    # Yield only non-blank lines. If maxlines is given, only the first
    # maxlines input lines (blank ones included in the count) are considered.
    for i, line in enumerate(iterable):
        if line.strip() and (maxlines is None or maxlines > i):
            yield line

def get_labels(filename):
    index = 0
    with open(filename, 'rb') as f:
        dialect = csv.Sniffer().sniff(''.join(strip_lines(f, 4)))
        f.seek(0)
        reader = csv.reader(strip_lines(f), dialect)
        for row in reader:
            if 'TimeStamp (s)' not in row:
                index += 1
            else:
                return row, index + 1
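One thing to watch with both versions: once blank lines are filtered out, index counts only non-blank lines, so it may no longer match the physical line number in the file. If the index is later used as a raw line count (for example as skip_header in np.genfromtxt), a variant that counts every line may be closer to what you want. The sketch below assumes Python 3 and that each record sits on a single physical line (no quoted newlines); get_labels_with_line_index and the marker parameter are hypothetical names, not part of the code above.

```python
import csv

def get_labels_with_line_index(filename, marker="TimeStamp (s)"):
    """Return (labels, index of the physical line after the label row)."""
    with open(filename, newline="") as f:
        lines = f.readlines()

    # Sniff the dialect from the first few non-blank lines only.
    non_blank = [line for line in lines if line.strip()]
    dialect = csv.Sniffer().sniff("".join(non_blank[:4]))

    for index, line in enumerate(lines):
        if not line.strip():
            continue  # blank lines still advance index, but are not parsed
        row = next(csv.reader([line], dialect))
        if marker in row:
            return row, index + 1
    raise ValueError("no row containing %r found" % marker)
```

Because enumerate runs over the unfiltered lines, the returned index includes any stray blank lines that precede the label row.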
Upvotes: 1