Reputation: 379
I'm trying to read in a file, but parsing it is awkward because the number of spaces between columns varies from line to line. This is what I have so far:
with open('sextractordata1488.csv') as f:
    # getting rid of title, aka unusable lines:
    for _ in xrange(15):
        next(f)
    for line in f:
        cols = line.split(' ')
        # 9 because it's 9 spaces before the first column with real data
        print cols[10]
I looked up how to do this and found tr and sed commands, but they gave syntax errors when I tried them, and I wasn't sure where in the code to put them (inside the for loop or before it?). I want to reduce all the spaces between columns to a single space so that I can reliably pick out one column. At the moment the column I want is a counter running from 1 to 101, and I only get 10 through 99, mixed with blank output and parts of other columns. That happens because 1 and 101 have a different number of characters, and therefore a different number of spaces from the beginning of the line.
Upvotes: 1
Views: 4096
Reputation: 308
cols = line.split()
should be sufficient:
>>> "a    b".split()
['a', 'b']
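To illustrate why split(' ') misbehaves on this kind of data, here is a small sketch. The two sample rows are made up for the example (they are not taken from the asker's file) and only mimic a counter column of varying width. It is written for Python 3.

```python
# Hypothetical rows: a right-aligned counter column followed by two values.
rows = [
    "         1   12.5   3.1",
    "       101    7.25  0.4",
]

# split(' ') treats every single space as a separator, so runs of spaces
# produce empty strings and the real data lands at a different index per row.
single_space = [row.split(' ') for row in rows]

# split() with no argument treats any run of whitespace as one separator.
any_whitespace = [row.split() for row in rows]

# If a single-space-separated line is really wanted, join the fields back.
normalized = [' '.join(fields) for fields in any_whitespace]

for fields, line in zip(any_whitespace, normalized):
    print(fields, repr(line))
```

With split(), the counter is always at index 0, regardless of how many digits it has.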
Upvotes: 0
Reputation: 1123840
Just use str.split() without an argument. The string is then split on arbitrary-width whitespace, so it no longer matters how many spaces there are between non-whitespace content:
>>> ' this is rather \t\t hard to parse without\thelp\n'.split()
['this', 'is', 'rather', 'hard', 'to', 'parse', 'without', 'help']
Note that leading and trailing whitespace are removed as well. Tabs, spaces, newlines, and carriage returns are all considered whitespace.
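Applied to a reading loop like the one in the question, this could look roughly like the sketch below. The helper name read_column, the in-memory sample data, and the two-line header are assumptions made so the example runs on its own; with a real file you would pass the open file object and the actual number of header lines. It is written for Python 3.

```python
import io
from itertools import islice


def read_column(lines, index, skip=0):
    """Return field `index` of each whitespace-separated row.

    `lines` is any iterable of strings (for example an open file);
    the first `skip` lines are treated as a header and ignored.
    """
    values = []
    for line in islice(lines, skip, None):
        cols = line.split()      # any run of whitespace is one separator
        if len(cols) > index:    # skip blank or too-short lines
            values.append(cols[index])
    return values


# Hypothetical stand-in for the data file: two header lines, then rows
# whose counter column is right-aligned with varying leading spaces.
sample = io.StringIO(
    "# header line 1\n"
    "# header line 2\n"
    "         1   12.5   3.1\n"
    "        10   99.0   2.2\n"
    "       101    7.25  0.4\n"
)

counters = read_column(sample, 0, skip=2)
print(counters)
```

Because leading whitespace is dropped, the counter column is index 0 here rather than 9 or 10.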
For completeness' sake, the first argument can also be set to None for the same effect. This is helpful to know when you need to limit the number of splits with the second argument:
>>> ' this is rather \t\t hard to parse without\thelp\n'.split(None)
['this', 'is', 'rather', 'hard', 'to', 'parse', 'without', 'help']
>>> ' this is rather \t\t hard to parse without\thelp\n'.split(None, 3)
['this', 'is', 'rather', 'hard to parse without\thelp\n']
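One place the limited split might be handy is peeling off a leading counter column while leaving the rest of the line untouched. A minimal sketch, using a made-up line rather than real data from the question:

```python
# Hypothetical row: a counter followed by further columns.
line = "       101    7.25  0.4   galaxy candidate\n"

# Split at most once on whitespace: the first field, then everything else.
counter, rest = line.split(None, 1)

print(counter)      # the counter as a string
print(repr(rest))   # remainder, with its internal spacing and newline kept
```

Note that only the whitespace before the remainder is stripped; the spacing inside it and the trailing newline are preserved, just as in the interpreter session above.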
Upvotes: 4