Reputation: 795
I want to extract all doubles/floats from a file. Each line looks like:
0 324.609 1 -39475.435 2 23.439 3 983.098
4 -4384.698 5 9475.405 6 2398.349 7 9800.138
...
Right now, I am building lists out of columns:
y1 = [ line.split()[1] for line in data]
y2 = [ line.split()[3] for line in data]
y3 = [ line.split()[5] for line in data]
y4 = [ line.split()[7] for line in data]
However, the index goes out of range if a line has no column 7. How do I prevent this? Also, is there a better way of extracting all doubles (including the minus sign) from a file?
Thank you.
Upvotes: 0
Views: 342
Reputation: 57033
You can spare yourself the misery of parsing a malformed data file by using Pandas. In the following example, I assume that the second line of the file is missing its last two columns:
import pandas as pd
data = pd.read_table("yourfile.dat", sep=r'\s+', header=None, index_col=None)
#   0          1  2           3  4         5    6        7
#0  0    324.609  1  -39475.435  2    23.439  3.0  983.098
#1  4  -4384.698  5    9475.405  6  2398.349  NaN      NaN
y1 = data[1].dropna().tolist()
y2 = data[3].dropna().tolist()
y3 = data[5].dropna().tolist()
y4 = data[7].dropna().tolist()
y4
#[983.0980000000001]
Upvotes: 2
Reputation: 1704
You can use a try/except block when iterating over each line.
y7 = []
for line in data:
    try:
        y7.append(float(line.split()[7]))
    except (IndexError, ValueError):  # line too short, or field is not a number
        pass
If a line has no column at index 7, it is simply skipped instead of raising an error.
If you want the lists to stay aligned with the file (so that the i-th element of each list comes from the i-th line), append np.nan instead of skipping the line:
import numpy as np

y7 = []
for line in data:
    try:
        y7.append(float(line.split()[7]))
    except (IndexError, ValueError):  # line too short, or field is not a number
        y7.append(np.nan)
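The same padding idea can also be written without exceptions by checking the length of each split line first. This is only a minimal sketch: it assumes `data` is an iterable of text lines, and the helper name `column` and its `fill` parameter are made up for illustration.

```python
import math

def column(lines, index, fill=math.nan):
    """Return the float at position `index` of every line, or `fill` if the line is too short."""
    values = []
    for line in lines:
        fields = line.split()
        # Only index into the fields when the column actually exists on this line.
        values.append(float(fields[index]) if index < len(fields) else fill)
    return values

# Sample input; the second line is assumed to be missing its last two columns.
data = [
    "0 324.609 1 -39475.435 2 23.439 3 983.098",
    "4 -4384.698 5 9475.405 6 2398.349",
]

y1 = column(data, 1)
y4 = column(data, 7)
```

Every list returned this way has one entry per line, so y1 and y4 can be compared position by position.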
Upvotes: 0
Reputation: 795
To collect every other column, build a list of the odd column indices and slice it to the length of each line.
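L = list(range(10))  # column indices 0..9; enlarge if lines have more fields
y1 = []
for lines in data:
    line = lines.split()
    n = len(line)
    l = L[1:n:2]  # odd indices that actually exist on this line
    for i in l:
        y1.append(line[i])
print(y1)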
y1 is then a list of all the values in the odd-indexed columns (as strings; wrap line[i] in float() to get numbers).
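A shorter variant of the same approach, as a sketch: slicing the split line with [1::2] picks the odd-indexed fields directly and never goes out of range, so no index list is needed. It assumes every odd-indexed field is a valid number; the function name `odd_columns` is made up for illustration.

```python
def odd_columns(lines):
    """Collect the odd-indexed fields of every line as floats, in file order."""
    values = []
    for line in lines:
        # A slice stops at the end of the list, so short lines are handled automatically.
        values.extend(float(field) for field in line.split()[1::2])
    return values

# Sample input; the second line is assumed to be missing its last two columns.
data = [
    "0 324.609 1 -39475.435 2 23.439 3 983.098",
    "4 -4384.698 5 9475.405 6 2398.349",
]

values = odd_columns(data)
```

Note that this flattens all the values into one list, as the loop above does, rather than producing one list per column.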
Upvotes: 0