Reputation: 60
I have a binary file which I'm able to open in MATLAB, but unable to open in Python. The binary file is encoded as a 'double float,' thus read by MATLAB with the following line:
fread(fopen(fileName), 'float64');
In Python, I'm not sure how to replicate this line. I thought NumPy would be a good place to start, so I tried the following lines, but didn't get the output I expected. There are 6 numbers in each line, but I only got the first one and a single 'NaN'.
from numpy import *
f = open('filename', 'rb')
a = fromfile(f, float64, 10)
print(a)
Any help on this would be extremely appreciated; I've posted both the binary and the MATLAB parsed files in the comments below. I don't need to use Numpy specifically either, I'm open to any Python based solution. Thank you.
Upvotes: 2
Views: 3047
Reputation: 15788
Every second value is NaN, so this might be some kind of delimiter. Also, the values in the file are stored column-first. The following script reads in the data, throws away the NaN entries, reshapes the array, and writes a CSV file identical to the one you posted:
import csv
import numpy as np
# Pull in all the raw data.
with open('TEMPO3.2F-0215_s00116.dat', 'rb') as f:
    raw = np.fromfile(f, np.float64)
# Throw away the NaN entries (they sit at the even indices).
raw = raw[1::2]
# Check it's a multiple of six so we can reshape it.
if raw.size % 6:
    raise ValueError("Data size not multiple of six.")
# Reshape and take the transpose to manipulate it into the
# same shape as your CSV. The conversion to integer is also
# so the CSV file is the same.
# Integer division (//) keeps the shape an int on Python 3 as well.
data = raw.reshape((6, raw.size // 6)).T.astype('int')
# Dump it out to a CSV.
with open('test.csv', 'w') as f:
    w = csv.writer(f)
    w.writerows(data)
Edit: Updated version with changes suggested by jorgeca:
import csv
import numpy as np
# Pull in all the raw data.
raw = np.fromfile('TEMPO3.2F-0215_s00116.dat', np.float64)
# Throw away the nan entries.
raw = raw[1::2]
# Reshape and take the transpose to manipulate it into the
# same shape as your CSV. The conversion to integer is also
# so the CSV file is the same.
data = raw.reshape((6, -1)).T.astype('int')
# Dump it out to a CSV.
with open('test.csv', 'w') as f:
    w = csv.writer(f)
    w.writerows(data)
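If you want to check the layout assumptions without the original file, here is a minimal sketch that writes a small synthetic file in the same pattern (a NaN before every value, values stored column-first) and reads it back the same way. The file name, the `table` values, and the 3x6 shape are made up for illustration and are not taken from your data.

```python
import os
import tempfile

import numpy as np

# Hypothetical 3x6 table standing in for the real data.
table = np.arange(18, dtype=np.float64).reshape(3, 6)

# Build the assumed on-disk layout: values in column-first order,
# each one preceded by a NaN.
values = table.T.ravel()
raw_out = np.empty(values.size * 2, dtype=np.float64)
raw_out[0::2] = np.nan
raw_out[1::2] = values

# Write to a throwaway location (the file name is arbitrary).
path = os.path.join(tempfile.mkdtemp(), 'synthetic.dat')
raw_out.tofile(path)

# Read it back: drop the NaNs, reshape, transpose.
raw = np.fromfile(path, np.float64)
data = raw[1::2].reshape((6, -1)).T.astype('int')
print(data)
```

If the real file follows this pattern, `data` should have the same rows as the table that was written. If it does not, the NaN positions or the column count in your file probably differ from what is assumed here.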
Upvotes: 6
Reputation: 5073
There is a delimiter between your data values, which produces alternating NaN and data values on reading. For instance, in MATLAB:
NaN
2134
NaN
2129
NaN
2128
....
1678
and with numpy:
[ nan 2134. nan ..., 1681. nan 1678.]
I get the same output using the code you posted with either MATLAB or NumPy (1.7). Note that the data in your .dat file is stored column-wise, not row-wise, relative to the layout of your CSV file.
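To make the column-wise point concrete, here is a small sketch with made-up numbers (the values and the 2x3 shape are illustrative only, not from your file). Reshaping the flat values with `order='F'` fills columns first, which should give the same result as reshaping row-wise to the transposed shape and then transposing.

```python
import numpy as np

# Made-up flat values, as they might look after dropping the NaNs:
# first column (1, 2), then second column (3, 4), then third column (5, 6).
flat = np.array([1., 2., 3., 4., 5., 6.])

n_cols = 3  # hypothetical column count (it would be 6 for your CSV)

# Option 1: fill columns first (Fortran order).
by_columns = flat.reshape((-1, n_cols), order='F')

# Option 2: reshape row-wise to (n_cols, n_rows), then transpose.
transposed = flat.reshape((n_cols, -1)).T

# A plain row-wise reshape puts the values in the wrong cells.
by_rows = flat.reshape((-1, n_cols))

print(by_columns)
print(by_rows)
```

Both options give the same table. Which one to use is mostly a matter of taste.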
To get ALL of the data in NumPy, pass count=-1 (read to the end of the file):
a = fromfile(file=f, dtype=float64, count=-1)
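As a rough illustration of why `count` matters here: with a NaN in front of every value, `count=10` returns only five real numbers. The sketch below uses a made-up temporary file rather than your data, and it filters with `np.isnan` instead of relying on the NaNs sitting at even positions.

```python
import os
import tempfile

import numpy as np

# Made-up file: 12 values, each preceded by a NaN (24 doubles in total).
values = np.arange(1, 13, dtype=np.float64)
interleaved = np.empty(values.size * 2, dtype=np.float64)
interleaved[0::2] = np.nan
interleaved[1::2] = values

# Write to a throwaway location (the file name is arbitrary).
path = os.path.join(tempfile.mkdtemp(), 'count_demo.dat')
interleaved.tofile(path)

# count=10 stops after ten doubles: five NaNs and five values.
first_ten = np.fromfile(path, dtype=np.float64, count=10)

# count=-1 (the default) reads to the end of the file.
everything = np.fromfile(path, dtype=np.float64, count=-1)

# Keep only the entries that are not NaN.
clean = everything[~np.isnan(everything)]
print(first_ten.size, everything.size, clean.size)
```

One caveat: the `np.isnan` filter would also drop any genuine NaN measurements, so slicing with `[1::2]` is the safer choice if your data can legitimately contain NaN.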
Upvotes: 4