ATMA

Reputation: 1466

Import .dat file as an array

I have a .dat file that looks like this.

ID_1,5.0,5.0,5.0,... 
ID_2,5.0,5.0,5.0,...

I'm trying to import the data into Python as an array.

If I do this, it gives me what looks like a list of tuples (a 1-D structured array) rather than a 2-D array.

data = np.genfromtxt('mydat.dat',
                     dtype=None,
                     delimiter=',')

However, when I do the following it gives an odd result, probably because that first element is not a float.

np.fromfile('mydat.dat', dtype=float)

array([  3.45301146e-086,   3.45300781e-086,   3.25195588e-086, ...,
         8.04331780e-096,   8.04331780e-096,   1.31544776e-259])

Any suggestions on this? These were the two main ways to import .dat files into Python as an array and they don't seem to provide the desired result.

Upvotes: 5

Views: 57112

Answers (2)

Anil_M

Reputation: 11473

Here is one way: read each line of 'mydat.dat', convert each value to a float where possible (leaving it as a str otherwise), and then load the result into a numpy array.

import numpy as np

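def is_float(string):
    """Return True if the string can be parsed as a float, else False."""
    try:
        float(string)
        return True  # returning the parsed value would make "0.0" look falsy
    except ValueError:
        return False

data = []
with open('mydat.dat', 'r') as f:
    for line in f:
        fields = line.rstrip().split(",")
        # Keep non-numeric fields (such as the ID) as strings.
        data.append([float(x) if is_float(x) else x for x in fields])

# dtype=object lets one array hold both strings and floats.
data = np.array(data, dtype='O')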

Result

>>> data
array([['ID_1', 5.0, 5.0, 5.0],
       ['ID_2', 5.0, 5.0, 5.0]], dtype=object)
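
An object array like this holds generic Python objects, so numeric operations on it are slow. If the goal is a plain float array, one option is to split the ID column from the numeric columns. This is only a sketch: the names ids and values are made up for illustration, and it assumes the first column is always the ID and every other column is numeric.

```python
import numpy as np

# Same object array as in the result above, rebuilt inline so the
# snippet runs on its own.
data = np.array([["ID_1", 5.0, 5.0, 5.0],
                 ["ID_2", 5.0, 5.0, 5.0]], dtype=object)

# Column 0 holds the IDs; convert it to a string array.
ids = data[:, 0].astype(str)

# The remaining columns are numeric; convert them to a float array.
values = data[:, 1:].astype(float)
```

Here values is an ordinary 2-D float array, and ids keeps the row labels in the same order.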

Also, if you can use pandas to read and manipulate the data, I would do so. pandas is efficient, especially for larger data sets, and makes the data easy to manipulate.

# Read the file as CSV into a DataFrame
>>> import pandas as pd
>>> df = pd.read_csv('mydat.dat', sep=",", header=None)
>>> df
      0    1    2    3
0  ID_1  5.0  5.0  5.0
1  ID_2  5.0  5.0  5.0

#Transposed data with ID numbers as headers
>>> df.T
      0     1
0  ID_1  ID_2
1     5     5
2     5     5
3     5     5

Upvotes: 5

ShreyasG

Reputation: 806

You might want to use numpy.loadtxt. Its dtype argument lets you specify a format for each column.
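
As a rough sketch of that suggestion, the snippet below passes a structured dtype to np.loadtxt. The field names ("id", "a", "b", "c") and the three numeric columns are assumptions, since the question elides the real column count; an in-memory io.StringIO stands in for 'mydat.dat'.

```python
import io
import numpy as np

# Stand-in for 'mydat.dat'; the real file has more columns than shown here.
sample = io.StringIO("ID_1,5.0,5.0,5.0\nID_2,5.0,5.0,5.0\n")

# One (name, type) pair per column; the field names are hypothetical.
dtype = [("id", "U10"), ("a", float), ("b", float), ("c", float)]

# The result is a 1-D structured array with one record per line.
records = np.loadtxt(sample, delimiter=",", dtype=dtype)

# Pull out the IDs and stack the numeric fields into a 2-D float array.
ids = records["id"]
values = np.column_stack([records[name] for name in ("a", "b", "c")])
```

With a real file, replace sample with the file name and extend the dtype list to match the actual number of columns.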

Upvotes: 3
