Ben Lynch

Reputation: 61

Python: numpy.loadtxt invalid literal for float() error parsing data

    import matplotlib 
    import matplotlib.pyplot as plt
    import matplotlib.ticker as mticker
    import matplotlib.dates as mdates
    import numpy as np
    import time

    month,day,year,time,price = np.loadtxt('spy_testdata.txt', delimiter=' ')

Above is my code. I am getting the following error:

ValueError: invalid literal for float(): 9:30

A sample of the file I am trying to parse:

8 18 2014 9:30 196.79

This is one-minute tick data for SPY. It looks like it is getting hung up trying to parse the "time" column. I know it has to do with the colon in the value, but I don't know what the workaround is to let me read in that data.

Upvotes: 2

Views: 7579

Answers (2)

hpaulj

Reputation: 231385

np.genfromtxt with dtype=None will try to deduce the field types.

In [630]: txt=['8 18 2014 9:30 196.79']
In [631]: np.genfromtxt(txt,delimiter=' ',dtype=None)
Out[631]: 
array((8, 18, 2014, '9:30', 196.79), 
      dtype=[('f0', '<i4'), ('f1', '<i4'), ('f2', '<i4'), ('f3', 'S4'), ('f4', '<f8')])

In this case it finds three columns of ints, a string, and a float. '9:30' cannot be read as a number, so as a fallback it becomes a string. The result is a structured array.

You could also specify the column types, eg.

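# S5 (byte string) leaves room for five-character times such as 10:30;
# the older 'a' type code is deprecated in recent NumPy releases
dt='i4,i4,i4,S5,f8'
np.loadtxt(txt,delimiter=' ',dtype=dt)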

I think you'll have to parse the 9:30 on your own.
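One way to do that parsing, as a minimal sketch: hand loadtxt a converter for column 3 through its converters argument, so the time arrives as fractional hours. The helper name hhmm_to_hours and the second sample row are made up for illustration, and the bytes check is only there because older NumPy versions pass bytes rather than str to converters.

```python
import numpy as np

def hhmm_to_hours(value):
    """Convert an 'H:MM' field such as '9:30' to fractional hours (9.5)."""
    if isinstance(value, bytes):  # older NumPy passes bytes to converters
        value = value.decode()
    hours, minutes = value.split(':')
    return int(hours) + int(minutes) / 60.0

# Two rows in the question's format; the second row is invented for the demo.
lines = ['8 18 2014 9:30 196.79',
         '8 18 2014 10:45 196.80']

# unpack=True returns one array per column, which is what the
# tuple assignment in the question expects.
month, day, year, hours, price = np.loadtxt(
    lines, delimiter=' ', converters={3: hhmm_to_hours}, unpack=True)

print(hours)
```

Every column comes back as float here, since loadtxt's default dtype is float; combine this with an explicit dtype if the integer columns need to stay integers.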

A more elaborate dtype specification would be:

dt=np.dtype([('month',int,),
             ('day',int,),
             ('year',int,),
             ('time','S5'),
             ('price',float)])
result=np.loadtxt(txt,delimiter=' ',dtype=dt)
result['price']
# array(196.79)

You probably could combine 4 of those fields into one np.datetime64 field, but that's another question.
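For what it's worth, a rough sketch of that combination, assuming the times are always H:MM or HH:MM: build an ISO 8601 string from the four fields and pass it to np.datetime64. The helper name row_to_datetime64 and the second sample row are hypothetical, and this splits each line by hand rather than going through loadtxt.

```python
import numpy as np

def row_to_datetime64(line):
    """Build a minute-resolution np.datetime64 from 'M D YYYY H:MM price'."""
    month, day, year, clock, _price = line.split()
    hour, minute = clock.split(':')
    # Zero-pad each part so the string is valid ISO 8601.
    iso = "%04d-%02d-%02dT%02d:%02d" % (
        int(year), int(month), int(day), int(hour), int(minute))
    return np.datetime64(iso)

# The second row is invented for the demo.
lines = ['8 18 2014 9:30 196.79',
         '8 18 2014 10:45 196.80']

stamps = np.array([row_to_datetime64(line) for line in lines])
print(stamps)
```

With the timestamps in one array, differences such as stamps[1] - stamps[0] come out as np.timedelta64 values.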

Upvotes: 0

ivarsj10s

Reputation: 103

The reason you're having this problem is that 9:30 is not a valid float literal, so float() cannot convert it. As a workaround, you could simply open the text file yourself:

myFile = open("myText.txt", "r")

If it's a one-line text file, you can then read it and split it on spaces:

myString = myFile.read()

myList = myString.split(' ')

That turns the string into a list, and in the list you can edit the time entry: change 9:30 to 9.30, or treat the minutes as base 60 (30/60) and convert them to 0.5 in decimal, which gives 9.5 hours.
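A small sketch of that idea, using only the standard library: split one line, then convert the H:MM entry to decimal hours. The function name parse_tick is made up here, and the field order is assumed to match the sample line in the question.

```python
def parse_tick(line):
    """Parse 'M D YYYY H:MM price' into numbers, with time as decimal hours."""
    month, day, year, clock, price = line.split()
    hours, minutes = clock.split(':')
    # Minutes are base 60, so 30 minutes becomes 0.5 of an hour.
    decimal_hours = int(hours) + int(minutes) / 60.0
    return int(month), int(day), int(year), decimal_hours, float(price)

print(parse_tick('8 18 2014 9:30 196.79'))
# (8, 18, 2014, 9.5, 196.79)
```

For a file with many lines, the same function can be applied to each line in a loop over the open file.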

Alternatively, you could use a different time format in the input, for example with the parts of the time in separate columns.

Upvotes: 1
