Reputation:
In my CSV file I have a time column and several data columns. I need to convert the time into a float using pandas, but it gives me the error: invalid literal for int() with base 10: 'g'. Can you please suggest how to solve this error? My code is:
def time_to_float(t):
    """ convert "hh:mm:ss" to float (0, 1) only of the correct format """
    if t == '-':
        return None
    a = [int(i) for i in t.split(":")]
    if len(a) == 3:
        return round((a[0] + a[1] / 60 + a[2] / 3600) / 24, 5)
    else:
        return t
def pick_column(data_, n, start=1):
    """ pick all the n'th column data starting from "start" """
    return [time_to_float(data_[i][n]) for i in range(start, len(data_))]
data = pd.read_csv('data4.csv')
data = [i for i in data]
Time = pick_column(data, 0)
g = pick_column(data, 1)
p = pick_column(data, 2)
c = pick_column(data, 3)
y = pick_column(data, 4)
print(Time)
print(g)
print(p)
print(c)
print(y)
My data set is:
Time     g    p    c    y
0:06:15  141  NaN  NaN  141
0:08:00  NaN  10   NaN  117
0:09:00  NaN  15   NaN  103
0:09:25  95   NaN  NaN  95
0:09:30  NaN  NaN  50   93
Upvotes: 1
Views: 236
Reputation: 26169
Normally, if the first column already holds datetime values, you can convert the whole column to seconds since the epoch with something like
t = df[df.columns[0]].astype('int64') / 1e9
print(t)
If the column holds only strings, you need to convert it to datetimes first, something like
timecol = df.columns[0]
df[timecol] = pd.to_datetime(df[timecol])
and then run the first snippet.
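A minimal sketch of the two steps above, assuming the column is named Time and holds H:MM:SS strings as in the question. Because such strings carry no date, the parsed values get a placeholder date, so this sketch subtracts midnight to keep only the time of day. It also divides by a Timedelta instead of casting to int64, which is a substitution on my part so the result does not depend on the datetime resolution.

```python
import pandas as pd

# Sample values copied from the question's data set
df = pd.DataFrame({"Time": ["0:06:15", "0:08:00", "0:09:00", "0:09:25", "0:09:30"]})

# Step 1: strings -> datetimes. With an explicit format, the missing date
# defaults to 1900-01-01.
parsed = pd.to_datetime(df["Time"], format="%H:%M:%S")

# Step 2: datetimes -> float seconds. Subtracting midnight of the same day
# removes the placeholder date and leaves seconds since midnight.
seconds = (parsed - parsed.dt.normalize()) / pd.Timedelta(seconds=1)
print(seconds.tolist())
```

For the first row this gives 375.0, i.e. 6 minutes and 15 seconds.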
Upvotes: 0
Reputation: 1607
I think this is what you need.
This is your sample Time:
print(df['Time'])
1:06:15
To convert this into seconds since midnight, you can do it like this:
df['TimeFloat'] = (pd.DatetimeIndex(df['Time']).astype(np.int64)/10**9)%86400
The modulus by 86400 is taken because there are 86400 seconds in one day, so it drops the date part and leaves only the seconds since midnight.
You can change the modulus value to match the unit you are converting to (seconds, minutes, milliseconds).
Also, if you need an integer result, you can simply use // instead of /.
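The arithmetic behind the modulus can be checked with plain numbers. This is only an illustration: the day count below is a hypothetical value, not something taken from the question.

```python
SECONDS_PER_DAY = 86400

# Hypothetical timestamp: 19675 whole days after the epoch, plus 1:06:15
epoch_seconds = 19675 * SECONDS_PER_DAY + (1 * 3600 + 6 * 60 + 15)

# The modulus discards the whole days and keeps the seconds since midnight
seconds_of_day = (epoch_seconds / 1) % SECONDS_PER_DAY   # true division: float
whole_seconds = (epoch_seconds // 1) % SECONDS_PER_DAY   # floor division: int

print(seconds_of_day, whole_seconds)
```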
The final df would be this:
Time     TimeFloat
1:06:15  3975.0
Upvotes: 1
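If the goal is the fraction of a day in (0, 1) that the question's time_to_float computes, one possible sketch uses pd.to_timedelta instead of DatetimeIndex. This is a different technique from the answer above, and the column names TimeSeconds and TimeFloat are only illustrative.

```python
import pandas as pd

# Sample values copied from the question's data set
df = pd.DataFrame({"Time": ["0:06:15", "0:08:00", "0:09:00", "0:09:25", "0:09:30"]})

# Parse "H:MM:SS" as durations; entries that cannot be parsed become NaT
td = pd.to_timedelta(df["Time"], errors="coerce")

# Seconds since midnight as floats
df["TimeSeconds"] = td.dt.total_seconds()

# Fraction of a day, rounded to 5 places like the question's function
df["TimeFloat"] = (td / pd.Timedelta(days=1)).round(5)

print(df)
```

Working on the whole column this way avoids calling int() on individual strings, which is where the question's code raises the error.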