Reputation: 45
I'm trying to set up my dataset for a linear regression. I can't simply fit the dates in the LinReg model because it needs numerical values, so I'm trying to use int()
to convert the date string into an int. But I get an error: ValueError: invalid literal for int() with base 10
The code:
import pandas as pd

df = pd.read_csv('data/Customers.csv')
print(int(df.date[0]))  # raises the ValueError: the date string is not an integer literal
Upvotes: 1
Views: 346
Reputation: 1379
Try parsing the date as a Timestamp and calling toordinal() on it:
In [14]: import io, pandas as pd
...:
...: text = "date\n01/31/2021\n"
...: buff = io.StringIO(text)
...: df = pd.read_csv(buff, converters={"date": pd.Timestamp})
...: ts = df.date[0]
In [15]: ts
Out[15]: Timestamp('2021-01-31 00:00:00')
In [16]: ts.toordinal()
Out[16]: 737821
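To prepare a whole column for the regression rather than a single value, the same idea can be applied row by row. This is only a sketch: the sample rows, the `sales` column, the `date_ordinal` column name, and the `%m/%d/%Y` format are assumptions standing in for whatever is actually in data/Customers.csv.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for data/Customers.csv;
# the "sales" column is invented for illustration.
text = "date,sales\n01/31/2021,10\n02/01/2021,12\n02/02/2021,15\n"
df = pd.read_csv(io.StringIO(text))

# Parse the strings into Timestamps. The format is an assumption
# about how the dates are written in the real file.
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%Y")

# Convert every Timestamp to its integer ordinal (one int per row).
df["date_ordinal"] = df["date"].map(lambda ts: ts.toordinal())

# Numeric feature matrix and target, ready to pass to a regression model.
X = df[["date_ordinal"]]
y = df["sales"]

print(df["date_ordinal"].tolist())
```

Consecutive days map to consecutive integers, so the spacing between dates is preserved in the feature.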
datetime.toordinal() returns the proleptic Gregorian ordinal of the date, where January 1 of year 1 has ordinal 1. pd.Timestamp is a subclass of datetime, so it has the same method.
Since January 1 of year 1 has ordinal 1, January 2 of year 1 has ordinal 2, and so on: the ordinal is a count of days, which makes it usable as a numeric feature.
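The numbering can be checked with the standard library alone. The sketch below assumes the date string looks like "01/31/2021" (the question does not show the actual format) and also shows why calling int() on it directly fails.

```python
from datetime import date, datetime

raw = "01/31/2021"  # assumed format; the real column may differ

# int() only accepts strings that look like integers, so this raises.
try:
    int(raw)
    int_failed = False
except ValueError:
    int_failed = True

# Parse the string first, then ask for the ordinal.
parsed = datetime.strptime(raw, "%m/%d/%Y")
ordinal = parsed.toordinal()

# The count starts at 1 on January 1 of year 1 and grows by one per day.
first_day = date(1, 1, 1).toordinal()
second_day = date(1, 1, 2).toordinal()

# The conversion is reversible, so predictions can be mapped back to dates.
round_trip = date.fromordinal(ordinal)

print(int_failed, ordinal, first_day, second_day, round_trip)
```

Because date.fromordinal() inverts the mapping, an ordinal produced by the model can be turned back into a calendar date.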
Upvotes: 2