Ron Ancheta

Reputation: 45

ValueError: invalid literal for int() with base 10 when converting a string to int in Python

I'm trying to set up my dataset for a linear regression. I can't fit the dates directly in the LinReg model because it needs numerical values, so I'm trying to use int() to convert the date string into an int. But I get an error: ValueError: invalid literal for int() with base 10

The code:

import pandas as pd

df = pd.read_csv('data/Customers.csv')

# This is the line that raises the ValueError
print(int(df.date[0]))

Upvotes: 1

Views: 346

Answers (1)

madbird

Reputation: 1379

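int() only accepts strings that look like whole numbers, so a date string such as "01/31/2021" (the format assumed here) cannot be converted directly. Try parsing the date as a Timestamp and calling toordinal() on it: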

In [14]: import io, pandas as pd
    ...: 
    ...: text = "date\n01/31/2021\n"
    ...: buff = io.StringIO(text)
    ...: df = pd.read_csv(buff, converters={"date": pd.Timestamp})
    ...: ts = df.date[0]

In [15]: ts
Out[15]: Timestamp('2021-01-31 00:00:00')

In [16]: ts.toordinal()
Out[16]: 737821

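For a regression you would usually want the whole column converted, not just the first row. The following is a minimal sketch of one way to do that, not necessarily the only or best one. The sample data, the `sales` column, the `date_ordinal` column name, and the `%m/%d/%Y` date format are all assumptions, since the real contents of `data/Customers.csv` are not shown in the question.

```python
import io

import pandas as pd

# Hypothetical sample standing in for data/Customers.csv; the real column
# layout and date format are assumptions.
text = "date,sales\n01/31/2021,10\n02/01/2021,12\n02/02/2021,15\n"
df = pd.read_csv(io.StringIO(text))

# Parse the strings into Timestamps first; int() cannot read "01/31/2021".
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%Y")

# Map every Timestamp to its proleptic Gregorian ordinal (a plain integer).
df["date_ordinal"] = df["date"].map(lambda ts: ts.toordinal())

print(df["date_ordinal"].tolist())  # [737821, 737822, 737823]
```

The `date_ordinal` column can then be used as a numeric feature, since consecutive days differ by exactly 1.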

Source:

datetime.toordinal() is a method on objects of the datetime class. It returns the proleptic Gregorian ordinal of the date as an integer, where January 1 of year 1 has ordinal 1.

So January 2 of year 1 has ordinal 2, and so on: each later day adds 1.
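The ordinal numbering can be checked with the standard library alone. This small sketch uses `datetime.date` (the variable names are just for illustration) and also shows that `date.fromordinal` reverses the conversion, which is handy for turning model inputs back into readable dates.

```python
from datetime import date

# January 1 of year 1 is ordinal 1; each following day adds 1.
first = date(1, 1, 1).toordinal()
second = date(1, 1, 2).toordinal()

# The date used in the answer above.
sample = date(2021, 1, 31).toordinal()

# fromordinal() converts the integer back into a date.
roundtrip = date.fromordinal(sample)

print(first, second, sample, roundtrip)  # 1 2 737821 2021-01-31
```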

Upvotes: 2
