nano_horizon

Reputation: 1

Python/Pandas: TypeError: float() argument must be a string or a number, not 'function'

I am trying to plot two columns from a .csv file. The x-axis column holds dates in the short format mm/dd/yyyy, and the y-axis column holds absorption measurements as ordinary numeric values. I also want to fit a linear regression line to this data. Here is what I have so far:

mydateparser = lambda x: datetime.strptime(x, '%m/%d/%y')

df = (pd.read_csv('calibrationabs200211.csv', index_col=[], parse_dates=[0],
                  infer_datetime_format=True, date_parser=mydateparser))

if mydateparser == '%m/%d/%y':
    print('Error')
else:
    mydateparser = float(mydateparser)

plt.figure(figsize=(15,7.5))

x = df.iloc[:, 0].values.reshape(-1, 1)
y = df.iloc[:, 1].values.reshape(-1, 1)
linear_regressor = LinearRegression()
linear_regressor.fit(x, y)
y_pred = linear_regressor.predict(y)

plt.scatter(x, y, color='teal')
plt.plot(x, y_pred, color='teal')

plt.show()

However, I am getting an error message:

TypeError                                 Traceback (most recent call last)
<ipython-input-272-d087bdc00150> in <module>
     12     print('Error')
     13 else:
---> 14     mydateparser = float(mydateparser)
     15 
     16 plt.figure(figsize=(15,7.5))

TypeError: float() argument must be a string or a number, not 'function'

Furthermore, if I comment out the if statement, I do get a plot, but the linear regression is wrong. I am fairly new to Python, matplotlib, and pandas, so any help or feedback is greatly appreciated. Thank you!

Upvotes: 0

Views: 1866

Answers (3)

Saedeas

Reputation: 1578

Functions in Python are objects and can be passed around like any other variable, which is what is happening here. If you want to use the result of a function, you need to call it by adding parentheses (with any arguments) after the function name.

mydateparser is a function; mydateparser(x) is the result of calling that function on x.
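A minimal sketch of that difference, using only the standard library. The sample date string is made up, and I am assuming four-digit years as described in the question, which is why the format uses %Y (the %y directive only matches a two-digit year):

```python
from datetime import datetime

# Same idea as the parser in the question, with %Y for a four-digit year.
mydateparser = lambda s: datetime.strptime(s, "%m/%d/%Y")

# The bare name refers to the function object itself.
is_function = callable(mydateparser)

# Passing the function object to float() is what raises the TypeError.
try:
    float(mydateparser)
    raised_type_error = False
except TypeError:
    raised_type_error = True

# Calling the function with an argument returns a datetime.
parsed = mydateparser("02/11/2020")

# A datetime is still not a number; toordinal() is one way to get one.
as_number = float(parsed.toordinal())
```

Even after calling the parser, float() would not accept the resulting datetime directly, so some explicit date-to-number conversion is needed either way.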

Additionally, I don't think the comparison you're making makes sense. mydateparser is a function (one that returns a datetime object when called), and you are comparing it to a format string, so the test is always False and the else branch always runs. I'm actually not sure what you're trying to accomplish with that block at all.

Your regression needs the dates converted to numeric values to regress against. I would suggest using matplotlib's date conversion functions, specifically date2num, for this.

Should be something along the lines of:

from matplotlib import dates
...
# Select the first column by position; df[0] would look for a column named 0.
x = dates.date2num(df.iloc[:, 0])
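To make that concrete, here is a rough end-to-end sketch. The inline CSV, the column names date and absorbance, and the four data points are all made up for illustration, and I am assuming four-digit years. It also calls predict on x: the code in the question calls predict(y), which feeds the measurements back in as if they were dates and would explain a wrong-looking line.

```python
import io

import pandas as pd
from matplotlib import dates
from sklearn.linear_model import LinearRegression

# Hypothetical stand-in for the CSV: dates in mm/dd/yyyy, then absorbance.
csv_text = """date,absorbance
01/01/2020,0.10
01/02/2020,0.20
01/03/2020,0.30
01/04/2020,0.40
"""

df = pd.read_csv(io.StringIO(csv_text))
# %Y matches a four-digit year; %y only matches two digits.
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%Y")

# Convert each date to a float (days since matplotlib's epoch),
# shaped as a single-feature column for scikit-learn.
x = dates.date2num(df["date"].to_numpy()).reshape(-1, 1)
y = df["absorbance"].to_numpy()

model = LinearRegression()
model.fit(x, y)
y_pred = model.predict(x)  # predict from x, not from y

# Because x is in days, the slope is the change in absorbance per day.
slope_per_day = model.coef_[0]
```

For the plot, something like plt.scatter(df["date"], y) followed by plt.plot(df["date"], y_pred) should keep readable dates on the x-axis while the regression itself runs on the numeric values.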

Upvotes: 1

Asetti sri harsha

Reputation: 989

At the start of the code, you define mydateparser as a lambda function, but float() only accepts strings or numbers, so passing it the function itself raises the TypeError.

I assume you are using the date column as a feature for the linear regression model, which doesn't make sense.

Instead, you can derive new features such as month, year, day, and weekday/weekend to use in the linear regression.
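As a small sketch of what deriving such features could look like with pandas (the two sample dates and the column names are made up, and four-digit years are assumed):

```python
import pandas as pd

# Hypothetical sample dates in mm/dd/yyyy format.
date_col = pd.to_datetime(
    pd.Series(["02/10/2020", "02/15/2020"]), format="%m/%d/%Y"
)

# Break each date into calendar components via the .dt accessor.
features = pd.DataFrame({
    "year": date_col.dt.year,
    "month": date_col.dt.month,
    "day": date_col.dt.day,
    "weekday": date_col.dt.dayofweek,  # Monday=0 ... Sunday=6
})

# Saturday (5) and Sunday (6) count as the weekend.
features["is_weekend"] = features["weekday"] >= 5
```

Any of these columns can then be passed to the model as ordinary numeric inputs.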

If you are looking to predict values for future dates, you can look at time series forecasting models.

Upvotes: 0

user9611261

Reputation:

You have to call the lambda for it to function.

Upvotes: 0
