Reputation: 123
I need to use the line of best fit to predict a value from my data frame. How would I do this? Is there a function where, for example, I can input a year and be given a predicted value for life expectancy?
Year Life Expectancy
1930 59.7
1940 62.9
1950 70.2
1965 67.7
How would I calculate a value for the year 1948?
Upvotes: 1
Views: 4681
Reputation: 140
You can use:
import seaborn as sns
sns.lmplot(x='Year', y='Life Expectancy', data=data)
This fits a straight line to your data by linear regression and plots it, so you can read other values, such as the one for the year 1948, off the plot.
For documentation, refer to: https://seaborn.pydata.org/generated/seaborn.lmplot.html
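Note that lmplot only draws the regression line; it does not hand back the fitted coefficients. To get an actual number for 1948, the line has to be fitted separately. A minimal sketch, assuming the four rows from the question sit in a DataFrame called data (that name, and the choice of numpy.polyfit for the fit, are assumptions rather than anything seaborn provides):

```python
import numpy as np
import pandas as pd

# The four rows from the question; the DataFrame name `data` is an assumption.
data = pd.DataFrame({
    'Year': [1930, 1940, 1950, 1965],
    'Life Expectancy': [59.7, 62.9, 70.2, 67.7],
})

# Degree-1 least-squares fit; polyfit returns [slope, intercept].
slope, intercept = np.polyfit(data['Year'], data['Life Expectancy'], 1)

# Evaluate the fitted line at the year of interest.
year = 1948
predicted_1948 = slope * year + intercept
print('Predicted life expectancy for {}: {:.2f}'.format(year, predicted_1948))
```

With only four points the fit is rough, but for this data it comes out at about 65.6 years for 1948.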
Upvotes: 2
Reputation: 1284
As I had a bit of time, for fun, here is a complete example based on @ALollz's comment, using numpy.polyfit() and numpy.polyval().
%matplotlib inline
import pandas as pd
import numpy as np
# Generate some test data with a trend.
data = pd.DataFrame(
    {
        'year': list(range(1900, 2000)),
        'life_exp': np.linspace(50, 80, 100) * ((np.random.randn(100) * 0.1) + 1)
    }
)
data[['life_exp']].plot()
Giving:
# Fit coefficients (slope and intercept of the degree-1 polynomial).
coef = np.polyfit(data['year'], data['life_exp'], 1)
# Generate predictions for entire series.
data['predicted'] = pd.Series(np.polyval(coef, data['year']))
data[['life_exp', 'predicted']].plot()
Which gives us the result we want:
And we can predict a single year:
# Passing in a single year.
x = 1981
print('Predicted life expectancy for {}: {:.2f} years'.format(x, np.polyval(coef, x)))
Gives: Predicted life expectancy for 1981: 72.40 years
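Since the question asks for a function that takes a year and returns a prediction, the coefficients can also be wrapped in numpy.poly1d, which gives a callable object. A rough sketch: the helper name make_predictor and the perfectly linear toy data are made up for illustration and are not part of the example above.

```python
import numpy as np

# Hypothetical helper: fit a straight line and return a callable predictor.
def make_predictor(years, values):
    coef = np.polyfit(years, values, 1)  # [slope, intercept]
    return np.poly1d(coef)               # callable polynomial object

# Toy data invented for illustration: value = 0.3 * year - 520, no noise.
years = np.arange(1900, 2000)
values = 0.3 * years - 520.0

predict = make_predictor(years, values)

# A single year returns a scalar; a list of years returns an array.
single = predict(1981)
several = predict([1948, 1981, 2010])
print(single)
print(several)
```

The same predict object works for years outside the fitted range (2010 here), but that is extrapolation and should be treated with more caution than values inside it.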
Hopefully this is correct usage, and I learnt something answering this :)
Upvotes: 3