user9347880

Reputation: 123

How to predict a value using pandas data frame?

I need to use the line of best fit to predict a value from my data frame. How would I do this? Is there a function where, for example, I can input a year and get a predicted value for life expectancy?

Year    Life Expectancy
1930    59.7
1940    62.9
1950    70.2
1965    67.7

How would I calculate a value for the year 1948?

Upvotes: 1

Views: 4681

Answers (2)

Awaldeep Singh

Reputation: 140

You can use:

import seaborn as sns
# x and y are column names; the data frame is passed via the data keyword.
sns.lmplot(x='Year', y='Life Expectancy', data=data)

This fits a straight line to your data by linear regression and draws it. Note that lmplot only produces the plot, so a value for another year, such as 1948, has to be read off the fitted line.

For documentation, refer to: https://seaborn.pydata.org/generated/seaborn.lmplot.html
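If a number is needed rather than a plot, the same straight line can be computed directly. This is a minimal sketch, not part of the original answer: it assumes the four rows from the question, and uses the textbook least-squares formulas (slope = covariance / variance) with plain pandas instead of seaborn. The names df, slope, intercept and predicted are only illustrative.

```python
import pandas as pd

# The four data points from the question (assumed to be the whole data set).
df = pd.DataFrame({
    'Year': [1930, 1940, 1950, 1965],
    'Life Expectancy': [59.7, 62.9, 70.2, 67.7],
})

x, y = df['Year'], df['Life Expectancy']

# Ordinary least squares for a straight line:
#   slope     = cov(x, y) / var(x)
#   intercept = mean(y) - slope * mean(x)
slope = x.cov(y) / x.var()
intercept = y.mean() - slope * x.mean()

# Evaluate the fitted line at the year of interest.
year = 1948
predicted = intercept + slope * year

print('Predicted life expectancy for {}: {:.2f} years'.format(year, predicted))
```

With only four points the fit is very rough, so the result is best treated as an illustration of the method.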

Upvotes: 2

smj

Reputation: 1284

As I had a bit of time, here is a complete example for fun, based on @ALollz's comment, using numpy.polyfit() and numpy.polyval().

%matplotlib inline

import pandas as pd
import numpy as np

# Generate some test data with a trend.

data = pd.DataFrame(
    {
        'year': list(range(1900, 2000)),
        'life_exp': np.linspace(50, 80, 100) * ((np.random.randn(100) * 0.1) + 1)
    }
)

data[['life_exp']].plot()

Giving:

[Image: line plot of the generated life_exp series]

# Fit coefficients of a degree-1 polynomial (slope first, then intercept).

coef = np.polyfit(data['year'], data['life_exp'], 1)

# Generate predictions for entire series.

data['predicted'] = pd.Series(np.polyval(coef, data['year']))

data[['life_exp', 'predicted']].plot()

Which gives us the result we want:

[Image: line plot of life_exp with the fitted predicted line overlaid]

And we can predict a single year:

# Passing in a single year.

x = 1981

print('Predicted life expectancy for {}: {:.2f} years'.format(x, np.polyval(coef, x)))

Gives (the exact number varies from run to run, because the test data is random): Predicted life expectancy for 1981: 72.40 years

Hopefully this is correct usage, and I learnt something answering this :)
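For completeness, the same approach applied to the four rows from the question might look like the sketch below. It is not from the original answer; question_data, line and pred_1948 are made-up names, and np.poly1d is used only as a convenience so the fitted line can be called like a function.

```python
import numpy as np
import pandas as pd

# The table from the question, typed in by hand.
question_data = pd.DataFrame({
    'Year': [1930, 1940, 1950, 1965],
    'Life Expectancy': [59.7, 62.9, 70.2, 67.7],
})

# Fit a degree-1 polynomial; coefficients come back as [slope, intercept].
coef_q = np.polyfit(question_data['Year'], question_data['Life Expectancy'], 1)

# Wrap the coefficients so the line can be evaluated as line(year).
line = np.poly1d(coef_q)

pred_1948 = line(1948)
print('Predicted life expectancy for 1948: {:.2f} years'.format(pred_1948))
```

line also accepts a list or array of years, which is handy for predicting several values at once.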

Upvotes: 3
