moys

Reputation: 8033

Correlation values based on column & row values in Pandas

I have a dataframe as below

Prs 10  20  30  40  50
5   40  40  40  40  40
10  100 100 100 100 100
15  150 150 150 150 150
25  256 256 260 264 268
40  291 291 293 296 300

The first column is the pressure and the remaining columns are temperatures. The values in the table are the speed. The table is read as follows: for a pressure of 5 & a temperature of 10, the speed is 40. Similarly, for a pressure of 25 & a temperature of 40, the speed is 264.
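To make the reading rule concrete, here is a minimal sketch that rebuilds the table and reads the two cells mentioned above. It assumes the temperature column labels are strings (`"10"`, `"20"`, ...); the names `table`, `speed_5_10` and `speed_25_40` are only illustrative.

```python
import pandas as pd

# Rows are pressures, columns are temperatures, cells are speeds.
# Assumption: temperature labels are stored as strings.
df = pd.DataFrame(
    {
        "Prs": [5, 10, 15, 25, 40],
        "10": [40, 100, 150, 256, 291],
        "20": [40, 100, 150, 256, 291],
        "30": [40, 100, 150, 260, 293],
        "40": [40, 100, 150, 264, 296],
        "50": [40, 100, 150, 268, 300],
    }
)

# Index by pressure so that .loc[pressure, temperature] reads one cell.
table = df.set_index("Prs")
speed_5_10 = table.loc[5, "10"]    # pressure 5, temperature 10
speed_25_40 = table.loc[25, "40"]  # pressure 25, temperature 40
```

This only works for pressure and temperature values that appear in the table exactly, which is why intermediate values need interpolation.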

However, I want to know how I can get the speed for a pressure & temperature that are not explicitly in the table but lie inside the range it covers. For example, what would be the speed for pressure 12.6 & temperature 21.2? How do I do that? I could interpolate the pressure column at intervals of 0.1, do the same for the temperature, and then fill in the values, but that makes the table too cumbersome & complex.

Is there any other way to do it? Does the correlation function of Pandas come in handy here? Please guide. Note: in the complete table the speed sometimes decreases between intervals as well. For example, when prs=90 & temp=75, the speed is 515, but when prs=90 & temp=85, the speed is 480.

Upvotes: 1

Views: 59

Answers (1)

Serge Ballesta

Reputation: 148870

This is a 2D interpolation. Neither Pandas nor numpy alone offers methods for that. If it is worth it, you can install the full scipy package and then use scipy.interpolate.interp2d:

import scipy.interpolate
# interp2d expects z with shape (len(y), len(x)), hence the transpose
val = scipy.interpolate.interp2d(df['Prs'].values, df.columns[1:].astype('int').values,
                                 df.iloc[:, 1:].values.T)(12.6, 21.2)

This uses linear interpolation by default, but scipy also provides spline methods (the kind parameter accepts 'cubic' and 'quintic'); see the documentation for details.
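Note that interp2d was deprecated in SciPy 1.10 and removed in SciPy 1.14, so on a recent installation the call above may not be available. A possible alternative, sketched here under the assumption that the temperature column labels are strings and that both axes are sorted in ascending order, is scipy.interpolate.RegularGridInterpolator. The variable names are illustrative only.

```python
import pandas as pd
from scipy.interpolate import RegularGridInterpolator

# Same table as in the question (assumption: string column labels).
df = pd.DataFrame(
    {
        "Prs": [5, 10, 15, 25, 40],
        "10": [40, 100, 150, 256, 291],
        "20": [40, 100, 150, 256, 291],
        "30": [40, 100, 150, 260, 293],
        "40": [40, 100, 150, 264, 296],
        "50": [40, 100, 150, 268, 300],
    }
)

pressures = df["Prs"].to_numpy(dtype=float)
temperatures = df.columns[1:].astype(float).to_numpy()
# Shape (n_pressures, n_temperatures): one row per pressure.
speeds = df.iloc[:, 1:].to_numpy(dtype=float)

# The grid axes are given in the same order as the axes of `speeds`.
interpolator = RegularGridInterpolator(
    (pressures, temperatures), speeds, method="linear"
)

# Query points are (pressure, temperature) pairs; one result per point.
val = interpolator([[12.6, 21.2]])[0]
```

By default RegularGridInterpolator raises an error for points outside the grid, which matches the requirement that queries stay inside the table's range.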


If installing the full scipy is not an option, you can first interpolate each column by hand at the pressure value and then interpolate the resulting array at the temperature value:

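import numpy as np

def interp(x, y):
    # Interpolate each temperature column at pressure x
    it = [np.interp(x, df['Prs'], df.iloc[:, i]) for i in range(1, len(df.columns))]
    # Interpolate those per-column values at temperature y
    return np.interp(y, df.columns[1:].astype('float'), it)


interp(12.6, 21.2)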

This returns 126.0, as expected.
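One caveat with the numpy-only approach: np.interp does not raise for inputs outside the grid; it silently returns the nearest endpoint value. The following is a minimal sketch of a variant that takes the dataframe as an argument and rejects out-of-range queries. The function name `interp_speed` is hypothetical, and the sketch assumes string column labels and axes sorted in ascending order.

```python
import numpy as np
import pandas as pd

# Same table as in the question (assumption: string column labels).
df = pd.DataFrame(
    {
        "Prs": [5, 10, 15, 25, 40],
        "10": [40, 100, 150, 256, 291],
        "20": [40, 100, 150, 256, 291],
        "30": [40, 100, 150, 260, 293],
        "40": [40, 100, 150, 264, 296],
        "50": [40, 100, 150, 268, 300],
    }
)

def interp_speed(df, pressure, temperature):
    """Interpolate along pressure first, then along temperature."""
    pressures = df["Prs"].to_numpy(dtype=float)
    temperatures = df.columns[1:].astype(float).to_numpy()
    # np.interp clamps to the end values outside the grid, so check first.
    if not pressures[0] <= pressure <= pressures[-1]:
        raise ValueError("pressure is outside the table range")
    if not temperatures[0] <= temperature <= temperatures[-1]:
        raise ValueError("temperature is outside the table range")
    # One interpolated speed per temperature column at the given pressure.
    at_pressure = [
        np.interp(pressure, pressures, df[col].to_numpy(dtype=float))
        for col in df.columns[1:]
    ]
    return float(np.interp(temperature, temperatures, at_pressure))

result = interp_speed(df, 12.6, 21.2)
```

Because the interpolation is piecewise linear, it also copes with columns where the speed decreases between grid points; only the pressure and temperature axes need to be increasing.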

Upvotes: 1
