WDS

Reputation: 337

Linear Regression on each column without creating for loops or functions

I want to apply a linear regression to each of the columns (or rows) of a pandas DataFrame without using for loops.

There is a similar post, "Apply formula across pandas rows/ regression line", that does a regression for each of the "rows"; however, plotting the result shows that the answer given there is wrong. I couldn't comment on it as I do not have enough reputation. The main problem is that it takes the values of the columns but then uses the apply function on each row.

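Currently I only know how to do this one column at a time, e.g.:

import numpy as np
import pandas as pd
import scipy.stats

np.random.seed(1997)

df = pd.DataFrame(np.random.randn(10, 4))
# One regression per column, each written out by hand
first_stats = scipy.stats.linregress(df.index, df[0])
second_stats = scipy.stats.linregress(df.index, df[1])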

I was hoping to find a way to do this without writing a function or for loops, similar to pandas' df.sum(), but instead of a sum I want a regression that returns the slope, intercept, r-value, p-value and standard error.

Upvotes: 4

Views: 3532

Answers (1)

bubble

Reputation: 1672

Look at the following example:

import numpy as np
import pandas as pd
from scipy.stats import linregress

np.random.seed(1997)
df = pd.DataFrame(np.random.rand(100, 10))

# Regress each column on the index; the five result fields become the rows.
df.apply(lambda x: linregress(df.index, x), result_type='expand').rename(
    index={0: 'slope', 1: 'intercept', 2: 'rvalue', 3: 'p-value', 4: 'stderr'})

It should return what you want.
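If only the slope and intercept are needed, a fully vectorized alternative is possible, since np.polyfit accepts a 2-D y and fits every column in one call. This is a minimal sketch and not a replacement for linregress: it swaps in a different function, it does not return the r-value, p-value or standard error, and the name `fits` is just an illustrative choice.

```python
import numpy as np
import pandas as pd

np.random.seed(1997)
df = pd.DataFrame(np.random.rand(100, 10))

# Use the index as the independent variable, as in the apply-based version.
x = df.index.to_numpy(dtype=float)

# With a 2-D y, np.polyfit fits one degree-1 polynomial per column.
coeffs = np.polyfit(x, df.to_numpy(), deg=1)  # shape (2, n_columns)

# Coefficients come highest degree first: row 0 is the slope, row 1 the intercept.
fits = pd.DataFrame(coeffs, index=['slope', 'intercept'], columns=df.columns)
print(fits)
```

To regress each row instead of each column, the same call should work on the transposed data (df.T), with x replaced by the positions of the original columns.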

Upvotes: 5
