Apply Polyfit Function to find the slope for each dataframe column

Question

I am trying to apply the following function to calculate the slope and intercept for each dataframe column:

import pandas as pd
from scipy.stats import linregress

def fit_line(x, y):
    """Return slope, intercept of best fit line."""
    # Remove entries where either x or y is NaN.
    clean_data = pd.concat([x, y], axis=1).dropna(axis=0)  # drop incomplete rows
    # .items() replaces .iteritems(), which was removed in pandas 2.0.
    (_, x), (_, y) = clean_data.items()
    slope, intercept, r, p, stderr = linregress(x, y)
    return slope, intercept

I create a new dataframe with two columns. However, I don't know how to pass the first column as x and each of the other columns as y.

df['m'], df['b']  = df_freq.apply(fit_line(x?, y?), axis=1)

Here are the columns of the dataframe; all data are floats.

Index(['Time', '5', '10', '15', '20', '25', '30', '35', '40', '45', '50', '55', '60', '65', '70', '75', '80', '85', '90', '95', '100', '105', '110', '115', '120', '125', '130', '135', '140', '145', '150', '155', '160', '165', '170', '175', '180', '185', '190', '195', '200', '205', '210', '215', '220', '225', '230', '235', '240', '245'], dtype='object')

P.Tillmann · Accepted Answer

Edited: Sorry, I misread your question.

Edited 2: Took into account that append is not in-place by default.

I think the easiest way to achieve what you want is a for loop. Assuming you have different columns with y-values and the index as x-values:

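df_fit_parameter = pd.DataFrame()
for column in df_freq.columns:
  # Drop missing values so x (the index) and y stay aligned.
  df_lin_fit = df_freq[column].dropna()
  slope, intercept, r, p, stderr = linregress(df_lin_fit.index, df_lin_fit)
  # DataFrame.append was removed in pandas 2.0; pd.concat also returns a new frame, so reassign.
  df_fit_parameter = pd.concat([df_fit_parameter, pd.DataFrame({'m':slope,'b':intercept}, index=[column])])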
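
If the x-values live in the 'Time' column rather than in the index, as the column list in the question suggests, the same fit can also be done with apply. The following is only a minimal sketch under that assumption: fit_column is a hypothetical helper (not part of pandas or scipy), and the small df_freq built here is made-up sample data standing in for the real frame.

```python
import numpy as np
import pandas as pd
from scipy.stats import linregress

def fit_column(y, x):
    """Return slope ('m') and intercept ('b') of y against x, skipping NaN rows."""
    # Keep only the rows where both x and y are present.
    mask = y.notna() & x.notna()
    result = linregress(x[mask], y[mask])
    return pd.Series({'m': result.slope, 'b': result.intercept})

# Made-up stand-in for df_freq: a 'Time' column plus two value columns.
df_freq = pd.DataFrame({
    'Time': [0.0, 1.0, 2.0, 3.0, 4.0],
    '5': [1.0, 3.0, 5.0, 7.0, 9.0],       # follows y = 2*t + 1
    '10': [4.0, np.nan, 2.0, 1.0, 0.0],   # follows y = -t + 4, one value missing
})

time = df_freq['Time']
# apply() hands each remaining column to fit_column as a Series;
# x=time is forwarded to fit_column as a keyword argument.
df_fit_parameter = df_freq.drop(columns='Time').apply(fit_column, x=time).T
print(df_fit_parameter)
```

The result has one row per data column and the two columns 'm' and 'b', the same layout as the loop above. Whether apply is any faster than the explicit loop here is doubtful, since linregress is still called once per column; it mainly saves the manual bookkeeping.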
