Reputation: 157
I am trying to apply the following function to calculate the slope and intercept for each dataframe column:
import pandas as pd
from scipy.stats import linregress

def fit_line(x, y):
    """Return slope, intercept of best fit line."""
    # Remove entries where either x or y is NaN.
    clean_data = pd.concat([x, y], axis=1).dropna(axis=0)  # row-wise
    (_, x), (_, y) = clean_data.items()
    slope, intercept, r, p, stderr = linregress(x, y)
    return slope, intercept
I created a new dataframe with two columns, but I don't know how to pass the first column as x and each of the other columns as y:
df['m'], df['b'] = df_freq.apply(fit_line(x?, y?), axis=1)
Here are the columns of the dataframe; all data are floats.
Index(['Time', '5', '10', '15', '20', '25', '30', '35', '40', '45', '50', '55', '60', '65', '70', '75', '80', '85', '90', '95', '100', '105', '110', '115', '120', '125', '130', '135', '140', '145', '150', '155', '160', '165', '170', '175', '180', '185', '190', '195', '200', '205', '210', '215', '220', '225', '230', '235', '240', '245'], dtype='object')
Upvotes: 2
Views: 1798
Reputation: 2110
Edited: Sorry, I misread your question.
Edited 2: Took into account that append is not in-place by default.
I think the easiest way to achieve what you want is a for loop. Assuming you have several columns of y-values and the index holds the x-values:
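import pandas as pd
from scipy.stats import linregress

df_fit_parameter = pd.DataFrame()
for column in df_freq.columns:
    df_lin_fit = df_freq[column].dropna()
    slope, intercept, r, p, stderr = linregress(df_lin_fit.index, df_lin_fit)
    # DataFrame.append was removed in pandas 2.0; pd.concat also returns a new frame, so reassign.
    row = pd.DataFrame({'m': slope, 'b': intercept}, index=[column])
    df_fit_parameter = pd.concat([df_fit_parameter, row])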
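If the x-values are in the 'Time' column rather than in the index (as the column list in the question suggests), the same loop can take x from that column instead. Here is a minimal sketch of that variant. The helper name fit_columns and the toy data are made up for illustration, and I am assuming every other column should be fitted against 'Time':

```python
import numpy as np
import pandas as pd
from scipy.stats import linregress


def fit_columns(df, x_col="Time"):
    """Fit y = m*x + b for every column except x_col; returns one row per column."""
    x = df[x_col]
    rows = {}
    for column in df.columns.drop(x_col):
        # Keep only the rows where both x and this column are present.
        pair = pd.concat([x, df[column]], axis=1).dropna()
        result = linregress(pair[x_col], pair[column])
        rows[column] = {"m": result.slope, "b": result.intercept}
    return pd.DataFrame.from_dict(rows, orient="index")


# Toy data (hypothetical): '5' is exactly 2*Time + 1, '10' contains a NaN gap.
df_freq = pd.DataFrame({
    "Time": [0.0, 1.0, 2.0, 3.0],
    "5": [1.0, 3.0, 5.0, 7.0],
    "10": [0.0, np.nan, -1.0, -1.5],
})
params = fit_columns(df_freq)
print(params)
```

Collecting the results in a dict and building the dataframe once at the end also avoids growing a dataframe row by row inside the loop.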
Upvotes: 2