I would like to get a final dataframe in which the tuple 'key' is split into two columns, 'hr' and 'filename', respectively.
I would also like the output of the fit, a, b, c = popt, to be split into the three columns a, b, c.
In the current output dataframe, the last three columns do not contain the correct values. They show the initial a, b, c values, which are the initial guess for the fit. They should instead show the output of the fit (*popt).
I attach my code, the current wrong output, and an example of the correct output. Thank you in advance.
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit

new_df = pd.DataFrame(columns=['hr', 'filename', 'a', 'b', 'c'])
new_df.columns = ['hr', 'filename', 'a', 'b', 'c']
################### curve fitting ########################################
grouped_df = HL.groupby(["hr", "filename"])  # HL is my initial dataframe
for key, g in grouped_df:
    # initial guess for the fit parameters
    a = g['NPQ'].max()
    b = g['NPQ'].min()
    c = 0.36
    popt, pcov = curve_fit(model, g['time'], g['NPQ'], p0=np.array([a, b, c]), absolute_sigma=True)
    print('Estimated parameters: \n', popt)
    ##################### new data frame
    new_row = {'hr': key, 'a': a, 'b': b, 'c': c}
    new_df = new_df.append(new_row, ignore_index=True)
print(new_df)
An example of the correct output (I simplified it for efficiency):
hr    filename    a      b     c
8     20191129.0  21.22  0.55  0.45
8     20191129.0  ..     ..    ..
8     20191129.0  ..     ..    ..
14.0  20191129.0  ..     ..    ..
Upvotes: 0
Use k1 and k2 instead of key, because you do .groupby on two columns.
Create new_row with 'hr' as k1, and 'filename' as k2.
Instead of assigning the fit result to popt, you can assign it to (x, y, z).
for (k1, k2), g in df.groupby(["hr", "filename"]):
    ...
    (x, y, z), pcov = curve_fit(model, g['time'], g['NPQ'], p0=np.array([a, b, c]), absolute_sigma=True)
    ...
    new_row = {'hr': k1, 'filename': k2, 'a': x, 'b': y, 'c': z}
    new_df = new_df.append(new_row, ignore_index=True)
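To see why the loop variable can be unpacked this way, here is a minimal sketch showing that iterating over a two-column groupby yields a tuple key. The toy frame and its values are invented for illustration and are not the asker's data.

```python
import pandas as pd

# Toy data; the values are invented purely for illustration.
df = pd.DataFrame({
    "hr": [8, 8, 14, 14],
    "filename": [20191129.0, 20191129.0, 20191129.0, 20191129.0],
    "NPQ": [0.1, 0.2, 0.3, 0.4],
})

keys = []
for (k1, k2), g in df.groupby(["hr", "filename"]):
    # Each key is a 2-tuple (hr, filename), so it unpacks into k1 and k2.
    keys.append((k1, k2, len(g)))

print(keys)
```

Each entry holds the 'hr' value, the 'filename' value, and the number of rows in that group.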
Alternatively, key is a tuple because the .groupby is on more than one column, so you can extract the separate values by calling the appropriate index.
Create new_row with 'hr' as key[0], and 'filename' as key[1].
Since popt is indexable like a list or tuple (curve_fit returns it as a NumPy array), you can assign the appropriate indices to 'a', 'b', and 'c'.
for key, g in df.groupby(["hr", "filename"]):
    ...
    popt, pcov = curve_fit(model, g['time'], g['NPQ'], p0=np.array([a, b, c]), absolute_sigma=True)
    ...
    new_row = {'hr': key[0], 'filename': key[1], 'a': popt[0], 'b': popt[1], 'c': popt[2]}
    new_df = new_df.append(new_row, ignore_index=True)
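Note that DataFrame.append was removed in pandas 2.0, so on current versions the loop needs a different way to accumulate rows. One possible approach is to collect the rows in a list and build the dataframe once at the end. The sketch below is only an illustration: the exponential form of model is a hypothetical stand-in (the post never shows its definition), and HL is filled with synthetic, noise-free data rather than the asker's measurements.

```python
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit


def model(t, a, b, c):
    # Hypothetical model: the original post does not show its definition.
    return b + (a - b) * np.exp(-c * t)


# Synthetic stand-in for the HL dataframe (assumed values, no noise).
t = np.linspace(0, 10, 30)
true_params = {8: (2.0, 0.5, 0.4), 14: (3.0, 1.0, 0.3)}
frames = []
for hr, (a_true, b_true, c_true) in true_params.items():
    frames.append(pd.DataFrame({
        "hr": hr,
        "filename": 20191129.0,
        "time": t,
        "NPQ": model(t, a_true, b_true, c_true),
    }))
HL = pd.concat(frames, ignore_index=True)

rows = []
for (hr, filename), g in HL.groupby(["hr", "filename"]):
    # Initial guess, as in the question: max, min, and a fixed rate.
    p0 = [g["NPQ"].max(), g["NPQ"].min(), 0.36]
    popt, pcov = curve_fit(model, g["time"].to_numpy(), g["NPQ"].to_numpy(), p0=p0)
    a, b, c = popt  # fitted values, not the initial guess
    rows.append({"hr": hr, "filename": filename, "a": a, "b": b, "c": c})

# Build the result once instead of appending row by row.
new_df = pd.DataFrame(rows, columns=["hr", "filename", "a", "b", "c"])
print(new_df)
```

Building the frame from a list of dicts also avoids copying the whole dataframe on every iteration, which is what the repeated append calls did.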
Upvotes: 1