Martina Lazzarin

Reputation: 169

How to split a tuple into multiple columns in a dictionary

My code, the current (wrong) output, and an example of the correct output are below. Thank you in advance.

new_df = pd.DataFrame(columns=['hr', 'filename', 'a', 'b', 'c'])

################### curve fitting ########################################

grouped_df = HL.groupby(["hr", "filename"]) ## this is my initial dataframe

for key, g in grouped_df:

    a = g['NPQ'].max()
    b = g['NPQ'].min()
    c = 0.36

    popt, pcov = curve_fit(model, g['time'], g['NPQ'], p0 = np.array([a, b, c]), absolute_sigma=True)

    print('Estimated parameters: \n', popt)

    ##################### new data frame
    new_row = {'hr': key, 'a':a, 'b':b, 'c':c }
    new_df = new_df.append(new_row, ignore_index=True)

    print(new_df)

This is the wrong output: [image]

An example of the correct output (I simplified it for efficiency):

hr      filename      a      b     c 
8     20191129.0     21.22  0.55  0.45
8     20191129.0      ..     ..    ..
8     20191129.0      ..     ..    ..
14.0  20191129.0      ..     ..    ..

Upvotes: 0

Views: 493

Answers (1)

Trenton McKinney

Reputation: 62393

  • Extract the keys as k1 and k2 instead of key, because you do .groupby on two columns.
  • Then create new_row with 'hr' as k1, and 'filename' as k2
  • Instead of assigning the returned values to popt, you can assign them to (x, y, z).
for (k1, k2), g in df.groupby(["hr", "filename"]):
    ...
    (x, y, z), pcov = curve_fit(model, g['time'], g['NPQ'], p0=np.array([a, b, c]), absolute_sigma=True)
    ...
    new_row = {'hr': k1, 'filename': k2, 'a': x, 'b': y, 'c': z}
    new_df = new_df.append(new_row, ignore_index=True)
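To see why the unpacking works, here is a minimal sketch on a made-up dataframe (the values and the name `seen_keys` are assumptions for illustration, not data from the question). When `.groupby` is given a list of two columns, each key it yields is a 2-tuple, so it can be unpacked directly in the loop header.

```python
import pandas as pd

# Hypothetical stand-in for the question's dataframe.
df = pd.DataFrame({
    "hr": [8, 8, 14, 14],
    "filename": [20191129.0, 20191129.0, 20191129.0, 20191129.0],
    "NPQ": [0.1, 0.4, 0.2, 0.6],
})

seen_keys = []
for key, g in df.groupby(["hr", "filename"]):
    # key is a tuple with one element per grouping column
    k1, k2 = key
    seen_keys.append((type(key).__name__, k1, k2, len(g)))
    print(key, "-> hr:", k1, "filename:", k2, "rows:", len(g))
```

Writing `for (k1, k2), g in ...` is the same unpacking done one step earlier.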
  • Alternatively, key is a tuple because the .groupby is on more than one column, so you can extract the separate values by indexing it.
    • Create new_row with 'hr' as key[0], and 'filename' as key[1]
  • popt is returned by curve_fit as a 1-D NumPy array, so you can assign its elements by index to 'a', 'b', and 'c'.
for key, g in df.groupby(["hr", "filename"]):
    ...
    popt, pcov = curve_fit(model, g['time'], g['NPQ'], p0 = np.array([a, b, c]), absolute_sigma=True)
    ...
    new_row = {'hr': key[0], 'filename': key[1], 'a': popt[0], 'b': popt[1], 'c': popt[2]}
    new_df = new_df.append(new_row, ignore_index=True)
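Note that `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0, so the loops above only run on older versions. A possible sketch that avoids it collects the rows in a plain list and builds the dataframe once at the end. The `model` function and the synthetic data here are assumptions made up for illustration; the question does not show its own model or data.

```python
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit


def model(t, a, b, c):
    # Hypothetical model (not from the question): exponential decay towards b.
    return a * np.exp(-c * t) + b


# Synthetic, noise-free stand-in for the question's HL dataframe.
t = np.arange(10.0)
true_params = {8: (2.0, 0.5, 0.3), 14: (1.5, 0.2, 0.5)}
frames = []
for hr_value, (a_true, b_true, c_true) in true_params.items():
    frames.append(pd.DataFrame({
        "hr": hr_value,
        "filename": 20191129.0,
        "time": t,
        "NPQ": model(t, a_true, b_true, c_true),
    }))
HL = pd.concat(frames, ignore_index=True)

rows = []
for (hr, filename), g in HL.groupby(["hr", "filename"]):
    # Initial guesses chosen the same way as in the question.
    p0 = [g["NPQ"].max(), g["NPQ"].min(), 0.36]
    popt, pcov = curve_fit(model, g["time"], g["NPQ"], p0=p0)
    # Store the fitted parameters, not the initial guesses.
    rows.append({"hr": hr, "filename": filename,
                 "a": popt[0], "b": popt[1], "c": popt[2]})

# Build the result once instead of appending inside the loop.
new_df = pd.DataFrame(rows, columns=["hr", "filename", "a", "b", "c"])
print(new_df)
```

Building the dataframe once from a list of dicts also avoids copying the whole frame on every iteration, which is what repeated appends did.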

Upvotes: 1
