How to make a new dataframe from an existing dataframe by averaging out some of the columns

Question

I have a dataframe with the following columns:

cols = group_dataframe.columns
print(cols)

Index(['TEST_TXT', 'count', 'mean', 'std', 'LSL', 'USL', 'median', 'Cp', 'CpK', 'Cpu', 'Cpl', 'min', 'max', '25%',
       '50%', '75%'],
      dtype='object')

I want to build a new dataframe that holds the mean over all rows for columns such as "mean", "std", "Cp" and "Cpu", but the minimum and maximum for the "min" and "max" columns. The TEST_TXT column should be left out of the processing.

My code looks like this:

new_df = pd.DataFrame()
new_df["Group"] = np.asarray(test_group_name)

for col in cols:
    if col == "TEST_TXT":
        pass
    elif col in ["min","max"]:
        new_df[col] = np.min(group_dataframe[col].astype(float))
    else:
        new_df[col] = np.mean(group_dataframe[col].astype(float))

But this doesn't seem to fill the dataframe at all. The new dataframe should have only one row: the mean of the values for most columns, and the min/max for the others. Can anyone help find the error (if there is one), or suggest a better way to achieve the same result?

piterbarg · Accepted Answer

I would first create a dictionary with the aggregated values and then convert it into a DataFrame:

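res = {}
for col in cols:
    if col == "TEST_TXT":
        continue  # leave the text column out of the aggregation
    values = group_dataframe[col].astype(float)
    if col == "min":
        res[col] = values.min()
    elif col == "max":
        res[col] = values.max()
    else:
        res[col] = values.mean()

# Wrap the dict in a list so pandas builds exactly one row; a dict of bare
# scalars raises "If using all scalar values, you must pass an index".
new_df = pd.DataFrame([res])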
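
As a possible alternative to the explicit loop, the same one-row summary can be sketched with DataFrame.agg and a per-column mapping. The sample data, the agg_map name and the "group_1" label below are made up for illustration; only the column names come from the question. This may also explain the empty result in the question: assigning a scalar to a column of a DataFrame that has no rows creates the column but adds no rows.

```python
import pandas as pd

# Hypothetical stand-in for group_dataframe, using a subset of the columns
# from the question.
group_dataframe = pd.DataFrame({
    "TEST_TXT": ["a", "b", "c"],
    "mean": [1.0, 2.0, 3.0],
    "std": [0.1, 0.2, 0.3],
    "min": [0.5, 1.5, 2.5],
    "max": [1.5, 2.5, 3.5],
})

# Map each column to a reduction: "min" and "max" keep their meaning,
# every other numeric column is averaged, and TEST_TXT is skipped.
agg_map = {
    col: col if col in ("min", "max") else "mean"
    for col in group_dataframe.columns
    if col != "TEST_TXT"
}

# agg with a dict of single functions returns a Series indexed by column name;
# to_frame().T turns that Series into a one-row DataFrame.
numeric = group_dataframe.astype({col: float for col in agg_map})
new_df = numeric.agg(agg_map).to_frame().T

# Hypothetical group label; with one row present, a scalar broadcasts fine.
new_df.insert(0, "Group", "group_1")
print(new_df)
```

Keeping the reductions in a dictionary makes it easy to give other columns their own rule later (for example "sum" for "count") without touching the rest of the code.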
