when grouping dataframe join the values that are different for a certain column

Question

I'm first going to show with an example what I mean:

Let's suppose that I have this dataframe:

If I group the dataframe without the penultimate column (column_2), I want to end up with this:

And, if I have a dataframe where column_1 and last_column have the same values. I don't need to "join" or "append" the values in the "column_2", I just want an empty dataframe.

Does that make sense what I mean? What I've just have is this:

import pandas as pd

data = {'column_1': ['no', 'no', 'no', 'no'], 'column_2': ['spain', 'france', 'italy', 'germany'], "last_column": ['A', 'A', 'A', 'B']}

df = pd.DataFrame.from_dict(data)

aux = df.drop(columns = ['column_2'])
indices_to_keep = aux.groupby(aux.columns.to_list()).filter(lambda x : len(x)<2).index
df_to_keep = df.filter(items = indices_to_keep.to_list(), axis = 0)

My problem with this code, is that I don't know how to join the values on a single row when the df is being grouped.

when grouping dataframe join the values that are different for a certain column

Answers (1)

Related Questions