Pandas: reshaping data from multiple columns into a single column

Question

I have a data set and would like to reshape part of it. It always starts with the same first few columns, followed by a variable number of columns that group the data. If a key belongs to a group, that group's column is marked with an x. A key may belong to multiple groups, or to none. The data structure looks like this:

Key  Date Added Group1Name Group2Name Group3Name ... GroupXName
1    1/1/2018   x           X
2    1/1/2018               x
3    1/1/2018                          
4    1/1/2018   x 
5    1/1/2018                                         x

I want to reformat as:

Key  Date Added Group
1    1/1/2018   Group1Name,Group2Name
2    1/1/2018   Group2Name           
3    1/1/2018        
4    1/1/2018   Group1Name
5    1/1/2018   GroupXName

okk · Accepted Answer

Use apply with axis=1 so that the function is called once per row:

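def group_func(series):
    # Collect the names of the columns marked with an x in this row.
    values = []
    for val, idx in zip(series, series.index.values):
        # Compare with ==, not "is"; accept both 'x' and 'X'.
        if str(val).strip().lower() == 'x':
            values.append(str(idx))
    return ",".join(values)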

# List the group columns to combine (adjust to match your DataFrame).
cols_to_agg = ['Group1Name', 'Group2Name', 'Group3Name', 'Group4Name']
df.loc[:,'Group'] = df.loc[:,cols_to_agg].apply(group_func, axis=1)
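
As a self-contained sketch of the same idea, the snippet below builds a small sample frame modelled on the question. The sample values and the names id_cols, group_cols, marked, and result are illustrative assumptions, not part of the original post. Because the number of group columns varies, it assumes every column other than Key and Date Added is a group column, and it treats x and X as the same marker.

```python
import pandas as pd

# Hypothetical sample data mirroring the question; blank cells are empty strings.
df = pd.DataFrame({
    "Key": [1, 2, 3, 4, 5],
    "Date Added": ["1/1/2018"] * 5,
    "Group1Name": ["x", "", "", "x", ""],
    "Group2Name": ["X", "x", "", "", ""],
    "GroupXName": ["", "", "", "", "x"],
})

# Assumption: everything except the fixed leading columns is a group column.
id_cols = ["Key", "Date Added"]
group_cols = [c for c in df.columns if c not in id_cols]

# Boolean frame: True where a cell holds an x marker (case-insensitive).
marked = df[group_cols].apply(
    lambda col: col.astype(str).str.strip().str.lower() == "x"
)

# For each row, join the names of the marked columns with commas.
df["Group"] = marked.apply(
    lambda row: ",".join(row.index[row.to_numpy()]), axis=1
)

# Keep only the leading columns plus the combined Group column.
result = df[id_cols + ["Group"]]
print(result)
```

With this sample data, the Group column comes out as Group1Name,Group2Name, then Group2Name, an empty string, Group1Name, and GroupXName for keys 1 through 5.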
