Reputation: 923
I have a data set and would like to reshape part of it. It always starts with the same first few columns, followed by a variable number of columns that group the data. If a key belongs to a group, that group's column is marked with an x. Each key might belong to multiple groups, or to none at all. The data structure looks like this:
Key  Date Added  Group1Name  Group2Name  Group3Name  ...  GroupXName
1    1/1/2018    x           X
2    1/1/2018                x
3    1/1/2018
4    1/1/2018    x
5    1/1/2018                                             x
I want to reformat as:
Key  Date Added  Group
1    1/1/2018    Group1Name,Group2Name
2    1/1/2018    Group2Name
3    1/1/2018
4    1/1/2018    Group1Name
5    1/1/2018    GroupXName
Upvotes: 1
Views: 289
Reputation: 430
Use apply with the axis=1 param:
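def group_func(series):
    # Collect the labels of the columns whose cell is marked with an 'x'.
    values = []
    for val, idx in zip(series, series.index.values):
        # Compare with ==, not `is`; lower() also catches the 'X' in the sample.
        if str(val).lower() == 'x':
            values += [str(idx)]
    return ",".join(values)
cols_to_agg = ['Group1Name', 'Group2Name', 'Group3Name', 'Group4Name']
df.loc[:, 'Group'] = df.loc[:, cols_to_agg].apply(group_func, axis=1)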
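For reference, here is a minimal runnable sketch of the apply approach on the sample rows from the question. The sample frame is made up to mirror the question, and it assumes that blank cells are read in as NaN and that every column after the first two is a group column; adjust both if your data differs.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame mirroring the question; blanks assumed to be NaN.
df = pd.DataFrame({
    'Key': [1, 2, 3, 4, 5],
    'Date Added': ['1/1/2018'] * 5,
    'Group1Name': ['x', np.nan, np.nan, 'x', np.nan],
    'Group2Name': ['X', 'x', np.nan, np.nan, np.nan],
    'GroupXName': [np.nan, np.nan, np.nan, np.nan, 'x'],
})

def group_func(series):
    # Keep the labels of the columns whose cell holds an 'x' (either case).
    return ",".join(str(idx) for idx, val in series.items()
                    if str(val).strip().lower() == 'x')

# Assumption: every column after the first two is a group column.
cols_to_agg = df.columns[2:]
df['Group'] = df[cols_to_agg].apply(group_func, axis=1)

result = df[['Key', 'Date Added', 'Group']]
print(result)
```

Taking the group columns by position avoids hard-coding their names, which helps when the number of groups varies from file to file.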
Upvotes: 1
Reputation: 1606
It seems like you haven't tried much, and it's hard to reproduce your data from what you provided, but the idea is to replace each 'x' with the name of its column and then take the dataframe from wide to long format:
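import pandas as pd
# List every group column here; the '...' stands for the remaining names.
columns_to_consider = ['Group1Name', 'Group2Name', ... ]
for column in columns_to_consider:
    # Replace the marker ('x' or 'X') with the name of its own column.
    df[column] = df[column].str.replace('x', column, case=False, regex=True)
# Wide to long: one row per key and group column.
reshaped_df = pd.melt(df, id_vars=['Key', 'Date Added'], value_vars=columns_to_consider)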
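The melt leaves the data in long format, so one possible way to get back to a single comma-separated Group column is to drop the unmarked rows and join the group names per key. This is only a sketch under assumptions: the sample frame is invented to mirror the question, blanks are assumed to be NaN, and the names `id_cols`, `group_cols`, and `mark` are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame mirroring the question; blanks assumed to be NaN.
df = pd.DataFrame({
    'Key': [1, 2, 3, 4, 5],
    'Date Added': ['1/1/2018'] * 5,
    'Group1Name': ['x', np.nan, np.nan, 'x', np.nan],
    'Group2Name': ['X', 'x', np.nan, np.nan, np.nan],
    'GroupXName': [np.nan, np.nan, np.nan, np.nan, 'x'],
})

id_cols = ['Key', 'Date Added']
group_cols = [c for c in df.columns if c not in id_cols]

# Wide to long: the column name lands in 'Group', the marker in 'mark'.
long_df = pd.melt(df, id_vars=id_cols, value_vars=group_cols,
                  var_name='Group', value_name='mark')

# Keep only the cells that were actually marked (either case).
long_df = long_df[long_df['mark'].str.lower() == 'x']

# Join the group names for each key.
grouped = (long_df.groupby(id_cols, sort=False)['Group']
           .agg(','.join)
           .reset_index())

# Left-merge so keys without any group are kept, with an empty string.
result = df[id_cols].merge(grouped, on=id_cols, how='left').fillna({'Group': ''})
print(result)
```

Filtering on the marker directly means the replace loop is not needed in this variant, since melt already carries the column name alongside each cell.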
Upvotes: 1