Reputation: 923
I have a data set that I would like to reshape part of the results. The data set always starts with the first few columns and is followed by a variable number of columns that group the data. If the key belongs to that group, it will be marked by an x. Each key will not belong to more than one group. The data structure is like this:
Key Date Added Group1Name Group2Name Group3Name ... GroupXName
1 1/1/2018 x
2 1/1/2018 x
3 1/1/2018 x
4 1/1/2018 x
5 1/1/2018 x
I want to reformat as:
Key Date Added Group
1 1/1/2018 Group1Name
2 1/1/2018 Group2Name
3 1/1/2018 Group3Name
4 1/1/2018 Group1Name
5 1/1/2018 GroupXName
Upvotes: 1
Views: 59
Reputation: 2939
We can first convert this into an easier to use binary format:
group_cols = df.columns.difference(["Key", "Date Added"])
df[group_cols] = df[group_cols].replace({"":0, "x":1})
Then get max id in axis 1 i.e. the column in which we have the highest value, which will be 1
df["Group"] = df[group_cols].idxmax(axis=1)
And drop the original columns
df = df.drop(group_cols, axis=1)
Upvotes: 3