Reputation: 155
I am trying to convert a one-hot encoded DataFrame into a two-column DataFrame.
Is there any way I can iterate over the rows and columns and, wherever a value is 1,
fill in the column name instead?
Problem dataframe:
+------------------+-----+-----+
| sentence         | lor | sor |
+------------------+-----+-----+
| sam lived here   | 0   | 1   |
+------------------+-----+-----+
| drack lived here | 1   | 0   |
+------------------+-----+-----+
Solution dataframe:
+------------------+------+
| sentence         | tags |
+------------------+------+
| sam lived here   | sor  |
+------------------+------+
| drack lived here | lor  |
+------------------+------+
Upvotes: 1
Views: 102
Reputation: 1086
You can handle each one-hot column separately: select the rows where that column is 1, replace the 1 with the column's name, and rename the result to tags.
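import pandas as pd
# .loc with a single column label returns a Series, so rename it by name (columns= is only valid for a DataFrame)
lor_df = df.loc[df["lor"].eq(1), "lor"].replace(1, "lor").rename("tags")
sor_df = df.loc[df["sor"].eq(1), "sor"].replace(1, "sor").rename("tags")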
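After this, concatenate the individual results using pandas.concat, then drop the one-hot columns, which are no longer needed. The concatenated values keep their original row labels, so the assignment lines up with the matching sentences.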
df["tags"] = pd.concat([lor_df, sor_df], sort=False)
df.drop(columns=["lor", "sor"], inplace=True)
To remove any duplicate rows, we can use pandas.DataFrame.drop_duplicates:
df.drop_duplicates(inplace=True)
print(df)
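Putting the steps above together, here is a minimal self-contained sketch. The sample data is rebuilt from the table in the question, and the `tag_columns` list and `result` name are my own additions for illustration. It assumes each row has at most one column set to 1; if a row had several, the concatenated index would contain duplicates and the assignment would not align cleanly.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "sentence": ["sam lived here", "drack lived here"],
    "lor": [0, 1],
    "sor": [1, 0],
})

# Hypothetical helper list: the one-hot columns to collapse into "tags".
tag_columns = ["lor", "sor"]

# For each one-hot column, keep only the rows flagged with 1 and
# replace that 1 with the column's own name.
parts = [df.loc[df[col].eq(1), col].replace(1, col) for col in tag_columns]

# The concatenated Series keeps the original row labels, so the
# assignment aligns each tag with its sentence.
df["tags"] = pd.concat(parts)

# Drop the one-hot columns and any duplicate rows.
result = df.drop(columns=tag_columns).drop_duplicates()
print(result)
```

Looping over `tag_columns` avoids writing one line per column, which helps when there are more than two tags.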
Upvotes: 1