m2rik

Reputation: 135

Create Dummies for Multiple Columns on Unique Value in a Column

I have a DataFrame like the one below. CTI and RESOLUTION each have multiple categories. The goal is to create dummy variables for every CTI and RESOLUTION category, including the categories that have no entry for a given account.

    ACCOUNT  | CTI     | RESOLUTION
    59737001 | Data:HI | Customer Owned Issue / Customer Equipment
    59737001 | Data:HI | Repaired / Replaced Drop Underground
    13847688 | Data:OK | Not Repaired

My expected output is

    ACCOUNT  | CTI_Data:HI | CTI_Data:OK | RESOLUTION_Customer Owned... | RESOLUTION_Repaired/Repla.... | RESOLUTION_Not Repaired
    59737001 | 1           | 0           | 1                            | 1                             | 0

I know pd.get_dummies() creates dummies for multiple categories, but my case is different: I need a single row per account, as in the expected output above. Any help is appreciated.

Upvotes: 0

Views: 708

Answers (1)

scotscotmcc

Reputation: 3113

I believe you can get this by combining pd.get_dummies() with df.groupby().any(). The groupby().any() call returns True/False for each category per account, so you finish by converting the result to int.

import pandas as pd
# df is the DataFrame from your first example; columns= restricts the dummies to just those columns.
df2 = pd.get_dummies(df, columns=['CTI', 'RESOLUTION'])
df2.groupby('ACCOUNT').any().astype(int)  # one row per ACCOUNT: 1 if any of its rows has the category, else 0
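
To try the approach on the sample rows from the question, here is a self-contained sketch. The DataFrame is rebuilt by hand from the question's table, and the variable names `dummies` and `result` are only illustrative. Note that the dummy columns come out in sorted category order, which may differ from the column order in the expected output.

```python
import pandas as pd

# Sample rows copied from the question's table.
df = pd.DataFrame({
    "ACCOUNT": [59737001, 59737001, 13847688],
    "CTI": ["Data:HI", "Data:HI", "Data:OK"],
    "RESOLUTION": [
        "Customer Owned Issue / Customer Equipment",
        "Repaired / Replaced Drop Underground",
        "Not Repaired",
    ],
})

# One indicator column per CTI and RESOLUTION category; ACCOUNT is left untouched.
dummies = pd.get_dummies(df, columns=["CTI", "RESOLUTION"])

# Collapse to one row per ACCOUNT: any() is True if at least one of the
# account's rows falls in the category; astype(int) turns that into 1/0.
result = dummies.groupby("ACCOUNT").any().astype(int)
print(result)
```

Categories that an account never has (for example CTI_Data:OK for account 59737001) end up as 0, because get_dummies creates a column for every category seen anywhere in the data.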

Upvotes: 1
