jia Jimmy
jia Jimmy

Reputation: 1848

Use pandas generate a excel file

Say my data structure like :

data = {
        "ADVERTISING_AND_MARKETING":[
            "CREATIVE_AGENCY",
            "FULL_SERVICE_AGENCY",
            "BUYING_AGENCY",
 
        ],
        "AGRICULTURE":[
            "FARMING_AND_RANCHING",
            "FISHING_AND_HUNTING_AND_FORESTRY_AND_LOGGING"
        ],
        "AUTOMOTIVE":[
            "T1_MOTORCYCLE",
            "RECREATIONAL",
            "T1_AUTOMOTIVE_MANUFACTURER",
            "T2_DEALER_ASSOCIATIONS",
        ]
}

How can I generate a excel like :


vetical                           subvertical        
 
ADVERTISING_AND_MARKETING         CREATIVE_AGENCY
                                  FULL_SERVICE_AGENCY
                                  BUYING_AGENCY
                                  

                                  
AGRICULTURE                       FARMING_AND_RANCHING
                                  FISHING_AND_HUNTING_AND_FORESTRY_AND_LOGGING
                                  
AUTOMOTIVE                        T1_MOTORCYCLE
                                  RECREATIONAL
                                  T1_AUTOMOTIVE_MANUFACTURER 
                                  T2_DEALER_ASSOCIATIONS

Sorry I'm really not good at pandas, and hope your help.

Thanks

Upvotes: 0

Views: 84

Answers (2)

Aditya
Aditya

Reputation: 1377

When you do to_csv, the raw data is sent as the output to the file so in order to remove those duplicate values, you can do something like this. A bit hacky but works

df = pd.json_normalize(data = {
        "ADVERTISING_AND_MARKETING":[
            "CREATIVE_AGENCY",
            "FULL_SERVICE_AGENCY",
            "BUYING_AGENCY",
 
        ],
        "AGRICULTURE":[
            "FARMING_AND_RANCHING",
            "FISHING_AND_HUNTING_AND_FORESTRY_AND_LOGGING"
        ],
        "AUTOMOTIVE":[
            "T1_MOTORCYCLE",
            "RECREATIONAL",
            "T1_AUTOMOTIVE_MANUFACTURER",
            "T2_DEALER_ASSOCIATIONS",
        ]
})

df = df.T.explode(0).reset_index()   # Transpose and explode the list to multiple rows
df.loc[df['index'].duplicated(), 'index'] = '' # Find all duplicates in the column and replace them with empty strings
df.columns = ['verticals', 'subverticals']  # Assign/Rename the columns names
df.to_csv('output.csv', index=False)

Upvotes: 1

Nk03
Nk03

Reputation: 14949

Do you want this?

df  = pd.json_normalize(data).transpose().explode(0).reset_index().rename(columns={'index': 'vertical', 0 :'subvertical'})
df.to_excel('out.xlsx', index=False)

Output-

                    vertical                                   subvertical
0  ADVERTISING_AND_MARKETING                               CREATIVE_AGENCY
1  ADVERTISING_AND_MARKETING                           FULL_SERVICE_AGENCY
2  ADVERTISING_AND_MARKETING                                 BUYING_AGENCY
3                AGRICULTURE                          FARMING_AND_RANCHING
4                AGRICULTURE  FISHING_AND_HUNTING_AND_FORESTRY_AND_LOGGING
5                 AUTOMOTIVE                                 T1_MOTORCYCLE
6                 AUTOMOTIVE                                  RECREATIONAL
7                 AUTOMOTIVE                    T1_AUTOMOTIVE_MANUFACTURER
8                 AUTOMOTIVE                        T2_DEALER_ASSOCIATIONS

Upvotes: 1

Related Questions