Prachi
Prachi

Reputation: 75

How can I split multiple columns of a df into multiple rows?

I have recently started learning python. My df looks like:

**FRUITS**            **QUANTITY**        **MARKET_PRICE**             **SELLING_PRICE**
Apple                      5                    65                            50
Kiwi                       8                    80                            65
Orange                     6                    70                            55
Apple/Kiwi/Orange          9                65/80/70                       50/65/55
Apple                      4                    50                            40         
Orange                     7                    65                            45

i want to my df to look like:

**FRUITS**            **QUANTITY**        **MARKET_PRICE**             **SELLING_PRICE**
Apple                      5                    65                            50
Kiwi                       8                    80                            65
Orange                     6                    70                            55
Apple                      9                    65                            50
Kiwi                       9                    80                            65
Orange                     9                    70                            55
Apple                      4                    50                            40         
Orange                     7                    65                            45

how do i achieve this?

Upvotes: 3

Views: 57

Answers (1)

sammywemmy
sammywemmy

Reputation: 28669

#convert to lists
res = df.astype('str').to_numpy().tolist()

#split each entry using "/"
res1 = [[val.split("/") for val in ent]
        for ent in res]


#get the maximum length for each sublist
lengths = [max(len(v) for v in ent) for ent in res1]

#if the length of a value in a sublist is 
#less than the maximum length,
# increase its length
res2 = [[v*length 
         if len(v) < length else v 
         for v in ent] 
        for ent,length 
        in zip(res1,lengths)
       ]

#zip through each sublist and chain the final outcome
final = chain.from_iterable(zip(*ent) for ent in res2)

#read back into dataframe
output = pd.DataFrame(final,columns=df.columns)
output


**FRUITS**  **QUANTITY**    **MARKET_PRICE**    **SELLING_PRICE**
0   Apple       5                 65                  50
1   Kiwi        8                 80                  65
2   Orange      6                 70                  55
3   Apple       9                 65                  50
4   Kiwi        9                 80                  65
5   Orange      9                 70                  55
6   Apple       4                 50                  40
7   Orange      7                 65                  45

Upvotes: 1

Related Questions