Joyce Hong
Joyce Hong

Reputation: 15

Split a column of string then group by the index and then add them up?

so here is what I have in my dataframe

+--------+------------------+---------------------+--+
| Userid |      Country     |        food         |  |
+--------+------------------+---------------------+--+
|   1001 |  United States   | cheese;burger;pizza |  |
|   1002 | United States    | burger;pizza        |  |
|   1003 | Canada           | pizza;noodles       |  |
+--------+------------------+---------------------+--+

I need to make them into this way, i tried different ways,like making them into dictionary,but not works...

+---------------+--------+--------+-------+---------+
|    Country    | Cheese | Burger | pizza | noodles |
+---------------+--------+--------+-------+---------+
| United States |      1 |      2 |     2 |       0 |
| Canada        |      0 |      0 |     1 |       1 |
+---------------+--------+--------+-------+---------+

Upvotes: 0

Views: 23

Answers (1)

Scott Boston
Scott Boston

Reputation: 153460

Try this using the .str, string accessor, and get_dummies, then groupby 'Country' column and sum:

df['food'].str.get_dummies(';').groupby(df['Country']).sum()

Output:

               burger  cheese  noodles  pizza
Country                                      
Canada              0       0        1      1
United States       2       1        0      2

With reset_index:

df['food'].str.get_dummies(';').groupby(df['Country']).sum().reset_index()

Output:

         Country  burger  cheese  noodles  pizza
0         Canada       0       0        1      1
1  United States       2       1        0      2

Upvotes: 1

Related Questions