datascientist110
datascientist110

Reputation: 75

How to create Pivot Table with Python Dataframe based on columns's substring values and counts?

Dataset:


Item_Identifier  Item_Weight Item_Fat_Content  Item_Visibility  \
0           FDA15         9.30          Low Fat         0.016047   
1           DRC01         5.92          Regular         0.019278   
2           FDN15        17.50          Low Fat         0.016760   
3           FDX07        19.20          Regular         0.065953   
4           NCD19         8.93          Low Fat         0.065953   

               Item_Type  Item_MRP Outlet_Identifier  \
0                  Dairy  249.8092            OUT049   
1            Soft Drinks   48.2692            OUT018   
2                   Meat  141.6180            OUT049   
3  Fruits and Vegetables  182.0950            OUT010   
4              Household   53.8614            OUT013   

   Outlet_Establishment_Year Outlet_Size Outlet_Location_Type  \
0                       1999      Medium               Tier 1   
1                       2009      Medium               Tier 3   
2                       1999      Medium               Tier 1   
3                       1998      Medium               Tier 3   
4                       1987        High               Tier 3   

         Outlet_Type   Item_Type_new  
0  Supermarket Type1      perishable  
1  Supermarket Type2  non-perishable  
2  Supermarket Type1      perishable  
3      Grocery Store      perishable  
4  Supermarket Type1  non-perishable  

Pivotal Table:
Index: Item_Type, Columns: Substring of Item Identifiers, Values: counts.

Expected Output:

                          DR   FD   NC
  Baking Goods             0 1086    0
  Breads                   0  416    0
  Breakfast                0  186    0
  Canned                   0 1084    0
  Dairy                  229  907    0
  Frozen Foods             0 1426    0
  Fruits and Vegetables    0 2013    0
  Hard Drinks            362    0    0
  Health and Hygiene       0    0  858
  Household                0    0 1548
  Meat                     0  736    0
  Others                   0    0  280
  Seafood                  0   89    0
  Snack Foods              0 1989    0
  Soft Drinks            726    0    0
  Starchy Foods            0  269    0

Upvotes: 0

Views: 233

Answers (1)

Yong Wang
Yong Wang

Reputation: 1313

Create a new columns which is sub-string of item Item_Identifier. and then create pivot_table based on them.

Here is the code. (assuming the df is the dataframe with dataset)

df['Item_Identifier_substr'] = df['Item_Identifier'].str.left(2)
pivot_df = df.pivot_table(index = 'Item_Type', columns = 'Item_Identifier_substr', values='Item_Identifier', aggfunc='count')

pivot_df

If you like it, pls vote my answer.

Upvotes: 1

Related Questions