viktor_dmitry
viktor_dmitry

Reputation: 103

How to parse excel data and modify it into needed dataframe?

I have an excel like below: enter image description here

With it like a dataframe:

    ItemName     Category  Quantity
0   ProductA  Electronics       NaN
1          A  Electronics       1.0
2          B  Electronics       2.0
3          C  Electronics       3.0
4          D  Electronics       4.0
5          E  Electronics       5.0
6          F  Electronics       6.0
7   ProductB     Hardware       NaN
8          G     Hardware       7.0
9          H     Hardware       8.0
10         I     Hardware       9.0
11         J     Hardware      10.0
12  ProductC     Software       NaN
13         K     Software      11.0
14         L     Software      12.0

The first item name of any category is the item class. I need to separate this item class (ProductA, ProductB and ProductC) from the data, so that when I call df['ProductA'] I can print all the related information to this item name. (A,B,C,D,E and F)

So far I have tried this to extract the productA, B and C from the data.

import pandas as pd
import numpy as np
df = pd.read_excel(r'testing.xlsx')

index = df['Quantity'].index[df['Quantity'].apply(np.isnan)]
index_list=index.values.tolist()

for index in index_list:
    print(df['ItemName'][index])

But how can I modify the code so that I can call df['ProductA'] and get all the related info?

Upvotes: 0

Views: 102

Answers (1)

norie
norie

Reputation: 9857

You could use the following code to create a column named 'ProductClass'.

import pandas as pd
import numpy as np

df = pd.read_excel('Products.xlsx')

df['ProductClass'] = np.where(df['Quantity'].isna(), df['ItemName'], np.nan)

df['ProductClass']= df['ProductClass'].ffill()

df = df.dropna().reset_index().drop('index', axis=1)

You could then get all the rows for a product class like this.

print(df[df['ProductClass'] == 'ProductA'])
OUTPUT
  ItemName     Category  Quantity ProductClass
0        A  Electronics       1.0     ProductA
1        B  Electronics       2.0     ProductA
2        C  Electronics       3.0     ProductA
3        D  Electronics       4.0     ProductA
4        E  Electronics       5.0     ProductA
5        F  Electronics       6.0     ProductA

Upvotes: 1

Related Questions