Reputation: 103
With it like a dataframe:
ItemName Category Quantity
0 ProductA Electronics NaN
1 A Electronics 1.0
2 B Electronics 2.0
3 C Electronics 3.0
4 D Electronics 4.0
5 E Electronics 5.0
6 F Electronics 6.0
7 ProductB Hardware NaN
8 G Hardware 7.0
9 H Hardware 8.0
10 I Hardware 9.0
11 J Hardware 10.0
12 ProductC Software NaN
13 K Software 11.0
14 L Software 12.0
The first item name of any category is the item class. I need to separate this item class (ProductA, ProductB and ProductC) from the data, so that when I call df['ProductA'] I can print all the related information to this item name. (A,B,C,D,E and F)
So far I have tried this to extract the productA, B and C from the data.
import pandas as pd
import numpy as np
df = pd.read_excel(r'testing.xlsx')
index = df['Quantity'].index[df['Quantity'].apply(np.isnan)]
index_list=index.values.tolist()
for index in index_list:
print(df['ItemName'][index])
But how can I modify the code so that I can call df['ProductA'] and get all the related info?
Upvotes: 0
Views: 102
Reputation: 9857
You could use the following code to create a column named 'ProductClass'.
import pandas as pd
import numpy as np
df = pd.read_excel('Products.xlsx')
df['ProductClass'] = np.where(df['Quantity'].isna(), df['ItemName'], np.nan)
df['ProductClass']= df['ProductClass'].ffill()
df = df.dropna().reset_index().drop('index', axis=1)
You could then get all the rows for a product class like this.
print(df[df['ProductClass'] == 'ProductA'])
OUTPUT
ItemName Category Quantity ProductClass
0 A Electronics 1.0 ProductA
1 B Electronics 2.0 ProductA
2 C Electronics 3.0 ProductA
3 D Electronics 4.0 ProductA
4 E Electronics 5.0 ProductA
5 F Electronics 6.0 ProductA
Upvotes: 1