Reputation: 95
I am trying to build a dataframe in Python and I am not sure how to proceed. I am analyzing skincare ingredients, and I want a dataframe that I can regularly update with new products.
Each product's ingredients are stored as a Python list, for example:
product 1 = [ingredient 1, ingredient 2, ingredient 3]
In the end, the dataframe should look like this:
            ingredient 1   ingredient 2   ingredient 3
product 1   XYZ            XYZ            XYZ
product 2   XYZ            XYZ            XYZ
product 3   XYZ            XYZ            XYZ
The XYZ values are the items from my lists. How should I tackle this? The order of the ingredients must be preserved: the first item in the list should go under ingredient 1, the second under ingredient 2, and so on.
So far I have only found solutions that turn the list items into separate rows (observations), not into column values of a single row.
Is there a function or a method I can use to solve this issue?
Upvotes: 0
Views: 100
Reputation: 753
You can do this with a pandas DataFrame. First put the data into a NumPy array:
import numpy as np
import pandas as pd
my_data = np.array([['', 'ingredient1', 'ingredient2', 'ingredient3'], ['product1', 45, 35, 25], ['product2', 44, 34, 24], ['product3', 43, 33, 23]])
my_data
which gives you:
array([['', 'ingredient1', 'ingredient2', 'ingredient3'],
['product1', '45', '35', '25'],
['product2', '44', '34', '24'],
['product3', '43', '33', '23']], dtype='|S11')
Then you can make a dataframe based on your data:
df = pd.DataFrame(data=my_data[1:,1:],
index=my_data[1:,0],
columns=my_data[0,1:])
df
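This produces a dataframe with product1 to product3 as the row index and ingredient1 to ingredient3 as the columns. Note that the values are stored as strings, because a NumPy array holds a single dtype.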
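Since the question starts from plain Python lists rather than a NumPy array, here is a minimal sketch of a more direct route. It assumes the products are kept in a dict that maps each product name to its ordered ingredient list. The ingredient names and the helper `build_ingredient_frame` are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical example data: product name -> ordered ingredient list.
products = {
    "product 1": ["water", "glycerin", "niacinamide"],
    "product 2": ["water", "squalane"],
}

def build_ingredient_frame(products):
    """Return a DataFrame with one row per product and one column per list position."""
    # orient="index" makes each dict key a row; shorter lists are padded
    # with missing values on the right.
    df = pd.DataFrame.from_dict(products, orient="index")
    # The columns come back as 0, 1, 2, ...; rename them so that list
    # position 0 becomes "ingredient 1", position 1 "ingredient 2", etc.
    df.columns = [f"ingredient {i + 1}" for i in range(df.shape[1])]
    return df

df = build_ingredient_frame(products)

# To add a product later, update the dict and rebuild the frame.
products["product 3"] = ["aloe", "water", "panthenol", "tocopherol"]
df = build_ingredient_frame(products)
print(df)
```

Rebuilding from the dict keeps the list order intact and copes with products that have different numbers of ingredients, which assigning a single new row to an existing frame would not handle without padding the list first.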
Upvotes: 2