Ilyes

Reputation: 53

Define a column type as 'list' in Pandas

I have a pandas DataFrame df. I want it to have three columns: the first is a brand name (a string), the second is a list of integers, and the third is a list of floats. So for each brand I have two lists, and I want to put them all in one DataFrame so that I can easily access the lists by brand name.

I have:

count = [1,5,198,0,0,35]
brand = 'Nike'

and to put the count list into the 'count' column of the 'Nike' row, I tried the following:

df[df['brand']==brand].loc[0,'count'] = count
df[df['brand']==brand]['count'] == count
df[df['brand']==brand]['count'].loc[0] == count

None of these work: I get either "ValueError: Must have equal len keys and value when setting with an iterable" or the warning "A value is trying to be set on a copy of a slice from a DataFrame", and nothing changes in df.

How can I write a list into a pandas DataFrame cell?

Upvotes: 5

Views: 15177

Answers (3)

ysearka

Reputation: 3855

You can use your brands as column names:

import pandas as pd

df = pd.DataFrame({'Nike' : [[1,5,198,0,0,35],[0.5,0.3,0.2]]},index = ['count','floats'])

and then you can add new brands like this:

df['Puma'] = [[1,2,3],[0.1,0.2]]

You will obtain this dataframe:

        Nike                    Puma
count   [1, 5, 198, 0, 0, 35]   [1, 2, 3]
floats  [0.5, 0.3, 0.2]         [0.1, 0.2]

Then accessing the values is really simple.
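As a minimal sketch of what that access could look like (the variable names nike_counts and puma_floats are only illustrative), each cell can be read by row label and brand column:

```python
import pandas as pd

# Rebuild the frame from this answer: brands as columns, list kinds as rows.
df = pd.DataFrame(
    {'Nike': [[1, 5, 198, 0, 0, 35], [0.5, 0.3, 0.2]]},
    index=['count', 'floats'],
)
df['Puma'] = [[1, 2, 3], [0.1, 0.2]]

# .loc and .at both take (row label, column label) and return the stored list.
nike_counts = df.loc['count', 'Nike']
puma_floats = df.at['floats', 'Puma']

print(nike_counts)
print(puma_floats)
```

Both lookups return the Python list stored in the cell, so it can be used like any other list afterwards.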

Upvotes: 4

knagaev

Reputation: 2957

It seems to me that you are building the wrong data model. It is not in first normal form (1NF), and you will run into a lot of trouble using it. Please try a normalized model instead:

   Brand  price
0   Nike   50.0
1   Nike   60.0
2   Nike   70.0
3   Puma   30.0
4   Puma  100.0

From this model you can easily compute any per-brand value.
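For example, here is a rough sketch using groupby on the table above (the names mean_price and prices_by_brand are made up for illustration). It also shows that the per-brand lists can still be recovered when they are needed:

```python
import pandas as pd

# One row per (brand, price) observation, as in the normalized table.
df = pd.DataFrame({
    'Brand': ['Nike', 'Nike', 'Nike', 'Puma', 'Puma'],
    'price': [50.0, 60.0, 70.0, 30.0, 100.0],
})

# Aggregate per brand.
mean_price = df.groupby('Brand')['price'].mean()

# Collect each brand's prices back into a list if a list is really required.
prices_by_brand = df.groupby('Brand')['price'].apply(list)

print(mean_price)
print(prices_by_brand['Nike'])
```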

Upvotes: 5

B. M.

Reputation: 18638

You can create it like this. The dtype of the list columns will be object.

In [254]: df = pd.DataFrame({'Brand': ['Nike', 'Puma'],
     ...:                    'count': [[1, 2, 3], [0, 0]],
     ...:                    'price': [[50.] * 3, [100.] * 2]})

In [255]: df
Out[255]: 
  Brand      count               price
0  Nike  [1, 2, 3]  [50.0, 50.0, 50.0]
1  Puma     [0, 0]      [100.0, 100.0]
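To overwrite the list in one cell of such a frame afterwards, one possible approach is to look up the row label first and then assign with .at in a single step, instead of chaining df[...][...]. This is only a sketch: it assumes the 'count' column already has object dtype (as it does here) and that the brand appears in exactly one row. The names new_count and row_label are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'Brand': ['Nike', 'Puma'],
                   'count': [[1, 2, 3], [0, 0]],
                   'price': [[50.] * 3, [100.] * 2]})

new_count = [1, 5, 198, 0, 0, 35]

# Find the index label of the 'Nike' row (assumes exactly one match).
row_label = df.index[df['Brand'] == 'Nike'][0]

# .at addresses a single cell, so the list is stored as one object
# instead of being spread over several cells.
df.at[row_label, 'count'] = new_count

print(df)
```

Because the assignment goes through df itself rather than through a filtered copy, the change is visible in df.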

Upvotes: 0
