Reputation: 382
I can find help and documentation on converting a pandas DataFrame to a dictionary where columns are keys and rows are values, but I am stuck when I want the values of one column as keys and the associated values from another column as values, so that a df like this
a b
1 car
1 train
2 boot
2 computer
2 lipstick
converts to the following dictionary: {'1': ['car','train'], '2': ['boot','computer','lipstick']}
I have a feeling it's something pretty simple, but I'm out of ideas. I tried df.groupby('a').to_dict() but was unsuccessful.
Any suggestions?
Upvotes: 1
Views: 1395
Reputation: 880339
You could view this as a groupby-aggregation (i.e., an operation which turns each group into one value -- in this case a list):
In [85]: df.groupby(['a'])['b'].agg(lambda grp: list(grp))
Out[85]:
a
1 [car, train]
2 [boot, computer, lipstick]
dtype: object
In [68]: df.groupby(['a'])['b'].agg(lambda grp: list(grp)).to_dict()
Out[68]: {1: ['car', 'train'], 2: ['boot', 'computer', 'lipstick']}
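For anyone who wants to run this outside an interactive session, here is a minimal self-contained sketch of the same idea. It assumes the example frame from the question and passes the built-in list directly to agg, which should behave like the lambda above. The string-key conversion at the end is an optional extra, added only because the question shows string keys.

```python
import pandas as pd

# Example data taken from the question.
df = pd.DataFrame({
    "a": [1, 1, 2, 2, 2],
    "b": ["car", "train", "boot", "computer", "lipstick"],
})

# Collect column "b" into one list per value of "a", then convert to a dict.
result = df.groupby("a")["b"].agg(list).to_dict()

# Optional: the question shows string keys, so convert them if needed.
result_str_keys = {str(key): values for key, values in result.items()}

print(result)
print(result_str_keys)
```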
Upvotes: 2
Reputation: 20563
Yes, because to_dict is a method of DataFrame (and Series), not of DataFrameGroupBy, so the object returned by groupby has no to_dict attribute.
DataFrame.to_dict(outtype='dict'): Convert DataFrame to dictionary. (Newer pandas versions name this parameter orient.)
You can read more about DataFrame.to_dict in the pandas documentation.
Take a look at this:
import numpy as np
import pandas as pd
df = pd.DataFrame([np.random.sample(9), np.random.sample(9)])
df.columns = list('abcdefghi')
# it will convert the DataFrame to dict, with {column -> {index -> value}}
df.to_dict()
{'a': {0: 0.53252618404947039, 1: 0.78237275521385163},
'b': {0: 0.43681232450879315, 1: 0.31356312459390356},
'c': {0: 0.84648298651737541, 1: 0.81417040486070058},
'd': {0: 0.48419015448536995, 1: 0.37578177386187273},
'e': {0: 0.39840348154035421, 1: 0.35367537180764919},
'f': {0: 0.050381560155985827, 1: 0.57080653289506755},
'g': {0: 0.96491634442628171, 1: 0.32844653606404517},
'h': {0: 0.68201236712813085, 1: 0.0097104037581828839},
'i': {0: 0.66836630467152902, 1: 0.69104505886376366}}
type(df)
pandas.core.frame.DataFrame
# the object returned by DataFrame.groupby is a different type
type(df.groupby('a'))
pandas.core.groupby.DataFrameGroupBy
df.groupby('a').to_dict()
AttributeError: Cannot access callable attribute 'to_dict' of 'DataFrameGroupBy' objects, try using the 'apply' method
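As a small deterministic companion to the random example, the sketch below shows how the shape of the result changes with the orientation argument. It assumes a recent pandas where the keyword is orient (older releases called it outtype), and the frame named small is made up for illustration.

```python
import pandas as pd

# A tiny illustrative frame (hypothetical data, not from the answer above).
small = pd.DataFrame({"a": [1, 1, 2], "b": ["car", "train", "boot"]})

# Default orientation: {column -> {index -> value}}
by_column = small.to_dict()

# orient="list": {column -> [values]}
as_lists = small.to_dict(orient="list")

print(by_column)
print(as_lists)
```

Neither orientation groups one column by another, which is why the groupby step is still needed for the dictionary the question asks for.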
Upvotes: 1
Reputation: 5659
You can't call to_dict() on the result of groupby, but you can iterate over the groups to build the dictionary yourself. The following code works with the example you provided.
import pandas as pd
df = pd.DataFrame(dict(a=[1, 1, 2, 2, 2],
                       b=['car', 'train', 'boot', 'computer', 'lipstick']))
# Using a loop
dt = {}
for g, d in df.groupby('a'):
    dt[g] = d['b'].values
# Using dictionary comprehension
dt2 = {g: d['b'].values for g, d in df.groupby('a')}
Now both dt and dt2 will be dictionaries like this:
{1: array(['car', 'train'], dtype=object),
2: array(['boot', 'computer', 'lipstick'], dtype=object)}
Of course you can put the numpy arrays back into lists, if you so desire.
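One way to get plain lists instead of numpy arrays is to call tolist() on each group's column. This is a sketch under the same assumptions as the code above (the example frame from the question); the name dt_lists is introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "a": [1, 1, 2, 2, 2],
    "b": ["car", "train", "boot", "computer", "lipstick"],
})

# Same dictionary comprehension as before, but each value is a Python list
# rather than a numpy array.
dt_lists = {g: d["b"].tolist() for g, d in df.groupby("a")}

print(dt_lists)
```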
Upvotes: 1