Pi-R

Reputation: 654

pandas df to dict with groupby

I have this DataFrame:

line stop
1    1_a 
1    1_b 
1    1_c
2    2_a
2    2_c

I want to create the following dict:

d = {1: {"stops": ["1_a", "1_b", "1_c"]}, 2: {"stops": ["2_a", "2_c"]}}

Does anyone know how to do that with the to_dict method?

Thanks!

Upvotes: 1

Views: 361

Answers (2)

jezrael

Reputation: 862511

You can build a nested dictionary of lists by using DataFrame.groupby with apply, then Series.to_frame, and finally DataFrame.to_dict with orient 'index':

d = df.groupby('line')['stop'].apply(list).to_frame().to_dict('index')
print (d)
{1: {'stop': ['1_a', '1_b', '1_c']}, 2: {'stop': ['2_a', '2_c']}}
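Note that the inner key above is 'stop' (the column name), while the question asks for 'stops'. A minimal self-contained sketch of one way to get the requested key, assuming the sample data from the question and passing the new name to Series.to_frame:

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "line": [1, 1, 1, 2, 2],
    "stop": ["1_a", "1_b", "1_c", "2_a", "2_c"],
})

d = (
    df.groupby("line")["stop"]
    .apply(list)         # one list of stops per line
    .to_frame("stops")   # name the single column "stops" instead of "stop"
    .to_dict("index")    # {index value: {column name: cell value}}
)
print(d)
```

The name passed to to_frame becomes the inner dictionary key, so no separate rename step is needed.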

If you need to join the values with a separator, e.g. a comma:

d1 = df.groupby('line')['stop'].apply(','.join).to_frame().to_dict('index')
print (d1)
{1: {'stop': '1_a,1_b,1_c'}, 2: {'stop': '2_a,2_c'}}

EDIT:

Solution for multiple columns, using GroupBy.agg and omitting to_frame():

print (df)

   line stop  lat  lon
0     1  1_a    2    2
1     1  1_b    3    1
2     1  1_c    4    3
3     2  2_a    5    6
4     2  2_c    6    6

d = df.groupby('line')[['stop','lat','lon']].agg(list).to_dict('index')
print (d)
{1: {'stop': ['1_a', '1_b', '1_c'], 'lat': [2, 3, 4], 'lon': [2, 1, 3]},
 2: {'stop': ['2_a', '2_c'], 'lat': [5, 6], 'lon': [6, 6]}}
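If the inner keys should differ from the column names, named aggregation is one option. This is only a sketch: the output names "stops", "lats" and "lons" are chosen here for illustration and are not from the question.

```python
import pandas as pd

# Sample data copied from the printed DataFrame above.
df = pd.DataFrame({
    "line": [1, 1, 1, 2, 2],
    "stop": ["1_a", "1_b", "1_c", "2_a", "2_c"],
    "lat": [2, 3, 4, 5, 6],
    "lon": [2, 1, 3, 6, 6],
})

# Named aggregation: output_name=(source_column, aggregation_function).
d_named = (
    df.groupby("line")
    .agg(stops=("stop", list), lats=("lat", list), lons=("lon", list))
    .to_dict("index")
)
print(d_named)
```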

Upvotes: 1

sammywemmy

Reputation: 28644

You could skip to_dict entirely and iterate over the groups to build your dictionary, since you are not doing any computation:

{key: {"stops": ",".join(value.stop.array)}
 for key, value in df.groupby("line")}


{1: {'stops': '1_a,1_b,1_c'}, 2: {'stops': '2_a,2_c'}}

Or you could leave the sub values as a list:

{key: {"stops": list(value.stop.array)}
 for key, value in df.groupby("line")}

{1: {'stops': ['1_a', '1_b', '1_c']}, 2: {'stops': ['2_a', '2_c']}}
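For reference, a self-contained version of the comprehension approach, assuming the sample data from the question. It uses Series.tolist() rather than list(value.stop.array); for this data both give the same lists.

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "line": [1, 1, 1, 2, 2],
    "stop": ["1_a", "1_b", "1_c", "2_a", "2_c"],
})

# Iterating a groupby yields (group key, sub-DataFrame) pairs.
result = {
    key: {"stops": group["stop"].tolist()}
    for key, group in df.groupby("line")
}
print(result)
```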

Upvotes: 1
