user4279562

Reputation: 669

Converting pandas dataframe to a dictionary with a new key name

I know how to convert a dataframe to a dictionary but I'm not sure how I can create a dictionary with an arbitrary key name added.

Let's say I have a dataframe like the following.

import pandas as pd

raw_data = {'regiment': ['Nighthawks', 'Nighthawks', 'Nighthawks', 'Nighthawks', 'Dragoons', 'Dragoons', 'Dragoons', 'Dragoons', 'Scouts', 'Scouts', 'Scouts', 'Scouts'],
'company': ['1st', '1st', '2nd', '2nd', '1st', '1st', '2nd', '2nd','1st', '1st', '2nd', '2nd'],
'name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze', 'Jacon', 'Ryaner', 'Sone', 'Sloan', 'Piger', 'Riani', 'Ali'],
'preTestScore': [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3],
'postTestScore': [25, 94, 57, 62, 70, 25, 94, 57, 62, 70, 62, 70]}

df = pd.DataFrame(raw_data, columns = ['regiment', 'company', 'name', 'preTestScore', 'postTestScore'])

df.head()
Out[96]: 
     regiment company      name  preTestScore  postTestScore
0  Nighthawks     1st    Miller             4             25
1  Nighthawks     1st  Jacobson            24             94
2  Nighthawks     2nd       Ali            31             57
3  Nighthawks     2nd    Milner             2             62
4    Dragoons     1st     Cooze             3             70

I want to group by 'name', compute the maximum of 'preTestScore', and finally create a dictionary like the following.

{'Miller': {'maxTestScore': 4},
 'Jacobson': {'maxTestScore': 24}, ...}

Here, I added a new key name 'maxTestScore'. How could I achieve this with any arbitrary key name? Thank you so much in advance.

Upvotes: 1

Views: 47

Answers (1)

jezrael

Reputation: 862691

You can use a dict comprehension with groupby:

d = {k:{'maxTestScore':v.max()} for k,v in df.groupby('name')['preTestScore']}
print (d)

{'Piger':   {'maxTestScore': 3}, 
 'Milner':  {'maxTestScore': 2}, 
 'Sone':    {'maxTestScore': 31}, 
 'Jacon':   {'maxTestScore': 4},
 'Cooze':   {'maxTestScore': 3}, 
 'Sloan':   {'maxTestScore': 2},
 'Riani':   {'maxTestScore': 2}, 
 'Miller':  {'maxTestScore': 4}, 
 'Ali':     {'maxTestScore': 31}, 
 'Ryaner':  {'maxTestScore': 24}, 
 'Jacobson':{'maxTestScore': 24}}
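
Since the question asks for an arbitrary key name, the comprehension above could be wrapped in a small function that takes the key as a parameter. This is only a sketch: the name max_by_group and its parameters are made up for illustration and are not part of pandas, the sample frame is a shortened version of the one in the question, and the int() call assumes the scores are integers (it turns NumPy integers into plain Python ints).

```python
import pandas as pd

def max_by_group(df, group_col, value_col, key_name):
    """Return {group: {key_name: max of value_col within that group}}."""
    # Iterating a grouped column yields (group label, Series) pairs.
    grouped = df.groupby(group_col)[value_col]
    return {k: {key_name: int(v.max())} for k, v in grouped}

# Shortened sample data; 'Ali' appears twice to show the max being taken.
df = pd.DataFrame({'name': ['Miller', 'Jacobson', 'Ali', 'Ali'],
                   'preTestScore': [4, 24, 31, 3]})

d = max_by_group(df, 'name', 'preTestScore', 'maxTestScore')
print(d)
```

Passing a different string as key_name changes the inner key without touching the rest of the code.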

Another solution:

d = {k:{'maxTestScore':v} for k,v in df.groupby('name')['preTestScore'].max().items()}
print (d)

{'Piger':   {'maxTestScore': 3}, 
 'Milner':  {'maxTestScore': 2}, 
 'Sone':    {'maxTestScore': 31}, 
 'Jacon':   {'maxTestScore': 4},
 'Cooze':   {'maxTestScore': 3}, 
 'Sloan':   {'maxTestScore': 2},
 'Riani':   {'maxTestScore': 2}, 
 'Miller':  {'maxTestScore': 4}, 
 'Ali':     {'maxTestScore': 31}, 
 'Ryaner':  {'maxTestScore': 24}, 
 'Jacobson':{'maxTestScore': 24}}
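
A variation without a comprehension, in case it is useful: name the aggregated column after the desired key and let pandas build the nested dictionary. This sketch uses a shortened version of the question's data, and key_name is just an illustrative variable.

```python
import pandas as pd

# Shortened sample data; 'Ali' appears twice to show the max being taken.
df = pd.DataFrame({'name': ['Miller', 'Jacobson', 'Ali', 'Ali'],
                   'preTestScore': [4, 24, 31, 3]})

key_name = 'maxTestScore'  # any label can be used here

d2 = (df.groupby('name')['preTestScore']
        .max()                      # Series indexed by name
        .to_frame(key_name)         # one-column DataFrame named after the key
        .to_dict(orient='index'))   # {index value: {column name: value}}
print(d2)
```

With orient='index', each row label becomes an outer key and the column name becomes the inner key, which matches the shape asked for in the question.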

Upvotes: 2
