Reputation: 62
I want to create a Python dictionary from a pandas DataFrame that maps column 2 (source) to column 3 (description), grouped by column 1 (title). I also want values only for a given list of titles: titles = ['Test1','Test2']
   title source   description
1  Test1    ABC  description1
2  Test2    ABC  description2
3  Test2    DEF  description3
4  Test3    XYZ  description4
output = {'Test1': {'ABC': 'description1'}, 'Test2': {'ABC': 'description2', 'DEF': 'description3'}}
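For anyone who wants to reproduce this, the sample frame above can be rebuilt roughly as follows. This is a minimal sketch: the index values 1 to 4 are assumed from the printed table, and all columns are assumed to hold plain strings.

```python
import pandas as pd

# Sample data copied from the table in the question.
# Assumption: the leading 1..4 in the printout is the DataFrame index.
df = pd.DataFrame(
    {
        "title": ["Test1", "Test2", "Test2", "Test3"],
        "source": ["ABC", "ABC", "DEF", "XYZ"],
        "description": [
            "description1",
            "description2",
            "description3",
            "description4",
        ],
    },
    index=[1, 2, 3, 4],
)

# Titles to keep in the resulting dictionary.
titles = ["Test1", "Test2"]

print(df)
```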
Upvotes: 2
Views: 80
Reputation: 8302
Try this:
result = {}
filter_ = ['Test1','Test2']
for x in df[df['title'].isin(filter_)].to_dict(orient='records'):
    result.setdefault(x['title'], {}).update({x['source']: x['description']})
{'Test1': {'ABC': 'description1'}, 'Test2': {'ABC': 'description2', 'DEF': 'description3'}}
Upvotes: 0
Reputation: 862591
Filter first with boolean indexing and Series.isin, then use GroupBy.apply with a lambda that turns each group into a dict (giving a Series of dicts), and finally convert the result with Series.to_dict:
titles = ['Test1','Test2']
d = (df[df['title'].isin(titles)]
       .groupby('title')[['source','description']]
       .apply(lambda x: dict(x.to_numpy()))
       .to_dict())
print(d)
{'Test1': {'ABC': 'description1'}, 'Test2': {'ABC': 'description2', 'DEF': 'description3'}}
Upvotes: 4
Reputation: 148
You can group the DataFrame by title and then use Python's zip function to build the inner dictionary from source and description. The code below does this.
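titles = ['Test1', 'Test2']  # titles to keep, as given in the question
final_dict = dict()
all_groups = df.groupby('title')
for title in titles:
    # Rows belonging to this title only
    title_group = all_groups.get_group(title)
    # Pair each source with its description
    source_desc = dict(zip(title_group.source, title_group.description))
    # Key by the title string (a DataFrame is not hashable)
    final_dict[title] = source_desc
print(final_dict)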
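One caveat with this loop: get_group raises a KeyError when a requested title does not occur in the frame. A hedged, self-contained sketch of one way to skip such titles is shown here; the sample frame is rebuilt from the question, and 'Test9' is a hypothetical missing title added only for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame(
    {
        "title": ["Test1", "Test2", "Test2", "Test3"],
        "source": ["ABC", "ABC", "DEF", "XYZ"],
        "description": [
            "description1",
            "description2",
            "description3",
            "description4",
        ],
    }
)

# 'Test9' is hypothetical: it is not in the frame and should be skipped.
titles = ["Test1", "Test2", "Test9"]

all_groups = df.groupby("title")
final_dict = {}
for title in titles:
    # .groups maps each group key to its row labels, so a membership
    # test avoids the KeyError that get_group would raise.
    if title not in all_groups.groups:
        continue
    title_group = all_groups.get_group(title)
    final_dict[title] = dict(zip(title_group["source"], title_group["description"]))

print(final_dict)
```

Whether to skip missing titles silently or to store an empty dict for them depends on what the caller expects; skipping is just the choice made in this sketch.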
Upvotes: 2