Reputation: 309
I am wondering if there is a cleaner, more efficient way to do this. I currently use two for loops to do the following:
import pandas as pd

data = {'orig_state': ['TN','TN','TN','TX','TX','IL'],
        'orig_state_fn': ['Tennessee','Tennessee','Tennessee','Texas','Texas','Illinois'],
        'dest_state': ['CA','TN','TN','TX','IL','CA']
        }
df = pd.DataFrame(data, columns=['orig_state','orig_state_fn','dest_state'])

state_options = []
# First loop: build one label per origin state (str() of the array leaves "['...']" around the name)
for state in df['orig_state'].unique():
    state_options.append({'label': str(df[df['orig_state'] == state]['orig_state_fn'].unique()) + " "
                                   + str(df[df['orig_state'] == state]['dest_state'].count())
                                   + " Packages",
                          'value': state})
# Second loop: strip the leftover brackets and quotes from each label
for i in range(len(state_options)):
    state_options[i]['label'] = state_options[i]['label'].replace("['", "").replace("']", "")
Output:
>>> state_options
[{'label': 'Tennessee 3 Packages', 'value': 'TN'},
{'label': 'Texas 2 Packages', 'value': 'TX'},
{'label': 'Illinois 1 Packages', 'value': 'IL'}]
Upvotes: 4
Views: 68
Reputation: 28644
You do not need to load the data into pandas, compute there, and then convert the result back into a dictionary. You can do all of the computation on the dictionary directly:
from collections import defaultdict

# create a pairing of the three values in the dictionary
m = zip(*data.values())
# create a dictionary from the pairing: full state name -> list of abbreviations
d = defaultdict(list)
for k, v, _ in m:
    d[v].append(k)
print(d)
defaultdict(list,
{'Tennessee': ['TN', 'TN', 'TN'],
'Texas': ['TX', 'TX'],
'Illinois': ['IL']})
# now create the output in the form you desire
outcome = [{"label": f"{key} {len(value)} Packages",
            "value": value[0]}
           for key, value in d.items()]
outcome
[{'label': 'Tennessee 3 Packages', 'value': 'TN'},
{'label': 'Texas 2 Packages', 'value': 'TX'},
{'label': 'Illinois 1 Packages', 'value': 'IL'}]
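Note that `zip(*data.values())` relies on the dictionary holding exactly these three keys in this order. As a minimal sketch of the same idea that looks the columns up by name instead, assuming the `data` dictionary from the question (the function name `build_state_options` is hypothetical, not from the original post):

```python
from collections import Counter

data = {'orig_state': ['TN', 'TN', 'TN', 'TX', 'TX', 'IL'],
        'orig_state_fn': ['Tennessee', 'Tennessee', 'Tennessee', 'Texas', 'Texas', 'Illinois'],
        'dest_state': ['CA', 'TN', 'TN', 'TX', 'IL', 'CA']}

def build_state_options(data):
    # Count rows per origin state; Counter keeps first-seen order on Python 3.7+
    counts = Counter(data['orig_state'])
    # Map each abbreviation to its full state name
    names = dict(zip(data['orig_state'], data['orig_state_fn']))
    return [{'label': f"{names[state]} {n} Packages", 'value': state}
            for state, n in counts.items()]

state_options = build_state_options(data)
```

This keys the counts on the abbreviation rather than on the full name, so the `value` field comes straight from the key.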
Upvotes: 1
Reputation: 323226
We can use groupby:
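# sort=False keeps the states in order of first appearance, matching the desired output
df1 = (df.groupby('orig_state', sort=False)
         .apply(lambda x: x['orig_state_fn'].unique()[0] + ' ' + str(len(x)) + ' Packages')
         .reset_index())
df1.columns = ['value', 'label']
# 'records' is the full orient name; the short form 'r' is no longer accepted by recent pandas
l = df1.to_dict('records')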
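If you would rather avoid `apply` with a lambda, a possible variant uses named aggregation. This is only a sketch, assuming the same `data` as in the question; the column names `full_name` and `n` are made up for illustration:

```python
import pandas as pd

data = {'orig_state': ['TN', 'TN', 'TN', 'TX', 'TX', 'IL'],
        'orig_state_fn': ['Tennessee', 'Tennessee', 'Tennessee', 'Texas', 'Texas', 'Illinois'],
        'dest_state': ['CA', 'TN', 'TN', 'TX', 'IL', 'CA']}
df = pd.DataFrame(data)

# One row per origin state: first full name seen and the number of non-null destinations
counts = (df.groupby('orig_state', sort=False)
            .agg(full_name=('orig_state_fn', 'first'), n=('dest_state', 'count'))
            .reset_index())

# Build the labels column-wise, then pair each label with its abbreviation
labels = counts['full_name'] + ' ' + counts['n'].astype(str) + ' Packages'
state_options = [{'label': label, 'value': value}
                 for label, value in zip(labels, counts['orig_state'])]
```

Here `count` mirrors the `dest_state.count()` call in the question, so rows with a missing destination would not be counted.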
Upvotes: 2