Can these for loops be refactored?

Question

I am wondering if there is a cleaner, more efficient way to do this. I currently use two for loops to do the following:

import pandas as pd

data = {'orig_state': ['TN','TN','TN','TX','TX','IL'],
        'orig_state_fn': ['Tennessee','Tennessee','Tennessee','Texas','Texas','Illinois'],
        'dest_state': ['CA','TN','TN','TX','IL','CA']
       }
df = pd.DataFrame(data,columns=['orig_state','orig_state_fn','dest_state'])

state_options = []
for state in df['orig_state'].unique():
    state_options.append({'label': str(df[df['orig_state'] == state]['orig_state_fn'].unique())+" "+str(df[df['orig_state'] == state]['dest_state'].count())                      
                      +" Packages",'value':state})    
for i in range(len(state_options)):
    state_options[i]['label'] = state_options[i]['label'].replace("['", "").replace("']", "")

Output:

state_options>>

[{'label': 'Tennessee 3 Packages', 'value': 'TN'},
 {'label': 'Texas 2 Packages', 'value': 'TX'},
 {'label': 'Illinois 1 Packages', 'value': 'IL'}]

BENY · Accepted Answer

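We can use groupby to build the label for each state in a single pass, instead of filtering the frame once per state: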

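# sort=False keeps the states in order of first appearance (TN, TX, IL)
df1 = (df.groupby('orig_state', sort=False)
         .apply(lambda x: x['orig_state_fn'].unique()[0] + ' ' + str(len(x)) + ' Packages')
         .reset_index())
df1.columns = ['value', 'label']

# 'records' gives one dict per row; the old shorthand 'r' is no longer accepted by recent pandas
l = df1.to_dict('records')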
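
As an alternative sketch (not part of the answer above), named aggregation can replace the lambda, with a list comprehension building the dicts. It assumes every orig_state code maps to exactly one orig_state_fn, so taking the first name in each group is safe. The column names full_name and n_packages are made up for this example.

```python
import pandas as pd

data = {'orig_state': ['TN', 'TN', 'TN', 'TX', 'TX', 'IL'],
        'orig_state_fn': ['Tennessee', 'Tennessee', 'Tennessee', 'Texas', 'Texas', 'Illinois'],
        'dest_state': ['CA', 'TN', 'TN', 'TX', 'IL', 'CA']}
df = pd.DataFrame(data)

# One row per origin state: its full name and the count of non-null destinations.
# sort=False keeps the states in order of first appearance.
summary = (df.groupby('orig_state', sort=False)
             .agg(full_name=('orig_state_fn', 'first'),
                  n_packages=('dest_state', 'count'))
             .reset_index())

# Build the list of option dicts in one comprehension.
state_options = [
    {'label': f'{name} {n} Packages', 'value': state}
    for state, name, n in zip(summary['orig_state'],
                              summary['full_name'],
                              summary['n_packages'])
]
print(state_options)
```

Like the original loop, 'count' ignores missing dest_state values, whereas len(x) in the groupby/apply version counts every row in the group. The two only differ if dest_state contains NaN.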
