Chia Yi

Reputation: 562

pandas apply TypeError: 'float' object is not subscriptable

I have a dataframe df_tr like this:

   item_id  target  target_sum  target_count
0        0       0           1            50
1        0       0           1            50

I'm trying to find the mean of the target, excluding the target value of the current row, and put that mean in a new column. The result would be:

   item_id  target  target_sum  target_count  item_id_mean_target
0        0       0           1            50              0.02041
1        0       0           1            50              0.02041

where I computed item_id_mean_target value from the formula:

(target_sum - target) / (target_count - 1)

...with this code:

df_tr['item_id_mean_target'] = df_tr.target.apply(lambda x: (x['target_sum']-x)/(x['target_count']-1))     

I think my solution is correct but instead I got:

TypeError: 'float' object is not subscriptable                

Upvotes: 5

Views: 15639

Answers (3)

Alexander

Reputation: 109546

Ignoring the sum and count columns and using groupby to derive them:

df_tr.groupby('item_id').apply(lambda x: (x['target'].sum() - x['target'])     
                                         / (x['target'].count() - 1))

You may also notice the issue in your original statement, where you had x['target_sum']-x. With a row-wise apply (df_tr.apply(..., axis=1)), where x is a whole row, it should have been x['target_sum']-x['target'].
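A minimal sketch of the same groupby idea using transform, which returns values aligned to the original rows so they can be assigned straight to a column. The small frame below is made-up data for illustration, not the asker's, and it assumes every item_id has at least two rows (a single-row group would divide 0 by 0 and give NaN).

```python
import pandas as pd

# Hypothetical sample data: two items with a few targets each.
df = pd.DataFrame({
    "item_id": [0, 0, 0, 1, 1],
    "target": [1, 0, 0, 1, 1],
})

grouped = df.groupby("item_id")["target"]

# transform broadcasts each group's aggregate back onto its rows.
group_sum = grouped.transform("sum")
group_count = grouped.transform("count")

# Leave-one-out mean: drop the current row's target from the sum and the count.
df["item_id_mean_target"] = (group_sum - df["target"]) / (group_count - 1)

print(df)
```

The new column lines up with df's index, so no merge or reindexing is needed afterwards.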

Upvotes: 1

cs95

Reputation: 402503

No need for apply here: pandas (like NumPy, which it builds on) applies arithmetic element-wise across whole columns.

df['item_id_mean_target'] = (df.target_sum - df.target) / (df.target_count - 1)

df

   item_id  target  target_sum  target_count  item_id_mean_target
0        0       0           1            50             0.020408
1        0       0           1            50             0.020408

As for why your error occurs: you are calling apply on a pd.Series object (df_tr.target), so the lambda receives one scalar value at a time and cannot reference any other columns. Indexing that float with x['target_sum'] is what raises the TypeError.
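To see this for yourself, here is a small sketch that rebuilds the two-row frame from the question (assuming target is a float column, which the error message suggests) and checks what Series.apply actually passes to the function.

```python
import pandas as pd

# Frame rebuilt from the question; float dtype for target is an assumption.
df_tr = pd.DataFrame({
    "item_id": [0, 0],
    "target": [0.0, 0.0],
    "target_sum": [1, 1],
    "target_count": [50, 50],
})

# Series.apply hands the function one element at a time, never a row.
seen_types = df_tr["target"].apply(lambda x: type(x).__name__).tolist()
print(seen_types)

# Reproduce the failure: a float cannot be indexed with a column name.
error_message = None
try:
    df_tr["target"].apply(
        lambda x: (x["target_sum"] - x) / (x["target_count"] - 1)
    )
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```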

To fix it, you'd need to call df.apply(..., axis=1) so the function receives whole rows, but at that point you're penalised with low performance, so I wouldn't recommend doing it.
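For comparison, a rough sketch of both routes on the question's data (rebuilt here by hand). It only shows that the row-wise apply and the column arithmetic agree; it does not measure speed.

```python
import pandas as pd

# Frame rebuilt from the question.
df_tr = pd.DataFrame({
    "item_id": [0, 0],
    "target": [0, 0],
    "target_sum": [1, 1],
    "target_count": [50, 50],
})

# Row-wise: with axis=1 the lambda receives each row as a Series,
# so the other columns can be looked up by name.
row_wise = df_tr.apply(
    lambda row: (row["target_sum"] - row["target"]) / (row["target_count"] - 1),
    axis=1,
)

# Vectorized: the same formula applied to whole columns at once.
vectorized = (df_tr["target_sum"] - df_tr["target"]) / (df_tr["target_count"] - 1)

print(row_wise.tolist())
print(vectorized.tolist())
```

Both give (1 - 0) / (50 - 1) for each row; the vectorized form avoids calling a Python function once per row.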

Upvotes: 4

Advay Umare

Reputation: 432

Try this (with axis=1, x is a whole row, so subtract x['target'] rather than x itself):

df_tr.apply(lambda x: (x['target_sum'] - x['target']) / (x['target_count'] - 1), axis=1)

Upvotes: 0
