What is wrong with this lambda function? Pandas and Python dataframe

Question

I wrote a lambda function that I expected to be fast, but it is taking a very long time to run. Is there a better way to write this?

fn = lambda x: df[df.CustomerCard_Num == x.CustomerCard_Num].shape[0]
df['tottrans'] = df.apply(fn, axis=1)

Basically, I have a big database of transactions (rows). Different sets of rows correspond to different customers: the customer card number is a column in df, and multiple rows might have the same df.CustomerCard_Num.

I am trying to count the number of rows for each customer with this lambda function, but it runs very slowly. Should I be using groupby?

EdChum · Accepted Answer

There is a built-in way:

df.CustomerCard_Num.value_counts()

See the docs for Series.value_counts.
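
For context, the lambda is slow because it filters the entire frame once per row, so the work grows roughly with the square of the row count. If the goal is the per-row tottrans column from the question rather than one count per customer, a minimal sketch might look like the following. The sample data and the Amount column are made up for illustration; only CustomerCard_Num and tottrans come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column name CustomerCard_Num is from the question.
df = pd.DataFrame({
    "CustomerCard_Num": [101, 102, 101, 103, 101, 102],
    "Amount": [5.0, 7.5, 3.2, 9.9, 1.0, 4.4],
})

# One count per distinct card number, as in the answer above.
counts = df["CustomerCard_Num"].value_counts()

# Broadcast each customer's row count back onto every row of that customer.
df["tottrans"] = df.groupby("CustomerCard_Num")["CustomerCard_Num"].transform("size")

# Alternative with the same result: look up each row's card number in counts.
tottrans_via_map = df["CustomerCard_Num"].map(counts)
```

Both variants count each group once instead of rescanning the frame for every row, which answers the groupby part of the question as well.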
