user1095523

Reputation: 83

Apply np.random.rand to groups - optimization issue

I need to optimize a single line of code that is executed tens of thousands of times during a calculation, so its run time has become an issue. It looks simple, but I am stuck.

The line is:

df['Random']=df['column'].groupby(level=0).transform(lambda x: np.random.rand())

I want to assign the same random number to every row of a group and then "ungroup". Since this implementation calls rand() many times, the code is very inefficient.

Can someone help in vectorizing this?

Upvotes: 1

Views: 67

Answers (1)

Venkatachalam

Reputation: 16966

Try this!

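import numpy as np
import pandas as pd

df = pd.DataFrame(np.sort(np.random.randint(2, 5, 50)), columns=['column'])
uniques = df['column'].unique()  # compute once and reuse
# One draw per unique value; the merge broadcasts it back to every matching row
final = df.merge(pd.Series(np.random.rand(len(uniques)), index=uniques).to_frame('Random'),
                 left_on='column', right_index=True)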

You can compute uniques once and then rerun only the last statement each time you need fresh random numbers joined to df.
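
Since the question groups on index level 0 rather than on a column, the same idea can be sketched without a merge: factorize the index once, then draw one number per group and expand it with integer indexing. This is only a minimal sketch. The helper names group_codes and draw_per_group and the sample data are made up for illustration, and it assumes the index level has no missing labels (pd.factorize would code those as -1).

```python
import numpy as np
import pandas as pd

def group_codes(df, level=0):
    """Map each row to an integer id for its group on the given index level."""
    codes, uniques = pd.factorize(df.index.get_level_values(level))
    return codes, len(uniques)

def draw_per_group(codes, n_groups, rng):
    """One uniform draw per group, expanded to one value per row."""
    return rng.random(n_groups)[codes]

# Hypothetical example data: the index labels define the groups.
df = pd.DataFrame({'column': [10, 11, 12, 13, 14]},
                  index=['a', 'a', 'b', 'c', 'c'])
rng = np.random.default_rng(0)

codes, n_groups = group_codes(df)                    # do this once
df['Random'] = draw_per_group(codes, n_groups, rng)  # repeat as often as needed
```

Only the last line needs to run inside the hot loop, so each iteration costs one vectorized draw and one array lookup instead of a Python-level call per group.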

Upvotes: 2
