Jack Arno

Reputation: 31

Pandas vectorized operations not working on large dataset

The following code works as expected on fairly small datasets, but not on large ones. You can try it yourself:

import pandas as pd
import numpy as np

# generating dataframe of one million observations
observations = 1000000
df = pd.DataFrame(np.random.randint(0,100,size=(observations, 1)), columns=['A'])

for i in range(50):
    # both sides are the same expression, so equals() should always return True
    if not (df.A + 2).equals(df.A + 2):
        print('why?')

On my machine, the string 'why?' gets printed about 4 times over the 50 iterations. I have no idea why this happens, and I hope someone can shed light on the problem.
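To narrow this down, it may help to see where the two results actually differ instead of only printing 'why?'. The following is a minimal diagnostic sketch, not a fix: `diff_positions` is a hypothetical helper name introduced here, and it assumes both series have the same length. On a healthy installation the loop is expected to print nothing.

```python
import numpy as np
import pandas as pd


def diff_positions(left: pd.Series, right: pd.Series) -> np.ndarray:
    """Return the integer positions where two equal-length series differ."""
    return np.flatnonzero(left.to_numpy() != right.to_numpy())


# same setup as in the question: one million random integers in [0, 100)
observations = 1_000_000
df = pd.DataFrame(np.random.randint(0, 100, size=(observations, 1)), columns=["A"])

for i in range(50):
    first = df.A + 2
    second = df.A + 2
    if not first.equals(second):
        positions = diff_positions(first, second)
        # report how many values differ and show the first few positions
        print(f"iteration {i}: {positions.size} mismatches, first at {positions[:5]}")
        # equals() also compares dtypes, so print them as well
        print(first.dtype, second.dtype)
```

If mismatches do show up, the reported positions and dtypes indicate whether the values themselves differ or only their types.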

Upvotes: 1

Views: 86

Answers (1)

Jack Arno

Reputation: 31

After completely uninstalling all Python versions and packages, I reinstalled Anaconda. This solved the issue for me. I don't know the exact cause, though; I must have messed up my packages or Python versions.
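Before reinstalling, it can be useful to record which interpreter and package versions are actually in use, so the broken and the fresh environment can be compared. This is a minimal sketch under that assumption; `environment_summary` is a hypothetical helper name, and it does not identify the root cause by itself.

```python
import sys

import numpy as np
import pandas as pd


def environment_summary() -> dict:
    """Collect the interpreter path and the versions of the packages involved."""
    return {
        "python": sys.version.split()[0],
        "executable": sys.executable,
        "pandas": pd.__version__,
        "numpy": np.__version__,
    }


for name, value in environment_summary().items():
    print(f"{name}: {value}")
```

For a more detailed report, pandas also provides `pd.show_versions()`, which prints the versions of pandas and its dependencies.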

Thanks for the comments, which helped me understand what was needed!

Upvotes: 2
