prasad

Reputation: 229

Memory errors with itertools and pandas?

I am trying to generate the following stepped sequence pattern, but Python throws a MemoryError:

import numpy as np
import pandas as pd
import itertools

Temp = np.linspace(-5, 5, pow(2, 16))

# Two identical columns, ColA and ColB, each holding the 2**16 values of Temp
df = pd.DataFrame([Temp] * 2, index=['ColA', 'ColB']).T

print(df)

df2 = pd.DataFrame([e for e in itertools.product(df.ColA, df.ColB)], columns=df.columns)

print(df2)

Errors

df2 = pd.DataFrame([e for e in itertools.product(df.ColA,df.ColB)],columns=df.columns)
MemoryError

Please let me know how I can fix this.

Upvotes: 0

Views: 156

Answers (1)

Stefan

Reputation: 42905

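With power=16, each column holds 2^16 = 65,536 values, and itertools.product (which yields the cartesian product) pairs every value of ColA with every value of ColB. You are therefore building a list of (2^16)^2 = 2^32 = 4,294,967,296 tuples, one per row of your DataFrame. Do you really want a sequence that long?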
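For a sense of scale, here is a rough back-of-the-envelope estimate. It assumes float64 values and counts only the raw data of the two result columns; the intermediate list of Python tuples built by the list comprehension needs more memory still. The variable names are just for illustration.

```python
import numpy as np

n = 2 ** 16                                # values per column, as in the question
rows = n * n                               # rows in the cartesian product
itemsize = np.dtype(np.float64).itemsize   # 8 bytes per float64 value

# Two float64 columns, ignoring the index and any Python object overhead.
array_bytes = rows * 2 * itemsize

print(rows)                    # 4294967296
print(array_bytes / 2 ** 30)   # 64.0 (GiB)
```

So even in the most compact form the result would not fit in the RAM of a typical machine.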

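import numpy as np
import pandas as pd
from itertools import product

power = 16
for i in range(power):
    Temp = np.linspace(-5, 5, pow(2, i))
    df = pd.DataFrame([Temp] * 2, index=['ColA','ColB']).T
    # i, rows per column, rows in the cartesian product (grows as 4**i)
    print(i, len(df), len(list(product(df.ColA, df.ColB))))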

0 1 1
1 2 4
2 4 16
3 8 64
4 16 256
5 32 1024
6 64 4096
7 128 16384
8 256 65536
9 512 262144
10 1024 1048576
11 2048 4194304
12 4096 16777216
13 8192 67108864
14 16384 268435456
...
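If you do need every pair but not all of them in memory at once, one option is to consume the product lazily in fixed-size chunks. The following is only a sketch of that idea, not a drop-in fix: the helper name product_chunks and the chunk_rows parameter are made up for this example, and whether chunking helps depends on what you do with each piece afterwards.

```python
import itertools

import numpy as np
import pandas as pd


def product_chunks(a, b, chunk_rows=100000, columns=("ColA", "ColB")):
    """Yield the cartesian product of a and b as DataFrames of at most chunk_rows rows.

    Hypothetical helper for illustration; not part of pandas or itertools.
    """
    pairs = itertools.product(a, b)  # lazy iterator over all (a, b) pairs
    while True:
        # Pull at most chunk_rows pairs out of the iterator.
        block = list(itertools.islice(pairs, chunk_rows))
        if not block:
            break
        yield pd.DataFrame(block, columns=list(columns))


# Small demo: 2**4 values per column gives 16 * 16 = 256 rows in total.
temp = np.linspace(-5, 5, 2 ** 4)
total = 0
for chunk in product_chunks(temp, temp, chunk_rows=100):
    total += len(chunk)  # replace with the real per-chunk work
print(total)  # 256
```

Only one chunk is held as a DataFrame at a time, so peak memory is governed by chunk_rows rather than by the full 2^32 rows. The total amount of work is unchanged, so at power=16 this would still take a long time to run.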

Upvotes: 2
