physiker

Reputation: 919

Create dynamic dataframe name by splitting a larger dataframe

I have a large CSV and would like to split it into, e.g., 4 parts with names generated in the loop, e.g. sub0, sub1, sub2, sub3. I can do the split itself as follows:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 100, size=(20, 3)), columns=list('ABC'))

for i,chunk in enumerate(np.array_split(df, 4)):
    print(chunk.head(2)) #just to check
    print(chunk.tail(1)) #just to check

    sub+str(i)=chunk.copy() # this gives error

But when I try to assign the names in the last line, I get the expected error: SyntaxError: can't assign to operator.

Q: How can I get sub0, ..., sub3 by copying each chunk in the loop? Thank you!

Upvotes: 1

Views: 2604

Answers (2)

Chris Adams

Reputation: 18647

The best way is to create a dict with the dynamic names as keys:

chunks = {f'sub{i}': chunk for i, chunk in enumerate(np.array_split(df, 10))}
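As a rough illustration of how such a dict might be used, here is a minimal sketch. It assumes the 20-row example frame from the question and a split into 4 parts. It also splits the row positions and slices with iloc rather than passing the frame to np.array_split directly; that is a substitution made for this sketch, not part of the answer above.

```python
import numpy as np
import pandas as pd

# Example frame from the question: 20 rows, 3 columns.
df = pd.DataFrame(np.random.randint(0, 100, size=(20, 3)), columns=list("ABC"))

# Split the row positions into 4 groups, then slice the frame with iloc.
positions = np.array_split(np.arange(len(df)), 4)
chunks = {f"sub{i}": df.iloc[idx].copy() for i, idx in enumerate(positions)}

print(list(chunks))           # ['sub0', 'sub1', 'sub2', 'sub3']
print(chunks["sub0"].shape)   # (5, 3)
```

Each part is then looked up by name, e.g. chunks["sub2"], which gives the same convenience as separate variables without touching the namespace.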

If you absolutely insist on creating the frames as individual variables, then you could assign them to the globals() dictionary, but this method is NOT advised:

for i, chunk in enumerate(np.array_split(df, 10)):
    globals()['sub{}'.format(i)] = chunk

Upvotes: 1

Albert Alonso

Reputation: 656

Why would you want to create variables in a loop?

  • They are unnecessary: You can store everything in lists or any other type of collection
  • They are hard to create and reuse: You have to use exec or globals()

Using a list is much easier:

subs = []
for chunk in np.array_split(df, 10):
    print(chunk.head(2))  # just to check
    print(chunk.tail(1))  # just to check
    subs.append(chunk.copy())
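A short sketch of how the resulting list could be consumed, assuming the question's 20-row frame and 4 parts. Slicing by row position with iloc is a choice made for this sketch; the names first, last and rebuilt are only illustrative.

```python
import numpy as np
import pandas as pd

# Example frame from the question: 20 rows, 3 columns.
df = pd.DataFrame(np.random.randint(0, 100, size=(20, 3)), columns=list("ABC"))

# Build the list of parts by slicing on groups of row positions.
subs = [df.iloc[idx].copy() for idx in np.array_split(np.arange(len(df)), 4)]

first = subs[0]    # plays the role of "sub0"
last = subs[-1]    # plays the role of "sub3"

# Concatenating the parts in order gives back the original frame.
rebuilt = pd.concat(subs)
print(len(subs), rebuilt.equals(df))
```

The position in the list replaces the numeric suffix, so subs[i] stands in for the variable sub<i> the question asks for.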

Upvotes: 1
