Create dynamic dataframe name by splitting a larger dataframe

Question

I have a large CSV and would like to split it into, e.g., 4 parts with names generated in the loop, e.g. sub0, sub1, sub2, sub3. I can do the split itself as follows:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 100, size=(20, 3)), columns=list('ABC'))

for i, chunk in enumerate(np.array_split(df, 4)):
    print(chunk.head(2))  # just to check
    print(chunk.tail(1))  # just to check

    sub+str(i)=chunk.copy()  # this line raises the SyntaxError

But assigning names as in the last line gives the expected error: SyntaxError: can't assign to operator.

Q: How can I get sub0, ..., sub3 by copying each chunk in the loop? Thank you!

Chris Adams · Accepted Answer

The best way is to create a dict with the dynamic names as keys:

chunks = {f'sub{i}': chunk for i, chunk in enumerate(np.array_split(df, 4))}
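A minimal end-to-end sketch of the dict approach, using the question's 20-row sample frame. As an assumption made here (not part of the answer), the row positions are split with np.array_split and each group is selected with df.iloc, which avoids depending on how a particular NumPy/pandas version handles array_split on a DataFrame directly. The seed is illustrative only.

```python
import numpy as np
import pandas as pd

# Sample frame shaped like the one in the question: 20 rows, 3 integer columns.
rng = np.random.default_rng(0)  # seed chosen only for reproducibility
df = pd.DataFrame(rng.integers(0, 100, size=(20, 3)), columns=list("ABC"))

# Split the row positions into 4 nearly equal groups, then select each group.
positions = np.array_split(np.arange(len(df)), 4)
chunks = {f"sub{i}": df.iloc[pos].copy() for i, pos in enumerate(positions)}

print(list(chunks))            # the generated names
print(chunks["sub0"].head(2))  # access a chunk by its name
```

Each piece is then available as chunks['sub0'], chunks['sub1'], and so on, which plays the same role as separate sub0, sub1, ... variables.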

If you absolutely insist on creating the frames as individual variables, then you could assign them to the globals() dictionary, but this method is NOT advised:

for i, chunk in enumerate(np.array_split(df, 4)):
    globals()['sub{}'.format(i)] = chunk
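One practical reason to prefer the dict over globals(): the chunks stay together in a single object, so they can all be processed in a loop without hard-coding any names. A small sketch, in which the frame contents and the per-chunk summary (mean of column A) are purely illustrative assumptions:

```python
import numpy as np
import pandas as pd

# Deterministic sample frame: 20 rows, columns A, B, C.
df = pd.DataFrame(np.arange(60).reshape(20, 3), columns=list("ABC"))

# Build the named chunks by splitting row positions into 4 groups.
positions = np.array_split(np.arange(len(df)), 4)
chunks = {f"sub{i}": df.iloc[pos] for i, pos in enumerate(positions)}

# Process every chunk by name; nothing here depends on knowing the names.
summary = {name: chunk["A"].mean() for name, chunk in chunks.items()}
print(summary)
```

With variables created through globals(), the same loop would have to rebuild each name as a string and look it up again.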
