mattie_g

Reputation: 95

What could be driving a SyntaxError when trying to combine two columns to create dataframe names?

I want to create multiple dataframes whose names are built from two columns of my source dataframe, one dataframe for each col1-col2 combination.

For instance, if period and dps are columns in the source dataframe, I want to create a dataframe for each period-dps combination, like so:

import pandas as pd

period = ['a','b','c']
dps = ['x','y','z']

for d in dps:
    for p in period:
        exec('{}{} = pd.DataFrame()'.format(p,d))

This code works fine with the test values above, but when I use my actual data I get "SyntaxError: invalid syntax".

My question is: what could be driving this error? Is there an issue in my original data that I should review and clean first?

Thank you

Upvotes: 0

Views: 36

Answers (2)

DBA108642

Reputation: 2112

You can use the following code:

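from itertools import product

df_dict = {}
for p, d in product(period, dps):  # every period-dps combination
    name = p + d
    df_dict[name] = pd.DataFrame()

This returns a dictionary of dataframes, each keyed by p+d, and avoids writing a nested for loop. Note that product yields every period-dps combination, whereas zip would only pair the lists element by element (ax, by, cz).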

Upvotes: 0

chepner

Reputation: 532003

Don't use exec. Create a dict to store your dataframes.

period = ['a','b','c']
dps = ['x','y','z']

frames = {}
for d in dps:
    for p in period:
        frames[f'{p}{d}'] = pd.DataFrame()
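
As for why the exec version might fail on real data: the actual values aren't shown, so this is only a guess, but a likely cause is that some combined names are not valid Python identifiers (they contain spaces or hyphens, start with a digit, or happen to be keywords). Dict keys have no such restriction. A minimal sketch for finding the offending names, where invalid_names is a hypothetical helper and the sample values are made up:

```python
import keyword

def invalid_names(periods, dps):
    """Return the combined names that could not be used as variable names."""
    bad = []
    for d in dps:
        for p in periods:
            name = f"{p}{d}"
            # A usable variable name must be an identifier and not a keyword.
            if not name.isidentifier() or keyword.iskeyword(name):
                bad.append(name)
    return bad

# Made-up values resembling messy real-world data.
print(invalid_names(["2020", "Q1", "a b"], ["x", "y-z"]))
```

Any name this reports would make exec('{}{} = pd.DataFrame()') raise a SyntaxError, while the same string works fine as a dict key.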

You might also consider nested dicts.

from collections import defaultdict

frames = defaultdict(dict)
for d in dps:
    for p in period:
        frames[p][d] = pd.DataFrame()
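
If the goal is to split the source dataframe itself by those two columns, groupby can build the dict directly. This is a rough sketch under assumptions: the source dataframe below and its value column are invented for illustration, and only the period and dps column names come from the question.

```python
import pandas as pd

# Hypothetical source data; only the column names "period" and "dps"
# are taken from the question.
source = pd.DataFrame({
    "period": ["a", "a", "b"],
    "dps": ["x", "y", "x"],
    "value": [1, 2, 3],
})

# Grouping by two columns yields (period, dps) tuples as keys,
# one sub-dataframe per combination that actually occurs in the data.
frames = {key: group for key, group in source.groupby(["period", "dps"])}

print(frames[("a", "x")])
```

Tuple keys sidestep name building entirely, so values with spaces or leading digits need no cleaning.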

Upvotes: 1
