Bamir
Reputation: 71

Adding column to empty DataFrames via a loop

I have the following code:

for key in temp_dict:
    temp_dict[key][0][0] = temp_dict[key][0][0].insert(0, "Date", None)

where temp_dict is:

    {'0.5SingFuel': [[Empty DataFrame
Columns: [Month, Trades, -0.25, -0.2, -0.15, -0.1, -0.05, 0.0, 0.05, 0.1, 0.15, 0.2, 0.25, Total]
Index: []]], 'Sing180': [[Empty DataFrame
Columns: [Month, Trades, -0.25, -0.2, -0.15, -0.1, -0.05, 0.0, 0.05, 0.1, 0.15, 0.2, 0.25, Total]
Index: []]], 'Sing380': [[Empty DataFrame
Columns: [Month, Trades, -0.25, -0.2, -0.15, -0.1, -0.05, 0.0, 0.05, 0.1, 0.15, 0.2, 0.25, Total]
Index: []]]}

What I would like to have is:

{'0.5SingFuel': [[Empty DataFrame
Columns: [Date, Month, Trades, -0.25, -0.2, -0.15, -0.1, -0.05, 0.0, 0.05, 0.1, 0.15, 0.2, 0.25, Total]
Index: []]], 'Sing180': [[Empty DataFrame
Columns: [Date, Month, Trades, -0.25, -0.2, -0.15, -0.1, -0.05, 0.0, 0.05, 0.1, 0.15, 0.2, 0.25, Total]
Index: []]], 'Sing380': [[Empty DataFrame
Columns: [Date, Month, Trades, -0.25, -0.2, -0.15, -0.1, -0.05, 0.0, 0.05, 0.1, 0.15, 0.2, 0.25, Total]
Index: []]]}

My code produces the following error:

ValueError: cannot insert Date, already exists

I would have thought that I was looping from one dict key to the next, but when I stepped through the code in the debugger, that does not seem to be what happens.

This probably makes no sense, which is why I need some help. I am confused.

I think I am mis-assigning the variables, but I am not completely sure how.

Upvotes: 1

Views: 216

Answers (1)

Ben.T
Reputation: 29635

One problem is that insert works in place and returns None, so you should not reassign its result: doing so replaces the DataFrame stored in the dict with None. The second problem is that insert raises the error you saw when the column already exists, so you need to check whether it is already in the columns, and perhaps reorder so that this column comes first.

import pandas as pd

# dummy dictionary, same structure
d = {0:[[pd.DataFrame(columns=['a','b'])]], 
     1:[[pd.DataFrame(columns=['a','c'])]]}

# name of the column to insert
col='c'

for key in d.keys():
    df_ = d[key][0][0] # easier to define a variable
    if col not in df_.columns:
        df_.insert(0,col,None)
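    else: # column already exists: move it to the front and reassign (remove the else if you don't need this)
        # list comprehension keeps the original column order; Index.difference would sort the names
        d[key][0][0] = df_[[col] + [c for c in df_.columns if c != col]]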
print(d)
# {0: [[Empty DataFrame
# Columns: [c, a, b]                 # c added as column
# Index: []]], 1: [[Empty DataFrame
# Columns: [c, a]                    # c in first position now
# Index: []]]}
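
To see both points in isolation, here is a minimal sketch on a single empty DataFrame. The frame `df` and the names `result` and `error_message` are made up for illustration; only the column names are borrowed from the question.

```python
import pandas as pd

# Hypothetical stand-in for one of the empty frames in the question
df = pd.DataFrame(columns=["Month", "Trades"])

# insert modifies df in place and returns None,
# so assigning the result back would overwrite the frame with None
result = df.insert(0, "Date", None)
print(result)            # None
print(list(df.columns))  # ['Date', 'Month', 'Trades']

# Inserting the same column name a second time raises ValueError
try:
    df.insert(0, "Date", None)
except ValueError as exc:
    error_message = str(exc)
    print(error_message)  # cannot insert Date, already exists
```

If you see that error inside the loop, it suggests that the frame reached through the current key already has the column. Checking `col not in df_.columns` before calling insert, as in the loop above, avoids it.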

Upvotes: 1
