Reputation: 157
I am trying to append a single-row DataFrame (df) to the end of another DataFrame (df_all) using the following code:
import pandas as pd
import numpy as np
from IPython.display import display, HTML

global df_all
df_all = pd.DataFrame()

def save_data(df):
    df_all = pd.concat([df, df_all], axis=0)
    display(df_all)
    return df_all

def opt():
    df = pd.DataFrame(np.random.randn(1, 4), columns=list('ABCD'))  # one row data
    display(df)
    save_data(df)
With the loop below I expect three rows to be saved to df_all, but instead I get an error message (local variable 'df_all' referenced before assignment):
for i in range(3):
    opt()
display(df_all)
Upvotes: 0
Views: 1397
Reputation: 109
This approach avoids the global variable: pass df_all as an argument to opt() and assign the returned DataFrame back to it.
def save_data(df, df_all):
    df_all = pd.concat([df, df_all], axis=0)
    return df_all

def opt(df_all):
    df = pd.DataFrame(np.random.randn(1, 4), columns=list('ABCD'))  # one row data
    df_all = save_data(df, df_all)
    return df_all

df_all = pd.DataFrame()
for i in range(3):
    df_all = opt(df_all)
display(df_all)
Upvotes: 0
Reputation: 402483
I don't believe in functions that rely on global variables; it just isn't good hygiene.
Functions should be pure. First, define your opt function. This just generates df and nothing more.
def opt():
    df = ...  # df is generated here
    return df
Next, define save_data. Well, I've renamed it to augment to be more in line with what you're doing. This concatenates two DataFrames together.
def augment(df, df_new):
    # axis=0 stacks df_new below df as new rows (axis=1 would add columns instead)
    return pd.concat([df, df_new], axis=0)
Finally, your main loop. All state is maintained here, not in the functions:
df_all = pd.DataFrame()
for i in range(3):
    df_all = augment(df_all, opt())
display(df_all)
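A variation on the same idea, offered as a sketch rather than as part of the original answer: collect the one-row frames in a plain list and concatenate once at the end. Here opt is assumed to build the random one-row frame from the question, rows is just an illustrative name, and ignore_index=True is an addition so the final index runs 0, 1, 2.

```python
import numpy as np
import pandas as pd

def opt():
    # Assumed body: one row of random data, as in the question.
    return pd.DataFrame(np.random.randn(1, 4), columns=list("ABCD"))

# State still lives in the main loop: gather the pieces first...
rows = []
for _ in range(3):
    rows.append(opt())

# ...then stack them as rows in a single concat call.
df_all = pd.concat(rows, axis=0, ignore_index=True)
print(df_all.shape)
```

This avoids starting from an empty DataFrame and does one concatenation instead of one per iteration.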
Upvotes: 2
Reputation: 1392
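Move global df_all into the save_data(df) block, like below. A global statement at module level has no effect; it has to appear inside the function that assigns to df_all, otherwise the assignment makes df_all a local variable.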
def save_data(df):
    global df_all
    df_all = pd.concat([df, df_all], axis=0)
    display(df_all)
    return df_all
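To see why the placement matters, here is a minimal sketch that uses a plain integer in place of the DataFrame. The names counter, bump_broken, and bump_fixed are made up for illustration and are not from the question.

```python
counter = 0  # module-level state, standing in for df_all

def bump_broken():
    # Assigning to `counter` makes it local to this function, so reading it
    # on the right-hand side fails before any local value has been bound.
    counter = counter + 1
    return counter

def bump_fixed():
    global counter  # declared inside the function that does the assignment
    counter = counter + 1
    return counter

try:
    bump_broken()
    error_name = None
except UnboundLocalError as exc:
    error_name = type(exc).__name__

first = bump_fixed()
second = bump_fixed()
print(error_name, first, second)
```

bump_broken fails the same way as the original save_data, while bump_fixed updates the module-level value on each call.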
Upvotes: 0