Samir Alhejaj

Reputation: 157

Concatenating DataFrames through function calls

I am trying to append a single-row DataFrame (df) to the end of another DataFrame (df_all) using the following code:

import pandas as pd
import numpy as np
from IPython.display import display, HTML

global df_all    
df_all = pd.DataFrame()

def save_data(df):
     df_all = pd.concat([df, df_all], axis=0)
     display(df_all)
     return df_all

def opt():
    df = pd.DataFrame(np.random.randn(1, 4), columns=list('ABCD'))  # one row data
    display(df)
    save_data(df)

With the loop below, I expect 3 rows to be saved to df_all. Instead, I get an error message (local variable 'df_all' referenced before assignment):

for i in range(3):
    opt()
    display(df_all)

Upvotes: 0

Views: 1397

Answers (3)

aathiraks

Reputation: 109

This approach avoids the global variable altogether: pass df_all as an argument to opt() and return the updated DataFrame.

def save_data(df, df_all):
    df_all = pd.concat([df_all, df], axis=0)  # existing rows first, so df lands at the end
    return df_all

def opt(df_all):
    df = pd.DataFrame(np.random.randn(1, 4), columns=list('ABCD'))  # one row data
    df_all = save_data(df, df_all)
    return df_all

df_all = pd.DataFrame()
for i in range(3):
    df_all = opt(df_all)
    display(df_all)

Upvotes: 0

cs95

Reputation: 402483

I don't believe in functions that rely on global variables; it just isn't good hygiene.

Functions should be pure. First, define your opt function. This just generates df and nothing more.

def opt():
    df = ...  # df is generated here
    return df

Next, define save_data. Well, I've renamed it to augment to be more in line with what you're doing. This concatenates two DataFrames together.

def augment(df, df_new):
    return pd.concat([df, df_new], axis=0)  # axis=0 stacks rows; axis=1 would add columns instead

Finally, your main loop. All state is maintained here, not in the functions:

df_all = pd.DataFrame()
for i in range(3):
    df_all = augment(df_all, opt())
    display(df_all)
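
For reference, here is a runnable sketch of the same pattern. It assumes opt() builds the one-row random frame from the question, and it adds ignore_index=True (not in the snippets above) so the rows are renumbered 0..n-1. The last two lines show a variation rather than the approach above: collecting the rows in a list and calling pd.concat once, which avoids copying df_all on every iteration.

```python
import numpy as np
import pandas as pd


def opt():
    # Assumption: the same one-row random frame as in the question.
    return pd.DataFrame(np.random.randn(1, 4), columns=list("ABCD"))


def augment(df, df_new):
    # Stack df_new below df; ignore_index renumbers the rows 0..n-1.
    return pd.concat([df, df_new], axis=0, ignore_index=True)


# Incremental version: all state lives in the loop, not in the functions.
df_all = pd.DataFrame()
for _ in range(3):
    df_all = augment(df_all, opt())

# Variation: gather the rows first, then concatenate a single time.
rows = [opt() for _ in range(3)]
df_once = pd.concat(rows, axis=0, ignore_index=True)
```

Both df_all and df_once end up with three rows and the columns A to D.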

Upvotes: 2

Lambda

Reputation: 1392

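Move the global df_all statement into the body of save_data(df), as shown below. A global statement at module level has no effect; it must appear inside the function that assigns to the name.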

def save_data(df):
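    global df_all  # rebind the module-level df_all instead of creating a local
    df_all = pd.concat([df_all, df], axis=0)  # existing rows first, so df lands at the end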
    display(df_all)
    return df_all
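
To see why the declaration has to live inside the function, here is a minimal sketch. The names save_without_global and save_with_global are made up for illustration, and a fixed row stands in for the random one. Because df_all is assigned in the function body, Python treats it as a local name for the whole function, so reading it on the right-hand side fails unless the name is declared global.

```python
import pandas as pd

df_all = pd.DataFrame()


def save_without_global(df):
    # The assignment below makes df_all local to this function, so the
    # read inside pd.concat happens before the local has any value.
    df_all = pd.concat([df_all, df], axis=0)
    return df_all


def save_with_global(df):
    global df_all  # refer to the module-level name instead
    df_all = pd.concat([df_all, df], axis=0, ignore_index=True)
    return df_all


row = pd.DataFrame([[1.0, 2.0, 3.0, 4.0]], columns=list("ABCD"))

try:
    save_without_global(row)
    raised = False
except UnboundLocalError:
    # This is the error reported in the question.
    raised = True

for _ in range(3):
    save_with_global(row)
```

After the loop, the module-level df_all holds three rows, while the first function never gets past its first line.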

Upvotes: 0
