Bebio

Reputation: 409

How to save pandas dataframe output from function to workspace?

I have the following simple function that imports a CSV file, and I would like to store the resulting dataframe in the Python workspace. The dataframe it creates should then be used as input for other functions in my code.

I tried declaring the dataframe as global, but it's still not "visible".

def tabpol(annee):

    # Import the policies table
    import pandas as pd
    global table_polices
    
    pd.options.mode.chained_assignment = None
    libtab="C:/Users/a61787/Documents/NBV/NBV_2020/Modele/Table_"+str(annee)+"/"
        
    table_polices=pd.read_csv(libtab+'pol.csv')
    return table_polices

Any help would be appreciated. Thanks.

Edit: You're right, I need to remove the return statement. With that change, I get what I want when I call the function directly.

However, this function is actually stored in a module named "NBV", and when I call the function from the module, the dataframe is not visible:

from NBV import tabpol
tabpol(2019)
print(table_polices)

NameError: name 'table_polices' is not defined

Upvotes: 1

Views: 864

Answers (1)

Giorgos Myrianthous

Reputation: 39860

In Python, a global variable is global only to the module in which it is defined; it is not shared across all the modules of your application.

To be more precise, the function tabpol() is defined in a module called NBV, and calling it creates the global variable table_polices in that module's namespace. The variable is visible by its bare name across the whole NBV module (even outside of tabpol()), but not in any other module. Since from NBV import tabpol only brings the name tabpol into your script, you receive NameError: name 'table_polices' is not defined.
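The scoping rule above can be reproduced without any files. The following is a minimal sketch, not the asker's actual setup: nbv_demo is a hypothetical in-memory stand-in for the NBV module, and the function stores a plain string instead of a dataframe so that the example needs only the standard library.

```python
import types

# Hypothetical stand-in for NBV.py, built in memory so the demo is self-contained.
source = '''
def tabpol(annee):
    global table_polices          # "global" here means global to this module
    table_polices = f"table for {annee}"
'''
nbv_demo = types.ModuleType("nbv_demo")
exec(source, nbv_demo.__dict__)   # run the source with the module's namespace as its globals

tabpol = nbv_demo.tabpol          # same effect as: from nbv_demo import tabpol
tabpol(2019)

in_module = hasattr(nbv_demo, "table_polices")   # the name was created inside the module
in_caller = "table_polices" in globals()         # but not in the calling namespace

print(in_module, in_caller)
print(nbv_demo.table_polices)     # reachable through the module object
```

After the call, the variable exists as an attribute of the module object, which is why it can be read as nbv_demo.table_polices but not as a bare table_polices in the calling script.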


There are various workarounds but it would be risky to suggest one as this hugely depends on the design of your application. Just to give an example though, here's one alternative approach that can do the trick:

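# NBV.py
import pandas as pd


def tabpol(annee):
    # Disable pandas' chained-assignment warning (kept from the original function)
    pd.options.mode.chained_assignment = None
    libtab = "C:/Users/a61787/Documents/NBV/NBV_2020/Modele/Table_" + str(annee) + "/"
    df = pd.read_csv(libtab + 'pol.csv')

    return df


# Module-level variable; this line must come after tabpol() is defined
table_polices_2019 = tabpol(2019)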

After import NBV, you can then reference the dataframe as NBV.table_polices_2019.

But again, this doesn't sound quite right to me. I doubt that a global variable will improve the current design. I understand that you don't want to build the dataframe from the same CSV file over and over again, but turning it into a global variable doesn't sound like a good solution to me (unless I am missing something here).
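One possible way to avoid both the global variable and the repeated reads is to return the dataframe, assign it in the calling code, and cache the loader. This is only a sketch under assumptions: load_table, its folder parameter, and the police/prime columns are hypothetical names introduced for illustration, and a temporary CSV stands in for the real pol.csv so that the example runs anywhere.

```python
import functools
import tempfile
from pathlib import Path

import pandas as pd


@functools.lru_cache(maxsize=None)
def load_table(folder, annee):
    """Read Table_<annee>/pol.csv under folder; the result is cached per (folder, annee)."""
    return pd.read_csv(Path(folder) / f"Table_{annee}" / "pol.csv")


# Throwaway data standing in for the real file (hypothetical columns).
with tempfile.TemporaryDirectory() as tmp:
    table_dir = Path(tmp) / "Table_2019"
    table_dir.mkdir()
    (table_dir / "pol.csv").write_text("police,prime\n1,100\n2,250\n")

    table_polices = load_table(tmp, 2019)   # first call reads the file
    again = load_table(tmp, 2019)           # second call is served from the cache

print(table_polices.shape)
print(again is table_polices)
```

The caller owns the name table_polices, so it is visible wherever the result is assigned. Note that the cache hands back the same dataframe object each time, so in-place modifications are seen by every caller; take a .copy() if that matters.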

Upvotes: 1
