Taipy state: Why isn't my pandas dataframe updating and accessible outside of my function?

Question

To preface, I'm by no means a developer, but I'm able to tinker with Python successfully on small projects like this. I'm trying to build a web app that lets me upload a CSV file, which the app then processes and displays as a dashboard. The columns of the CSV are pretty much fixed (this is for internal use only).

I've created a Taipy file selector to upload a file from my computer, and its on_action callback calls my function, which reads the uploaded file and puts it in a dataframe. Pretty standard stuff.

My problem is that once the file has been read by the function, I can't access the dataframe in the main code body to display it. I added a print(df.head()) call to see if the df would at least show up in the console, but I get an AttributeError because df is still None, the way it was initialised. When I run print(df.head()) from within the function, the df can be read and is written to the console as expected.

Here's my code. Can anyone advise what went wrong? Thank you! Apologies if this is all too rudimentary (noob):

from taipy import Gui
import taipy.gui.builder as tgb
import pandas as pd
from wordcloud_taipy import wcg_page


#Initialise variables

uploaded_file = None
df = None
total_coverage = None
total_events = None
total_ave = None
total_prv = None
total_daily_coverage = None
coverage_dist = None
coverage_event_releases = None
#dataframe = None


#Declare functions

def file_upload(state):
    state.df = process_my_file(state.uploaded_file)
    

def process_my_file(uploaded_file):
    df = pd.read_csv(uploaded_file, thousands=',')

    # Check if columns exist before converting
    df['Number'] = pd.to_numeric(df['Number'], errors='coerce').fillna(0).astype('int64')  # Convert to numeric and fill NaNs
    df['Tier'] = pd.to_numeric(df['Tier'], errors='coerce').fillna(0).astype('int64')  # Convert to numeric and fill NaNs
    df['ASR(MYR)'] = pd.to_numeric(df['ASR(MYR)'], errors='coerce')  # Convert to numeric
    df['PRV (MYR)'] = pd.to_numeric(df['PRV (MYR)'], errors='coerce')  # Convert to numeric
    df['Date'] = pd.to_datetime(df['Date'], format='%d-%m-%y', errors='coerce')  # Convert to datetime, handle errors

    return df  
       
    
with tgb.Page() as dash_page:
    tgb.text("# Media Coverage Dashboard", mode="md")
    tgb.file_selector("{uploaded_file}", label="Upload your CSV file", extensions=".csv", on_action=file_upload)
        
    print(df.head()) # Test if the df is visible
    
    total_coverage = df['Media Outlet'].count()
    total_events = len(df[df['Event'] != 'General']['Event'].unique())
    total_ave = df['ASR(MYR)'].sum()
    total_prv = df['PRV (MYR)'].sum()
    total_daily_coverage = df.groupby('Date')['Media Outlet'].count().reset_index()
    coverage_dist = df.groupby(df['Event'].apply(lambda x: 'General' if x == 'General' else 'Others'))['Media Outlet'].count().reset_index()
    coverage_event_releases = df[df['Event'] != 'General'].groupby(['Event', 'Tier'])['Media Outlet'].count().reset_index()
    

    tgb.table("{df}", page_size="5", rebuild=True) #, filter=True)
    
    # Key metric KPI scorecards
    with tgb.layout(columns="1 1 1 1", gap="15px"):
        with tgb.part("card"):
            tgb.text("## **Total Coverage**", mode="md")
            tgb.text("{total_coverage}", class_name="h2")
        with tgb.part("card"):
            tgb.text("## **Total Events**", mode="md")
            tgb.text("{total_events}", class_name="h2")
        with tgb.part("card"):
            tgb.text("## **Total AVE**", mode="md")
            tgb.text("MYR {total_ave}", format="%.2f", class_name="h2")
        with tgb.part("card"):
            tgb.text("## **Total PRV**", mode="md")
            tgb.text("MYR {total_prv}", format="%.2f", class_name="h2")
    
    tgb.html("br")
    # Charts
    with tgb.layout(columns="1 1 1", gap="30px"):
        with tgb.part():
            tgb.text("#### Total Daily Coverage", mode="md")
            tgb.chart("{total_daily_coverage}", mode="lines", x="Date", y="Media Outlet", rebuild=True)
        with tgb.part():
            tgb.text("#### Coverage Distribution", mode="md")
            tgb.chart("{coverage_dist}", type="pie", values="Media Outlet", names="Event Category")
        with tgb.part():
            tgb.text("#### Coverage by Events and Releases", mode="md")
            layout={ "barmode": "stack" }
            tgb.chart("{coverage_event_releases}", type="bar", xaxis_title="Event", yaxis_title="Coverage", layout="{layout}", color="Tier")

    

pages = {"/":"<|toggle|theme|>\n\n<|navbar|>\n",
        "Dashboard":dash_page,
        "WordCloud":wcg_page}

Gui(pages=pages).run(debug=True, use_reloader=True, title="Media Coverage Dashboard")

The logic for manipulating the df works: I got it working in plain Python and then in Streamlit for a web interface. However, Streamlit is best suited to prototyping, and I want to expand this so that more colleagues can use it at the same time (multiuser) and perhaps introduce SQLite to store the uploaded data. Hence I thought I should use something more capable, and I've landed on Taipy for now.

I've explored full-on Flask and Django, but building HTML, CSS and JS is not my strong suit and I didn't want to end up with an ugly interface.
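A note on the likely cause, with a minimal sketch: the statements inside the `with tgb.Page()` block are ordinary Python and run once, when the script starts and before any file has been uploaded, so `df` is still `None` at that point. The callback runs later and writes to `state.df`, which is why the print inside the function works. One way around this, assuming the rest of the page stays as in the question, is to compute the derived values inside the callback and assign them to `state`. The helper name `refresh_dashboard` below is hypothetical, and `SimpleNamespace` only stands in for Taipy's state object so the sketch can run without a GUI.

```python
from types import SimpleNamespace

import pandas as pd


def refresh_dashboard(state, df):
    """Recompute every bound dashboard value from df and store it on state.

    Hypothetical helper (not part of Taipy): it is meant to be called from
    the upload callback, i.e. after the file exists, not at page definition.
    """
    events_only = df[df["Event"] != "General"]

    state.df = df
    state.total_coverage = int(df["Media Outlet"].count())
    state.total_events = int(events_only["Event"].nunique())
    state.total_ave = float(df["ASR(MYR)"].sum())
    state.total_prv = float(df["PRV (MYR)"].sum())
    state.total_daily_coverage = (
        df.groupby("Date")["Media Outlet"].count().reset_index()
    )


# Stand-in demo: a tiny in-memory frame using the question's column names
# and a plain namespace playing the role of the Taipy state.
demo_df = pd.DataFrame(
    {
        "Media Outlet": ["A", "B", "C"],
        "Event": ["General", "Launch", "Launch"],
        "ASR(MYR)": [100.0, 200.0, 300.0],
        "PRV (MYR)": [10.0, 20.0, 30.0],
        "Date": ["2024-01-01", "2024-01-01", "2024-01-02"],
    }
)
demo_state = SimpleNamespace()
refresh_dashboard(demo_state, demo_df)
print(demo_state.total_coverage, demo_state.total_events)
```

In the question's code, `file_upload` would then call `refresh_dashboard(state, process_my_file(state.uploaded_file))`, and the `print(df.head())` line and the calculations inside the `with tgb.Page()` block would be removed, leaving only the `tgb` controls there. The variables initialised to `None` at the top may also need harmless starting values (for example an empty DataFrame for `df`) so the table and charts have something to render before the first upload; that part is an assumption and not something tested here.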
