user2597302

Reputation: 35

Streamlit segmentation faults when multiple windows are open for real-time data display web app

I am building a web app in Streamlit that:

  1. periodically loads a data file that a separate process updates
  2. displays 9 matplotlib figures using st.pyplot
  3. allows the user to select the ID of the data to display

It runs fine when one user is using the web app in one window. But if I open another tab, it segfaults shortly after.

I think the problem is especially bad when both tabs try to load the data simultaneously. I suspect this because if I open a second tab "staggered" from the first one, the app runs for a while before segfaulting, whereas if both tabs are open from the start, it segfaults instantly. Matplotlib might also be involved: I am not sure, but when I reduce the number of plots, I seem to get more successful runs before the segfault.

I have implemented the periodic loading both with streamlit_autorefresh and with an infinite while loop, and both versions have this problem.

Below are what I think are the relevant parts of the code, showing only one of the plots:

import os

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import streamlit as st
from streamlit_autorefresh import st_autorefresh

...

count = st_autorefresh(interval=streamlit_run_every_secs * 1000, key="autorefresh_counter")

...

col_1_1, _, _ = st.columns(3)

with col_1_1:
    selected_product = st.selectbox(label='product',
                                    options=tuple(products),
                                    on_change=None)
...

col_4_1, col_4_2, col_4_3 = st.columns(3)

with col_4_1:
    final_unhappiness_heatmap = st.empty()
    final_desired_move_ticks_heatmap = st.empty()

...

def load_results(p):
    print('load_results() begin')
    results_fp = os.path.join(ff_path, 'results', f'{p}.csv')
    results_piv = pd.read_csv(results_fp, header=[0,1])

    results_piv = rename_unnamed(results_piv)

    results_piv.columns = [(col_lvl_0, float(col_lvl_1)) if col_lvl_1.replace('.', '').replace('-', '').isnumeric() else (col_lvl_0, col_lvl_1) for (col_lvl_0, col_lvl_1) in results_piv.columns]
    results_piv.columns = pd.MultiIndex.from_tuples(results_piv.columns)

    return results_piv
...

figs = []

def app_iteration():
    print(f"{pd.to_datetime('now')}: Running app_iteration()")
    global figs
    
    results_piv = load_results(selected_product)

    ...

    print(f'=== closing {len(figs)} figs ===')
    for fig in figs:
        print('closing fig')
        plt.close(fig)
    figs = []

    ...

    with final_unhappiness_heatmap.container():
        fig, ax = plt.subplots()
        figs.append(fig)
        sns.heatmap(results_piv.set_index('maturity').final_unhappiness, center=0, vmin=-1, vmax=1, cmap='coolwarm_r', annot=True, fmt=".2f", cbar=False, ax=ax)
        ax.set_title(f'{selected_product} Final Unhappiness')
        st.pyplot(fig)


...

app_iteration()

Upvotes: 1

Views: 2322

Answers (1)

servizz

Reputation: 21

I had the same issue with Matplotlib and Streamlit. The solution is in the "Limitations and known issues" section of the Streamlit docs: https://docs.streamlit.io/streamlit-cloud/troubleshooting

Matplotlib doesn't work well with threads. So if you're using Matplotlib, you should wrap your code with locks as shown in the snippet below. This Matplotlib bug is more prominent when you share your app, since you're more likely to get concurrent users then.

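import matplotlib.pyplot as plt
import streamlit as st
from matplotlib.backends.backend_agg import RendererAgg

# Class-level lock on Matplotlib's Agg renderer; hold it while drawing.
_lock = RendererAgg.lock

with _lock:
    fig, ax = plt.subplots()
    ax.set_title('This is a figure')
    ax.plot([1, 20, 3, 40])
    st.pyplot(fig)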
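
If you would rather not depend on an attribute of Matplotlib's renderer class, the same idea can be sketched with a plain lock from the standard library. This is only a sketch under assumptions: the module name `plot_lock.py`, the `PLOT_LOCK` variable and the `serialized` decorator are names I made up for illustration, not part of Streamlit or Matplotlib. The lock should live in a module that the app imports, because the main Streamlit script is re-executed on every rerun and a lock created there would be a new object each time.

```python
# plot_lock.py (hypothetical helper module, imported by the Streamlit script)
import functools
import threading

# Modules are imported once per process, so every session thread that
# imports this module sees the same lock object. RLock is re-entrant, so a
# decorated function may call another decorated function without deadlock.
PLOT_LOCK = threading.RLock()


def serialized(func):
    """Return a wrapper that runs func while holding PLOT_LOCK.

    Only one thread at a time can be inside any function decorated with
    this, which keeps concurrent sessions from drawing simultaneously.
    """
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with PLOT_LOCK:
            return func(*args, **kwargs)
    return wrapper
```

In the question's code, this would mean decorating `app_iteration` with `@serialized` (or wrapping its plotting part in `with PLOT_LOCK:`), so that figure creation, `st.pyplot` and `plt.close` for one session finish before another session starts drawing. The trade-off is that sessions render one after another, which may be noticeable with nine figures per refresh.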

Upvotes: 2
