Reputation: 600

How to use Polars with Plotly without converting to Pandas?

I would like to replace Pandas with Polars but I was not able to find out how to use Polars with Plotly without converting to Pandas. I wonder if there is a way to completely cut Pandas out of the process.

Consider the following test data:

import polars as pl
import numpy as np
import plotly.express as px

df = pl.DataFrame(
    {
        "nrs": [1, 2, 3, None, 5],
        "names": ["foo", "ham", "spam", "egg", None],
        "random": np.random.rand(5),
        "groups": ["A", "A", "B", "C", "B"],
    }
)

fig = px.bar(df, x='names', y='random')
fig.show()

I would like this code to show the bar chart in a Jupyter notebook, but instead it fails with the following message:

/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/polars/internals/frame.py:1483: UserWarning: accessing series as Attribute of a DataFrame is deprecated
  warnings.warn("accessing series as Attribute of a DataFrame is deprecated")

Converting the Polars data frame to a Pandas data frame with df = df.to_pandas() makes it work. However, is there a simpler and more elegant solution?

Upvotes: 12

Answers (3)

alexander-beedie

Reputation: 881

FYI: plotly-express has just merged generic DataFrame support (via narwhals), meaning that Polars will be natively supported, so no more transforms to Pandas under the hood (and, as you might suspect, this comes with a nice plotting performance boost when using a Polars frame).

Upvotes: 3

franzito bambino

Reputation: 21

I'm currently making the switch from Pandas to Polars. From my research, indexing with [] will work but is considered an anti-pattern in Polars. The author of the article linked below suggests using the .to_series method instead:

px.pie(
    df,  # Polars DataFrame
    names=df.select('Model').to_series(),
    values=df.select('Sales').to_series(),
    hover_name=df.select('Model').to_series(),
    color_discrete_sequence=px.colors.sequential.Plasma_r,
)

https://towardsdatascience.com/visualizing-polars-dataframes-using-plotly-express-8da4357d2ee0

When it comes to visualizing Polars DataFrames, it seems you can't be totally rid of the conversion to a Pandas DataFrame.

Hope this helped

Upvotes: 2

Wayne

Reputation: 9810

Yes, there is no need to convert to a Pandas dataframe. Someone (sa-) has requested better support for this here and included a workaround:

"The workaround that I use right now is px.line(x=df["a"], y=df["b"]), but it gets unwieldy if the name of the data frame is too big"

For the OP's code example, specifying the dataframe columns explicitly works.
In addition to px.bar(x=df["names"], y=df["random"]) or px.bar(df, x=df["names"], y=df["random"]), casting the columns to lists also works:

import polars as pl
import numpy as np
import plotly.express as px

df = pl.DataFrame(
    {
        "nrs": [1, 2, 3, None, 5],
        "names": ["foo", "ham", "spam", "egg", None],
        "random": np.random.rand(5),
        "groups": ["A", "A", "B", "C", "B"],
    }
)

px.bar(df, x=list(df["names"]), y=list(df["random"]))

If you know Polars better, you may spot other options once you see the idea behind the workaround.

The example posted there is simpler: instead of px.line(df, x="a", y="b"), as you could use for a Pandas dataframe, you use px.line(x=df["a"], y=df["b"]). With Polars, that is:

import polars as pl
import plotly.express as px

df = pl.DataFrame({"a":[1,2,3,4,5], "b":[1,4,9,16,25]})

px.line(x=df["a"], y=df["b"])

(Note that using plotly.express requires Pandas to be installed, see here and here. I used plotly.express in my answer because it was closer to the OP. The code could be adapted to using plotly.graph_objects if there was a desire to not have Pandas installed & involved at all.)

Upvotes: 12
