Reputation: 4899
I am extending the polars DataFrame and LazyFrame as described in the docs. Let's go with their split example for pl.DataFrame. Let's say I also wanted to extend pl.LazyFrame with the same split function.
The code would look pretty much the same, with the exception of the decorator (@pl.api.register_dataframe_namespace("split") vs. @pl.api.register_lazyframe_namespace("split")), the input argument (df vs. ldf) and the return type (list[pl.DataFrame] vs. list[pl.LazyFrame]).
This looks like a clear violation of the DRY principle.
What is the best practice for extending the API on multiple fronts (DataFrame, LazyFrame, Series)?
To put it differently, how can I apply an extension to both a pl.DataFrame and a pl.LazyFrame? And can this extension share the same namespace?
Upvotes: 2
Views: 251
Reputation: 18691
The @decorator syntax is just a convenient way to write fun = decorate(fun), so you can call the registration functions directly and do something like this:
import polars as pl

class SplitFrame:
    def __init__(self, df: pl.DataFrame | pl.LazyFrame):
        # Work lazily internally, but remember what kind of frame came in.
        if isinstance(df, pl.DataFrame):
            self._df = df.lazy()
            self._was_df = True
        else:
            self._df = df
            self._was_df = False

    def by_alternate_rows(self) -> list[pl.DataFrame] | list[pl.LazyFrame]:
        df = self._df.with_row_index(name="n")
        pre_return = [
            df.filter((pl.col("n") % 2) == 0).drop("n"),
            df.filter((pl.col("n") % 2) != 0).drop("n"),
        ]
        # Return eager frames if the input was eager, lazy frames otherwise.
        if self._was_df:
            return pl.collect_all(pre_return)
        return pre_return

# Register the same class under the same namespace for both frame types.
pl.api.register_dataframe_namespace("split")(SplitFrame)
pl.api.register_lazyframe_namespace("split")(SplitFrame)
Note that each of those register functions actually returns a decorator rather than being a decorator itself. With the normal @ syntax you don't notice this, but called directly it needs the double parentheses, which look odd.
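To make the double parentheses less mysterious, here is a minimal sketch of a decorator factory in plain Python. The names REGISTRY, register_namespace, First and Second are made up for illustration and are not part of polars; the sketch only mirrors the general pattern, not how polars implements it internally.

```python
# Hypothetical registry standing in for a library's namespace table.
REGISTRY: dict[str, type] = {}


def register_namespace(name: str):
    """Decorator factory: calling it returns the actual decorator."""

    def decorator(cls: type) -> type:
        REGISTRY[name] = cls  # record the class under the given name
        return cls            # hand the class back unchanged

    return decorator


# Usual decorator syntax: the factory call happens on the @ line.
@register_namespace("with_at_syntax")
class First:
    pass


class Second:
    pass


# Equivalent explicit form: first call builds the decorator,
# second call applies it to the class.
register_namespace("with_explicit_call")(Second)
```

Both forms end up doing the same thing, which is why one class can be passed to two different register calls.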
Now you can do
df=pl.DataFrame({'a':[1,2,3,4]})
df.split.by_alternate_rows()
[shape: (2, 1)
┌─────┐
│ a   │
│ --- │
│ i64 │
╞═════╡
│ 1   │
│ 3   │
└─────┘,
shape: (2, 1)
┌─────┐
│ a   │
│ --- │
│ i64 │
╞═════╡
│ 2   │
│ 4   │
└─────┘]
or
df=pl.LazyFrame({'a':[1,2,3,4]})
df.split.by_alternate_rows()
[<LazyFrame at 0x7F697DFD67B0>, <LazyFrame at 0x7F697DFD61B0>]
Upvotes: 2