Volodymyr Prokopyuk

Reputation: 353

Abstract out Polars expressions with user-defined chainable functions on the `DataFrame`

Motivation

I want to abstract Polars expressions into functions that are parametrized (via custom function parameters) and chainable (preferably via `DataFrame.prototype`), so that user-defined, higher-level, reusable data analysis functions are available directly on the `DataFrame`.

Desired behavior and failed intent

import pl from "nodejs-polars"
const { DataFrame, col } = pl

// user-defined, higher-level, reusable and chainable data analysis function
// with arbitrary complex parametrized Polars expressions
DataFrame.prototype.inc = function inc(column, x = 1, alias = `${column}Inc`) {
  return this.withColumn(col(column).add(x).alias(alias))
}
const df = new DataFrame({ a: [1, 2, 3] })
// works, but implies code duplication on reuse
console.log(df.withColumn(col("a").add(1).alias("aInc")))
// desired behavior, but throws TypeError: df.inc is not a function
console.log(df.inc("a").inc("a", 2, "aInc2"))

What is the recommended way to define custom functions that encapsulate Polars expressions in nodejs-polars?

Upvotes: 2

Views: 246

Answers (2)

Cory Grinstead

Reputation: 661

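A functional approach that needs no additional libraries is to write a small wrapper and re-export polars from your own package with an overridden `DataFrame` factory that attaches your custom methods. Note that the methods are attached only to frames created through this factory, so a frame returned by a built-in method will not carry them.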

// polars_extension.js
import pl from 'nodejs-polars'

const customMethods = {
  sumAlias() {
    return this.sum();
  },
};

export default {
  ...pl,
  DataFrame(...args) {
    return Object.assign(pl.DataFrame(...args), customMethods);
  }
}

// another_file.js
import pl from './polars_extension'

pl.DataFrame({num: [1, 2, 3]}).sumAlias()

Upvotes: 1

Volodymyr Prokopyuk

Reputation: 353

Prototype-based solution

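import pl from "nodejs-polars"
const { DataFrame, col } = pl

// thin wrapper that owns the underlying Polars DataFrame
function DF(df) { this.df = df }
DF.prototype.inc = function inc(column, x = 1, alias = `${column}Inc`) {
  // replace the wrapped frame and return the wrapper to allow chaining
  this.df = this.df.withColumn(col(column).add(x).alias(alias))
  return this
}
const df = new DF(new DataFrame({ a: [1, 2, 3] }))
// read .df to get the resulting Polars DataFrame
console.log(df.inc("a").inc("a", 2, "aInc2").df)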
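
One plausible reason the original `DataFrame.prototype.inc` attempt fails, and why a separate wrapper is used here: when a constructor function returns its own object, `new` yields that object and the constructor's prototype is never linked to it. Whether nodejs-polars builds its frames this way is an assumption, not something confirmed in this thread. The sketch below shows the plain JavaScript behavior with a hypothetical `FakeFrame` stand-in (its `withColumn(name, values)` signature is invented for illustration and is not the Polars API).

```javascript
// Hypothetical stand-in for a factory-style constructor; NOT the real nodejs-polars code.
function FakeFrame(data) {
  // Returning an object from a constructor makes `new` yield that object,
  // so FakeFrame.prototype is never linked to the result.
  return {
    data,
    withColumn(name, values) {
      return new FakeFrame({ ...data, [name]: values })
    },
  }
}

// Patching the prototype has no effect on the objects the constructor returns.
FakeFrame.prototype.inc = function inc() { return this }
const frame = new FakeFrame({ a: [1, 2, 3] })
const patched = typeof frame.inc // "undefined"

// A wrapper with its own prototype chains as expected.
function Wrapped(inner) { this.frame = inner }
Wrapped.prototype.inc = function inc(column, x = 1, alias = `${column}Inc`) {
  const values = this.frame.data[column].map((v) => v + x)
  this.frame = this.frame.withColumn(alias, values)
  return this
}
const result = new Wrapped(frame).inc("a").inc("a", 2, "aInc2").frame.data
console.log(patched, result)
```

The trade-off of the wrapper is that built-in `DataFrame` methods are reachable only through the wrapped frame, not on the wrapper itself.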

Functional programming solution (preferred)

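import pl from "nodejs-polars"
import { curry, pipe } from "rambda"
const { DataFrame, col } = pl

// data-last signature: the DataFrame comes last so curry can fix the other arguments
function inc(column, x, alias, df) {
  return df.withColumn(col(column).add(x).alias(alias))
}
const makeInc = curry(inc)
const df = new DataFrame({ a: [1, 2, 3] })
// each makeInc(...) call returns a function that is still waiting for the DataFrame
console.log(pipe(makeInc("a", 1, "aInc"), makeInc("a", 2, "aInc2"))(df))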

Output

shape: (3, 3)
┌─────┬──────┬───────┐
│ a   ┆ aInc ┆ aInc2 │
│ --- ┆ ---  ┆ ---   │
│ f64 ┆ f64  ┆ f64   │
╞═════╪══════╪═══════╡
│ 1.0 ┆ 2.0  ┆ 3.0   │
├╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 2.0 ┆ 3.0  ┆ 4.0   │
├╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ 3.0 ┆ 4.0  ┆ 5.0   │
└─────┴──────┴───────┘
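
If pulling in rambda is not desired, the same data-last pattern can be sketched without dependencies. This is only an illustration: `pipe` here is a minimal hand-written composition, `dataLast` is a hypothetical helper (simpler than a real `curry`, since it expects all leading arguments in one call), and `inc` works on a plain object of arrays standing in for a Polars DataFrame so that the snippet runs on its own.

```javascript
// Left-to-right composition: pipe(f, g)(x) evaluates g(f(x)).
const pipe = (...fns) => (input) => fns.reduce((acc, fn) => fn(acc), input)

// Hypothetical helper: fixes every argument except the last one (the data).
const dataLast = (fn) => (...args) => (data) => fn(...args, data)

// Stand-in step with the same argument order as inc(column, x, alias, df),
// operating on a plain object of arrays instead of a Polars DataFrame.
function inc(column, x, alias, table) {
  return { ...table, [alias]: table[column].map((v) => v + x) }
}

const makeInc = dataLast(inc)
const table = pipe(makeInc("a", 1, "aInc"), makeInc("a", 2, "aInc2"))({ a: [1, 2, 3] })
console.log(table)
```

With the real library, the body of `inc` would presumably be the `df.withColumn(col(column).add(x).alias(alias))` expression from the answer above, while `pipe` and `dataLast` stay unchanged.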

Upvotes: 1
