pettinato

Reputation: 1542

How do I add custom transformations to a PMML using R?

I am trying to add a function of two variables into a PMML in R.

The model I am trying to execute is

y = a + b*exp(Sepal.Width - Sepal.Length)^2

And I want the input to the PMML to be Sepal.Width and Sepal.Length.

I have the following code, which creates the derived field derived_Sepal.Length, but I can't figure out how to use a custom transformation function such as exp(Sepal.Width - Sepal.Length)^2.

library(pmml)
library(XML)
library(pmmlTransformations)
irisBox <- WrapData(iris)
irisBox <- ZScoreXform(irisBox,"Sepal.Length")

model <- lm(Petal.Width ~ derived_Sepal.Length - Sepal.Width, data=irisBox$data)
pmmlModel <- pmml(model,transforms=irisBox)

pmmlModelEnhanced <- addLT(pmmlModel,namespace="4_2")
saveXML(pmmlModelEnhanced, file=outputPMMLFilename)

Any general advice or tips on doing data transformations in PMML using R would also be appreciated.

Thanks!

Upvotes: 2

Views: 1275

Answers (1)

user1808924

Reputation: 4926

Currently, there are no ready-to-use tools for transforming arbitrary R expressions to PMML. You will have to compose the PMML snippet manually using the generic R XML API, and attach it to the PMML document before it is written to a file.

Let's assume that you want to use a derived field my_field:

# Compute my_field as a column of the data frame so that the formula can find it
iris$my_field = (iris$Sepal.Length - iris$Sepal.Width)^2
# Use my_field in your formula (the response must be numeric; lm() cannot fit the factor Species)
model = lm(Petal.Width ~ my_field, data = iris)
# Convert the lm() object to an in-memory XML DOM object
lm.pmml = pmml(model)
# NOTE: remove_datafield(), add_datafield() and add_derivedfield() are placeholders
# for helper functions that you write yourself with the XML package; they are not part of pmml
# Fix the contents of the PMML/DataDictionary:
# 1) Remove the 'my_field' field definition
# 2) Add 'Sepal.Length' and 'Sepal.Width' field definitions - you will be referencing them in your custom expression, so they need to be available
lm.pmml = remove_datafield(lm.pmml, "my_field")
lm.pmml = add_datafield(lm.pmml, "Sepal.Width", "double", "continuous")
lm.pmml = add_datafield(lm.pmml, "Sepal.Length", "double", "continuous")
# Fix the contents of the PMML/TransformationDictionary:
# 1) Add 'my_field' field definition
lm.pmml = add_derivedfield(lm.pmml, ..)
# The PMML manipulation is done now, save it to a local filesystem file
saveXML(lm.pmml, outputPMMLFilename)
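
To make the "compose the PMML snippet manually" step more concrete, here is a minimal sketch of how the DerivedField element for my_field could be built with the XML package alone. The helper names apply_node and field_ref are made up for this illustration, and the sketch assumes the PMML built-in functions "-" and "pow"; it only constructs the node and does not attach it to a document.

```r
library(XML)

# Hypothetical helper: an <Apply function="..."> element wrapping the given child nodes
apply_node <- function(fun, ...) {
  xmlNode("Apply", attrs = c("function" = fun), .children = list(...))
}

# Hypothetical helper: a <FieldRef field="..."/> element pointing at an input field
field_ref <- function(name) {
  xmlNode("FieldRef", attrs = c(field = name))
}

# Sepal.Length - Sepal.Width
difference <- apply_node("-", field_ref("Sepal.Length"), field_ref("Sepal.Width"))

# The constant exponent 2
two <- xmlNode("Constant", "2", attrs = c(dataType = "double"))

# my_field = (Sepal.Length - Sepal.Width)^2
derived <- xmlNode(
  "DerivedField",
  apply_node("pow", difference, two),
  attrs = c(name = "my_field", dataType = "double", optype = "continuous")
)

# Inspect the resulting XML fragment
print(derived)
```

For the expression in the question, exp(Sepal.Width - Sepal.Length)^2, the same pattern should work by swapping the two field references and wrapping the difference in apply_node("exp", ...) before passing it to "pow". The node still has to be added to the TransformationDictionary element of the document returned by pmml(); if that object is an R-level XMLNode, append.xmlNode from the XML package is one way to do it, but check the class of the object first.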

Going forward, you might want to keep an eye on the JPMML-Converter project, because the automated R to PMML translation is a planned feature there.

Upvotes: 1
