Bentech

Reputation: 498

Why does PySpark execute only the default statement in my custom `SQLTransformer`?

I wrote a custom SQLTransformer in PySpark. Setting a default SQL statement is mandatory for the code to execute at all. I can save the custom transformer from Python, then load and run it from Scala and/or Python, but only the default statement is executed, even though the `_transform` method does something else. I get the same result in both languages, so the problem is not related to the `_to_java` method or the `JavaTransformer` class.

from pyspark.ml.feature import SQLTransformer

class filter(SQLTransformer):  # note: this name shadows the built-in filter()
    def __init__(self):
        super(filter, self).__init__()
        # a default statement is mandatory, otherwise nothing runs at all
        self._setDefault(statement="select text, label from __THIS__")

    def _transform(self, df):
        # this override is what appears to be ignored after save/load
        return df.filter(df.id > 23)
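The symptom is consistent with how ML persistence works: saving writes out only the transformer's params and its JVM class name, so the Python subclass, and with it the `_transform` override, is never reconstructed on load. A minimal plain-Python analogy of that mechanism (no Spark required; all names below are illustrative, not Spark internals):

```python
# Sketch: why a Python _transform override disappears after save/load.
# Spark's writer stores only the params (here, `statement`) plus the JVM
# class name; the reader rebuilds the *base* class from that name, so the
# Python subclass and its _transform are gone.

class SQLTransformer:
    def __init__(self, statement=None):
        self.statement = statement

    def _transform(self, rows):
        # stand-in for executing `statement` against the data
        return list(rows)

class Filter(SQLTransformer):
    def __init__(self):
        super().__init__(statement="select text, label from __THIS__")

    def _transform(self, rows):
        # custom logic that only exists on the Python side
        return [r for r in rows if r["id"] > 23]

def save(transformer):
    # persists the class name and params -- nothing about the subclass
    return {"class": "SQLTransformer",
            "paramMap": {"statement": transformer.statement}}

def load(metadata):
    # the reader instantiates the class named in the metadata
    cls = {"SQLTransformer": SQLTransformer}[metadata["class"]]
    return cls(**metadata["paramMap"])

rows = [{"id": 10}, {"id": 42}]
before = Filter()._transform(rows)             # custom filter applied
after = load(save(Filter()))._transform(rows)  # base behavior only
```

Here `before` keeps only the row with `id > 23`, while `after` keeps both rows: the loaded object is a plain `SQLTransformer`, which is exactly the "only the default statement runs" behavior described above.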

Upvotes: 0

Views: 119

Answers (1)

user10648740

Reputation:

Such an information flow is not supported. To create a Transformer that can be used from both the Python and the Scala code base you have to:

  • Implement a Java or Scala Transformer, in your case extending org.apache.spark.ml.feature.SQLTransformer.
  • Add a Python wrapper extending pyspark.ml.wrapper.JavaTransformer, the same way pyspark.ml.feature.SQLTransformer does, and interface the JVM counterpart from it.

Upvotes: 1

Related Questions