rafvasq

Reputation: 1522

NotImplementedError: Invalid returnType with grouped map Pandas UDFs

My output_schema for a Pandas UDF contains the following fields:

Out[183]: [StructField(id,StringType,true),
 StructField(2018-01-01,StructType(List(StructField(real,FloatType,true),StructField(imag,FloatType,true))),true),
 StructField(2018-01-02,StructType(List(StructField(real,FloatType,true),StructField(imag,FloatType,true))),true),
 StructField(2018-01-03,StructType(List(StructField(real,FloatType,true),StructField(imag,FloatType,true))),true),
 StructField(2018-01-04,StructType(List(StructField(real,FloatType,true),StructField(imag,FloatType,true))),true),
 StructField(2018-01-05,StructType(List(StructField(real,FloatType,true),StructField(imag,FloatType,true))),true),
 StructField(2018-01-06,StructType(List(StructField(real,FloatType,true),StructField(imag,FloatType,true))),true),
 StructField(2018-01-07,StructType(List(StructField(real,FloatType,true),StructField(imag,FloatType,true))),true),
 StructField(2018-01-08,StructType(List(StructField(real,FloatType,true),StructField(imag,FloatType,true))),true),
 ...

and is of type: Out[185]: pyspark.sql.types.StructType

What I'm trying to output is an id column, plus a set of columns that each hold a tuple of two floats. The code that builds the schema is below; it defines a StructType() for every column other than id.

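import json
from pyspark.sql.types import StructType, StructField, FloatType

fields = []
for f in json.loads(skeleton_schema.json())["fields"]:
    if f["name"] != "id":
        # Every non-id column becomes a struct of two floats (real, imag).
        fields.append(StructField(f["name"], StructType([
            StructField("real", FloatType(), True),
            StructField("imag", FloatType(), True),
        ]), True))
    else:
        # Keep the id column exactly as defined in the skeleton schema.
        fields.append(StructField.fromJson(f))
output_schema = StructType(fields)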

However, when running my UDF I receive a NotImplementedError and the output prints my entire schema and says it's not supported. What exactly isn't supported and what am I doing wrong?

Upvotes: 6

Views: 2462

Answers (1)

rafvasq

Reputation: 1522

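After more debugging, I found that nested StructTypes are not supported in the return type of a grouped map Pandas UDF, which is why a schema with a (real, imag) struct in every column raises the NotImplementedError. The supported types are found here.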
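One possible workaround, sketched here as an assumption and not taken from the original post, is to flatten each struct into two top-level float columns so the return schema has no nesting. The helper name `flat_schema_ddl` and the `<date>_real` / `<date>_imag` naming are hypothetical. The result is a DDL-formatted schema string, which `pandas_udf` should accept as its `returnType` in place of a `StructType`; the backtick quoting is there because the date column names contain hyphens.

```python
def flat_schema_ddl(column_names, id_column="id", parts=("real", "imag")):
    """Build a flat DDL schema string: one float column per (column, part) pair."""
    fields = []
    for name in column_names:
        if name == id_column:
            # The id column stays a single string column.
            fields.append(f"`{name}` string")
        else:
            # Replace the nested struct with one top-level float column per part.
            fields.extend(f"`{name}_{part}` float" for part in parts)
    return ", ".join(fields)


# Hypothetical column names, mirroring the first few fields of the schema above.
ddl = flat_schema_ddl(["id", "2018-01-01", "2018-01-02"])
print(ddl)
```

With this layout the UDF would have to return a pandas DataFrame whose columns match the flattened names, and the pairs can be recombined into structs afterwards with regular Spark column functions if the nested shape is still needed downstream.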

Upvotes: 10
