Reputation: 13118
I got this error but I don't know what causes it. My Python code runs in PySpark. The stack trace is long, so I am only showing part of it. None of the frames in the stack trace point to my own code, so I don't know where to look. What could be causing this error?
/usr/hdp/2.4.2.0-258/spark/python/lib/py4j-0.9-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
306 raise Py4JJavaError(
307 "An error occurred while calling {0}{1}{2}.\n".
--> 308 format(target_id, ".", name), value)
309 else:
310 raise Py4JError(
Py4JJavaError: An error occurred while calling o107.parquet.
...
File "/usr/hdp/2.4.2.0-258/spark/python/lib/pyspark.zip/pyspark/sql/types.py", line 435, in toInternal
return self.dataType.toInternal(obj)
File "/usr/hdp/2.4.2.0-258/spark/python/lib/pyspark.zip/pyspark/sql/types.py", line 172, in toInternal
return d.toordinal() - self.EPOCH_ORDINAL
AttributeError: 'unicode' object has no attribute 'toordinal'
Thanks,
Upvotes: 4
Views: 2668
Reputation: 1121226
The specific exception is caused by trying to store a unicode value in a date datatype that is part of a struct. The conversion of the Python type to Spark's internal representation expected to be able to call the date.toordinal() method.
Presumably you have a dataframe schema somewhere that consists of a struct type with a date field, and something tried to stuff a string into that.
You can trace this based on the traceback you do have. The Apache Spark source code is hosted on GitHub, and your traceback points to the pyspark/sql/types.py file. The lines point to the StructField.toInternal() method, which delegates to the self.dataType.toInternal() method:
    class StructField(DataType):
        # ...
        def toInternal(self, obj):
            return self.dataType.toInternal(obj)
which in your traceback ends up at the DateType.toInternal() method:
    class DateType(AtomicType):
        # ...
        def toInternal(self, d):
            if d is not None:
                return d.toordinal() - self.EPOCH_ORDINAL
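To see why a string fails at that line, the same arithmetic can be reproduced without Spark. This is only a sketch: to_internal() is a hypothetical stand-in for the method above, and EPOCH_ORDINAL is assumed to be the ordinal of 1970-01-01, which is what the subtraction implies.

```python
import datetime

# Assumption: the epoch ordinal is the ordinal of 1970-01-01.
EPOCH_ORDINAL = datetime.date(1970, 1, 1).toordinal()


def to_internal(d):
    """Hypothetical stand-in for DateType.toInternal(): days since the epoch."""
    if d is not None:
        return d.toordinal() - EPOCH_ORDINAL


# A real date object converts to a plain integer day count.
days = to_internal(datetime.date(2016, 7, 1))

# A string has no toordinal() method, so the same call raises AttributeError.
try:
    to_internal(u"2016-07-01")
    error = None
except AttributeError as exc:
    error = str(exc)
```

On Python 2 the message names 'unicode', as in your traceback; on Python 3 it names 'str' instead, but the cause is the same.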
So we know this is about a date field in a struct. The DateType.fromInternal() method shows you what Python type is produced in the opposite direction:
        def fromInternal(self, v):
            if v is not None:
                return datetime.date.fromordinal(v + self.EPOCH_ORDINAL)
It is safe to assume that toInternal() expects the same type, a datetime.date object, when converting in the other direction.
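If the date values in your data do arrive as strings, one possible fix is to parse them into datetime.date objects before the rows reach Spark. The following is a minimal sketch, not a definitive fix: as_date() is a hypothetical helper, the "%Y-%m-%d" format is an assumption about what your strings look like, and the sample rows are made up for illustration.

```python
import datetime


def as_date(value, fmt="%Y-%m-%d"):
    """Hypothetical helper: return a datetime.date for a date-typed field.

    None and existing date objects pass through unchanged; anything else
    is treated as a string in the assumed format and parsed.
    """
    if value is None or isinstance(value, datetime.date):
        return value
    return datetime.datetime.strptime(value, fmt).date()


# Made-up sample rows: (name, joined), where joined is meant to be a date.
rows = [(u"alice", u"2016-07-01"), (u"bob", None)]

# Convert the date column before the rows are turned into a DataFrame.
cleaned = [(name, as_date(joined)) for name, joined in rows]
```

The cleaned rows then carry date objects (or None) in the date column, which is what toInternal() can handle. A string in an unexpected format raises ValueError here, in your own code, rather than deep inside Spark.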
Upvotes: 5