Reputation: 205
I need to create a dataframe based on a set of column names and data types. The data types are given as Python type names (str, int, float, etc.), but I need to convert them to the StringType, IntegerType, etc. needed for StructType/StructField.
I could write a simple mapping to do the job, but I'd like to know if there is any automatic conversion for these types?
Upvotes: 0
Views: 3445
Reputation: 159
You can do that by using the following function:
>>> from pyspark.sql.types import _infer_type
>>> _infer_type([1.0, 2.0])
ArrayType(DoubleType,true)
If you have the type directly in the input, you can also do this:
>>> my_type = type(42)
>>> _infer_type(my_type())
LongType
Finally, if you only have a string describing the Python type, you can use this:
>>> from pydoc import locate
>>> _infer_type(locate('int'))
LongType
Upvotes: 1
Reputation: 86
I know it's been a while, but you can try the following:
from pyspark.sql.types import _parse_datatype_string
then you can use it as follows:
_parse_datatype_string('int') # Will convert it to IntegerType in PySpark
NOTE: The type has to be passed as a string.
Reference: https://spark.apache.org/docs/2.4.0/api/python/_modules/pyspark/sql/types.html
Upvotes: 4