Spark: create a DataFrame with a column that mixes integer and float numbers

I want to create a Spark DataFrame with a column of numbers, some of which are integers and others floats:

tmp = spark.createDataFrame([1.0, 2.1, 3], IntegerType()).toDF('bins')

It raises this error:

TypeError: field value: IntegerType can not accept object 1.0 in type <class 'float'>

How can I create a DataFrame with 1.0, 2.1, and 3 in one column? The 3 should stay 3 and not be cast to a float such as 3.0, and 2.1 must not be cast to 2. If I use this command instead:

tmp = spark.createDataFrame([1.0, 2.1, 3], FloatType()).toDF('bins')

It raises this error:

TypeError: field value: FloatType can not accept object 3 in type <class 'int'>

How can I create this DataFrame?

Upvotes: 2

Views: 2314

Answers (1)

Steven

Reputation: 15258

One possible solution is to convert every value to float before creating the DataFrame:

from pyspark.sql import functions as F, types as T

tmp = spark.createDataFrame(map(float, [1.0, 2.1, 3]), T.FloatType()).toDF("bins")

Another option is to create the column as strings and then cast it to float:

tmp = (
    spark.createDataFrame([1.0, 2.1, 3], T.StringType())
    .toDF("bins")
    .withColumn("bins", F.col("bins").cast(T.FloatType()))
)
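
Both approaches rely on the fact that a Spark column holds a single data type, so the integer 3 ends up stored as 3.0. As a minimal sketch of the first approach, assuming `spark` is an existing SparkSession: the helper names `to_floats` and `build_bins_df` are hypothetical, and `DoubleType` is swapped in for `FloatType` here because Python floats are 64-bit.

```python
def to_floats(values):
    """Convert a mixed list of ints and floats to plain Python floats."""
    return [float(v) for v in values]


def build_bins_df(spark, values, column="bins"):
    """Build a single-column DataFrame of doubles from mixed numeric values.

    `spark` is assumed to be an existing SparkSession (not created here).
    """
    # Imported lazily so the helper above works without pyspark installed.
    from pyspark.sql import types as T

    # Every value is already a float, so DoubleType accepts all of them.
    return spark.createDataFrame(to_floats(values), T.DoubleType()).toDF(column)


floats = to_floats([1.0, 2.1, 3])
print(floats)  # [1.0, 2.1, 3.0]
```

Calling `build_bins_df(spark, [1.0, 2.1, 3])` should then give a `bins` column of type double. If the values must keep their original text (3 rather than 3.0), the column would have to stay a string column.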

Upvotes: 3
