Reputation: 429
I have a PySpark dataframe with two columns whose datatypes are
[('area', 'int'), ('customer_play_id', 'int')]
+----+----------------+
|area|customer_play_id|
+----+----------------+
| 100|         8606738|
| 110|         8601843|
| 130|         8602984|
+----+----------------+
I want to cast the column area to string using PySpark commands, but I am getting an error as below.
I tried the below.
Any help will be appreciated. I want the datatype of area to be string, using a PySpark dataframe operation.
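For reference, a minimal sketch that reproduces the dataframe above (assuming an active SparkSession named spark):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame(
    [(100, 8606738), (110, 8601843), (130, 8602984)],
    schema="area int, customer_play_id int",
)
df.dtypes  # [('area', 'int'), ('customer_play_id', 'int')]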
Upvotes: 0
Views: 6885
Reputation: 25
You can use a UDF for this, or cast the column directly (for the question's case, substitute StringType):

from pyspark.sql.functions import udf
from pyspark.sql.types import FloatType

tofloatfunc = udf(lambda x: float(x) if x is not None else None, FloatType())  # UDF that converts each value to a float
changedTypedf = df.withColumn("Column_name", tofloatfunc(df["Column_name"]))
changedTypedf = df.withColumn("Column_name", df["Column_name"].cast(FloatType()))  # or cast directly, no UDF needed
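A quick sanity check of the resulting type, as a sketch (the dataframe and column names follow the snippet above):

changedTypedf.dtypes  # the Column_name entry should now be 'float'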
Upvotes: 0
Reputation: 4099
Simply you can do any of these -
Option1:
df1 = df.select('*',df.area.cast("string"))
select
- All the columns you want in df1 should be mentioned in select
Option2:
df1 = df.selectExpr("*","cast(area as string) AS new_area")
selectExpr
- All the columns you want in df1 should be mentioned in selectExpr
Option3:
df1 = df.withColumn("new_area", df.area.cast("string"))
withColumn
will add new column (additional to existing columns of df)
"*" in select
and selectExpr
represent all the columns.
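Whichever option you pick, a quick check of the resulting schema (assuming df is the question's dataframe):

df1.dtypes
# [('area', 'int'), ('customer_play_id', 'int'), ('new_area', 'string')]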
Upvotes: 1
Reputation: 1464
Use the withColumn function to change the data type or the values of a field in Spark; an example is shown below:

import pyspark.sql.functions as F

# overwrite the existing "area" column with its string-cast version
df = df.withColumn("area", F.col("area").cast("string"))
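To confirm the change (assuming the question's schema):

df.dtypes  # [('area', 'string'), ('customer_play_id', 'int')]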
Upvotes: 1