SecY

Reputation: 357

How to rename a column of a DataFrame in PySpark?

Below is part of the code:

df = None

F_DATE = ['202101', '202102', '202103']

for date in F_DATE:
    if df is None:
        df = spark.sql("select count(*) as Total_count from test_" + date)
    else:
        df2 = spark.sql("select count(*) as Total_count from test_" + date)
        df = df.union(df2)

df.write.csv('/csvs/test.csv')

I tried 'toDF()', 'withColumnRenamed()', and 'selectExpr()', but the column name was not changed.

NOTE: The tables are in Hive.

ADD: The code that writes the CSV did not call "df.show()"; I had only called "df.show()" in the code that reads the CSV back. When I added "df.show()" to the writing code, the column name was displayed correctly. In the reading code, "df.show()" did not display the column name properly.

Upvotes: 4

Views: 5204

Answers (1)

lucaspompeun

Reputation: 308

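You can use withColumnRenamed(). It returns a new DataFrame rather than modifying the existing one, so assign the result back: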

df = df.withColumnRenamed('old_name', 'new_name')
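The note in the question says the name looks right before writing and wrong after reading, so the rename itself may already be working. One possible cause, offered here as an assumption rather than a confirmed diagnosis, is that Spark's CSV writer does not emit a header row unless asked, and the CSV reader does not treat the first row as a header unless asked. A minimal sketch of both sides follows; the helper names write_csv_with_header and read_csv_with_header are hypothetical and only wrap the standard df.write.csv and spark.read.csv calls.

```python
def write_csv_with_header(df, path):
    """Write df as CSV and include the column names as the first row.

    Hypothetical helper: `df` is assumed to be a PySpark DataFrame.
    Without header=True the column names are not written to the file.
    """
    df.write.csv(path, header=True)


def read_csv_with_header(spark, path):
    """Read a CSV and use its first row as the column names.

    Hypothetical helper: `spark` is assumed to be a SparkSession.
    Without header=True the columns get default names such as _c0.
    """
    return spark.read.csv(path, header=True)
```

With the question's variables this would be write_csv_with_header(df, '/csvs/test.csv') in the writing code and df = read_csv_with_header(spark, '/csvs/test.csv') in the reading code.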

Upvotes: 4
