pyspark apache-spark-sql azure-databricks spark-notebook

Data_Insight

Reputation: 585

How do I show my existing column names instead of '_c0', '_c1', '_c2', '_c3', '_c4' in the first row?

My DataFrame shows _c0, _c1, and so on instead of my original column names.
I want it to use the column names that are in the first row of my CSV.

    dff = spark.read.csv("abfss://dir@acname.dfs.core.windows.net/diabetes.csv")
    dff:pyspark.sql.dataframe.DataFrame
    _c0:string
    _c1:string
    _c2:string
    _c3:string
    _c4:string
    _c5:string
    _c6:string
    _c7:string
    _c8:string

Upvotes: 6

Answers (3)

Aman Sehgal

Reputation: 556

Set the header option to true when loading the CSV file.

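    # Parentheses let Python accept the chained calls across several lines
    df = (
        spark.read.format("csv")
        .option("delimiter", ",")
        .option("header", "true")       # use the first row as column names
        .option("inferSchema", "true")  # infer column types instead of reading everything as string
        .load("file.csv")
    )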

Upvotes: 2

Data_Insight

Reputation: 585

I just sorted it out with the code below:

    from pyspark.sql.functions import col

    # Rename the auto-generated columns one by one
    dff = dff.select(col("_c0").alias("A"),
                     col("_c1").alias("B"),
                     col("_c2").alias("C"),
                     col("_c3").alias("D"),
                     col("_c4").alias("E")
                     )

Upvotes: -1

Kishan Vyas

Reputation: 136

A very simple solution is to pass header=True when you read the file:

    dff = spark.read.csv("abfss://dir@acname.dfs.core.windows.net/diabetes.csv", header=True)

Upvotes: 10
