egg

Reputation: 231

Spark: how to add field names when reading a CSV that has no header

I would like to read a CSV file in Spark, so I use the following command in Java:

result = sparkSession.read().csv("hdfs://master:9000/1.csv");

It works, but the result looks like this:

_c0  _c1   _c2
1     egg    T
2     bob    F
3     tom    D

But the file (1.csv) has no header row, so the generated header in the result is useless.

I want the result to look like this:

ID  Name   Class
1     egg   T
2     bob   F
3     tom   D

How can I do this?

Thank you, everyone.

Upvotes: 3

Views: 2596

Answers (2)

Mariusz

Reputation: 13936

You can use the toDF() method to rename all columns at once: https://spark.apache.org/docs/2.0.2/api/java/org/apache/spark/sql/Dataset.html#toDF(java.lang.String...)

For example:

result = sparkSession.read().csv("hdfs://master:9000/1.csv").toDF("ID", "Name", "Class");

Upvotes: 6

Assaf Mendelson

Reputation: 13001

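You can rename the columns one by one. Note that withColumnRenamed returns a new Dataset rather than modifying the existing one, so assign the result:

result = result.withColumnRenamed("_c0", "id").withColumnRenamed("_c1", "name").withColumnRenamed("_c2", "class");

Of course, if the CSV has a header, you can simply do: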

result = sparkSession.read().option("header", "true").csv("hdfs://master:9000/1.csv");
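For a headerless file with many columns, typing out the withColumnRenamed chain gets tedious. As a rough sketch, the old-name to new-name pairs could be generated from the positional defaults (_c0, _c1, ...) shown in the question. The class DefaultColumnNames and its method renames are hypothetical helpers written for illustration, not part of Spark, and the sketch assumes the default names always follow the _c<index> pattern.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not a Spark API): pairs Spark's positional default
// column names with the names you actually want, in column order.
final class DefaultColumnNames {
    private DefaultColumnNames() {}

    // Assumption: headerless CSV columns are named _c0, _c1, _c2, ...
    static Map<String, String> renames(String... names) {
        // LinkedHashMap keeps the entries in column order.
        Map<String, String> mapping = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            mapping.put("_c" + i, names[i]);
        }
        return mapping;
    }
}
```

Each entry e of DefaultColumnNames.renames("id", "name", "class") could then be applied in a loop with result = result.withColumnRenamed(e.getKey(), e.getValue()). When every column is being renamed anyway, toDF is probably the shorter route.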

Upvotes: 1
