user1403789

Reputation: 87

Fetch first row as Headers from one csv and values from other csv

I have two CSV files. The first has only one row, which contains the headers. The second contains the values. I want to create a DataFrame whose headers come from the first row of the first CSV and whose values come from all rows of the second CSV. Both files have the same number of fields, from _c0 to _c1000 (about 1000 columns). Column types can differ between the two files, but the column names and the number of columns are the same. Below is an example snippet. I am using Databricks (PySpark). Any help is appreciated.

[image: example snippet]

Upvotes: 0

Views: 156

Answers (1)

ARCrow

Reputation: 1857

You can read the first file to get its schema, then apply that schema when reading the second file:

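# Header-only file: column names come from row 1; with no data rows, every column is typed as string.
df1 = spark.read.option('header', True).csv('<path to the file with header>')
# Data file: reuse those names (and string types) instead of the default _c0, _c1, ...
df2 = spark.read.schema(df1.schema).csv('<path to the file without header>')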
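
If the column types matter, keep in mind that the header-only file gives Spark no data to infer types from, so the schema taken from it is all strings. One possible alternative, sketched here under assumptions rather than tested on your data, is to let Spark infer types from the data file and take only the column names from the header file via toDF. The function name read_with_header_names and its parameters are hypothetical; spark is assumed to be the session that Databricks already provides.

```python
def read_with_header_names(spark, header_path, data_path):
    """Return the data file as a DataFrame renamed with the header file's columns.

    Hypothetical helper: `spark` is an existing SparkSession (predefined in
    Databricks notebooks); both paths are placeholders supplied by the caller.
    """
    # Read the header-only file just to obtain the column names.
    header_df = spark.read.option("header", True).csv(header_path)

    # Read the data file without a header and let Spark infer column types
    # (this costs an extra pass over the data).
    data_df = (
        spark.read
        .option("header", False)
        .option("inferSchema", True)
        .csv(data_path)
    )

    # toDF renames positionally, so the column counts must match.
    if len(header_df.columns) != len(data_df.columns):
        raise ValueError(
            f"column count mismatch: {len(header_df.columns)} header names "
            f"vs {len(data_df.columns)} data columns"
        )

    return data_df.toDF(*header_df.columns)
```

Usage would look like df = read_with_header_names(spark, '<path to the file with header>', '<path to the file without header>'). Because the rename is positional, this assumes the columns appear in the same order in both files.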

Upvotes: 0
