Fetch first row as Headers from one csv and values from other csv

Question

I have two CSV files. The first has only one row, which is the header. The second has the values. I want to create a DataFrame that takes its headers from the first row of csv1 and its values from all rows of csv2. Both files have the same number of fields, from _c0 to _c1000 (about 1000 columns). Column types can differ between the two files, but the column names and the number of columns are the same. Below is an example snippet. I am using Databricks (PySpark). Any help is appreciated.

ARCrow · Accepted Answer

You can apply the schema obtained from reading the first file when reading the second file:

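# Read the header-only file; its single row becomes the column names.
# inferSchema is not set, so every column in df1.schema is a string.
df1 = spark.read.option('header', True).csv('')
# Read the values file with that schema, so its columns get the names from df1.
df2 = spark.read.schema(df1.schema).csv('')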
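
Since the question mentions that column types can differ, here is a minimal sketch of a different approach from the one above: let Spark infer the types of the values file and only rename its columns positionally with `toDF`. The function name `read_values_with_headers` and its parameters are hypothetical, and the sketch assumes the values file has no header row of its own.

```python
def read_values_with_headers(spark, header_path, values_path):
    """Return the values file as a DataFrame named after the header file.

    Hypothetical helper: `spark` is an existing SparkSession, and both
    paths are supplied by the caller.
    """
    # The header-only file yields an empty DataFrame whose columns
    # carry the names we want.
    header_df = spark.read.option("header", True).csv(header_path)

    # Assumption: the values file has no header row, so Spark names its
    # columns _c0, _c1, ... and inferSchema picks a type per column.
    values_df = (
        spark.read
        .option("header", False)
        .option("inferSchema", True)
        .csv(values_path)
    )

    # Renaming is positional, so the column counts must agree.
    if len(header_df.columns) != len(values_df.columns):
        raise ValueError(
            f"column count mismatch: {len(header_df.columns)} header names "
            f"vs {len(values_df.columns)} value columns"
        )

    # toDF replaces all column names at once and keeps the inferred types.
    return values_df.toDF(*header_df.columns)
```

Calling `read_values_with_headers(spark, header_path, values_path)` with the two file locations returns a DataFrame with the names from the first file and inferred types from the second. Note that `inferSchema` makes Spark take an extra pass over the data, which may matter for large files.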
