Iterator516

Reputation: 287

Import a CSV file as a PySpark Dataset (NOT a DataFrame)

How can I import a CSV file into PySpark as a Dataset? Note that I am NOT asking how to import it into a DataFrame.

While reading this page from Databricks, I learned some benefits of datasets over dataframes.

https://databricks.com/blog/2016/07/14/a-tale-of-three-apache-spark-apis-rdds-dataframes-and-datasets.html

I want to learn how to work with them instead of RDDs and dataframes.

Upvotes: 0

Views: 530

Answers (1)

cronoik

Reputation: 19365

The linked blog post already answers this: it is not possible in PySpark, because of a limitation of Python itself:

Note: Since Python and R have no compile-time type-safety, we only have untyped APIs, namely DataFrames.
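To make the quoted note concrete, here is a minimal plain-Python sketch (no Spark involved) of what "no compile-time type-safety" means in practice. The `Person` class and the CSV content are made up for illustration. The point is only that Python never checks the `age: int` annotation, which is the kind of guarantee a typed Dataset relies on.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Person:
    # Annotations are hints only: Python does not enforce them at runtime,
    # and there is no compile step that could reject a mismatch.
    name: str
    age: int


# Hypothetical CSV content; the second row has a non-numeric age.
raw = "name,age\nAnn,31\nBob,unknown\n"

# csv.DictReader yields every value as a string.
people = [Person(**row) for row in csv.DictReader(io.StringIO(raw))]

# Collect the actual runtime type of each `age` field.
age_types = [type(p.age).__name__ for p in people]
print(age_types)
```

Both rows load without any error, and `age` stays a string in each of them. In PySpark itself, `spark.read.csv(...)` returns a DataFrame, so the practical route is to give the DataFrame a schema (or use `inferSchema=True`) and work with that; the typed `Dataset[T]` API is only available from Scala and Java.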

Upvotes: 4
