Fengyu

Reputation: 35

How to load JSON (paths saved in a CSV) with Spark?

I am new to Spark. I can load a single .json file in Spark, but what if there are thousands of .json files in a folder? [picture of .json files in the folder]

I also have a CSV file that classifies the .json files with labels. [picture of CSV file]

What should I do in Spark if I want to load and save the data? For example, I want to read the first entry in the CSV. It is only text, but it gives the path of a .json file, so I want to load that .json file and then save the output. That way I would know the JSON information for the first graph labeled Trusted.

Upvotes: 2

Views: 719

Answers (1)

Neel Tiwari

Reputation: 79

For the JSON:

# Reads every .json file in the folder into a single DataFrame
jsonRDD = sql_context.read.json("path/to/json_folder/")

For the CSV, install Databricks' spark-csv package (on Spark 2.0 and later, CSV support is built in and no extra package is needed):

csvRDD = sql_context.read.load("path/to/csv_folder/", format='com.databricks.spark.csv', header='true', inferSchema='true')
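To connect the two (use the CSV to pick which .json files to load), one possible approach is sketched below. It is only a sketch under assumptions: the column names path and label are hypothetical, since the CSV is only shown as a picture, and the helper names are made up for illustration. The CSV is read on the driver with Python's csv module, which should be fine for an index of a few thousand rows, and the selected paths are then handed to read.json, which accepts a list of paths as well as a single one. The spark argument can be the sql_context from above or a SparkSession.

```python
import csv
import os


def paths_for_label(csv_file, label, path_col="path", label_col="label", base_dir=""):
    """Return the JSON paths listed in csv_file whose label equals `label`.

    path_col and label_col are assumed header names (hypothetical);
    change them to match the real CSV header.
    """
    with open(csv_file, newline="") as handle:
        rows = csv.DictReader(handle)
        return [
            os.path.join(base_dir, row[path_col])
            for row in rows
            if row[label_col] == label
        ]


def load_json_for_label(spark, csv_file, label, **kwargs):
    """Load every JSON file carrying the given label into one DataFrame."""
    paths = paths_for_label(csv_file, label, **kwargs)
    if not paths:
        raise ValueError("no rows with label %r in %s" % (label, csv_file))
    # DataFrameReader.json accepts a list of paths as well as a single path.
    return spark.read.json(paths)
```

For example, load_json_for_label(sql_context, "path/to/labels.csv", "Trusted") would return a DataFrame for all Trusted files, and calling .write.json("path/to/output") on it would save the result. To load only the first Trusted file, take paths_for_label(...)[0] and pass that single path to read.json.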

Upvotes: 1
