Reputation: 35
I am new to Spark. I can load a single .json file in Spark, but what if there are thousands of .json files in a folder?
I also have a CSV file that assigns a label to each .json file.
How do I load and save this data with Spark? For example, the first row of the CSV is plain text, but it gives the path to a .json file. I want to load that .json file and then save the output, so that I end up with the JSON content for the first graph labeled Trusted.
Upvotes: 2
Views: 719
Reputation: 79
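For the JSON, point the reader at the folder and it will read every file in it (the result is a DataFrame, despite the variable name):
jsonRDD = sql_context.read.json("path/to/json_folder/")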
For the CSV, install Databricks' spark-csv package, then load it with:
csvRDD = sql_context.read.load("path/to/csv_folder/", format='com.databricks.spark.csv', header='true', inferSchema='true')
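To connect each CSV label to its JSON content, one option is to tag every JSON row with the file it came from and join on the file name. The following is only a rough sketch under several assumptions: the CSV column holding the path is called `path` and the label column is called `label` (both names are guesses, since the CSV header is not shown), the file names are unique, and each .json file is in the one-record-per-line layout that `read.json` expects by default. `file_key` and `load_labelled_json` are hypothetical helper names, not part of Spark.

```python
import os


def file_key(path):
    """Reduce a path or file URI to its bare file name, e.g. 'a/b/g1.json' -> 'g1.json'."""
    if path is None:
        return None
    return os.path.basename(path.rstrip("/"))


def load_labelled_json(sql_context, json_dir, csv_path, path_col="path"):
    """Join every JSON record with the CSV row that points at its file.

    `path_col` is an assumed name for the CSV column that holds the .json path.
    """
    # Imported here so file_key stays usable without Spark installed.
    from pyspark.sql import functions as F
    from pyspark.sql.types import StringType

    key_udf = F.udf(file_key, StringType())

    # input_file_name() records which file each JSON row was read from.
    json_df = (sql_context.read.json(json_dir)
               .withColumn("file_key", key_udf(F.input_file_name())))

    # Same CSV reader as in the answer above.
    labels_df = (sql_context.read.load(csv_path,
                                       format="com.databricks.spark.csv",
                                       header="true", inferSchema="true")
                 .withColumn("file_key", key_udf(F.col(path_col))))

    return labels_df.join(json_df, on="file_key", how="inner")


# Possible usage (the "label" column name and the output path are assumptions):
# joined = load_labelled_json(sql_context, "path/to/json_folder/", "path/to/csv_folder/")
# joined.filter(joined["label"] == "Trusted").write.json("path/to/output/")
```

Joining on the bare file name avoids mismatches between the relative paths in the CSV and the full URIs that Spark reports, but it would need adjusting if two folders contain files with the same name.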
Upvotes: 1