Reputation: 279
I am new to Python. I am trying to read a JSON file that contains my schema definition. It looks like this:
{
  "type" : "struct",
  "fields" : [ {
    "name" : "name",
    "type" : "string",
    "nullable" : true,
    "metadata" : { }
  }, {
    "name" : "address",
    "type" : "string",
    "nullable" : true,
    "metadata" : { }
  }, {
    "name" : "comment",
    "type" : "string",
    "nullable" : true,
    "metadata" : { }
  } ]
}
I have a dataset to which I need to apply the JSON schema above. I have tried the code below:
targetDf = spark.createDataFrame(inputDf.rdd, schemaFieldsOne)
However, 'schemaFieldsOne' has to be a StructType here. I want to read the JSON file and convert it into a PySpark StructType so that I can apply it to my DataFrame.
Upvotes: 0
Views: 2043
Reputation: 2767
Try this:
import json
import pyspark.sql.types as T

with open('./schema.txt', 'r') as S:  # path to your schema file
    saved_schema = json.load(S)  # json.load already returns a dict
# fromJson expects the parsed dict, not a JSON string
schema = T.StructType.fromJson(saved_schema)
df = spark.createDataFrame(yourRdd, schema)
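If the schema file is malformed, the failure can be hard to trace once Spark is involved, so it may help to check the parsed JSON first. The following is only a minimal sketch using the standard library: `load_schema_dict` is a hypothetical helper (not part of PySpark), the inline `SCHEMA_TEXT` stands in for the contents of the schema file, and the set of required keys is an assumption based on the schema shown in the question.

```python
import json

# Stand-in for the contents of the schema file; field names mirror the question.
SCHEMA_TEXT = """
{
  "type": "struct",
  "fields": [
    {"name": "name", "type": "string", "nullable": true, "metadata": {}},
    {"name": "address", "type": "string", "nullable": true, "metadata": {}},
    {"name": "comment", "type": "string", "nullable": true, "metadata": {}}
  ]
}
"""


def load_schema_dict(text):
    """Parse schema JSON and check its basic shape (hypothetical helper)."""
    schema = json.loads(text)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(schema, dict) or schema.get("type") != "struct":
        raise ValueError("expected a JSON object with type 'struct'")
    if not isinstance(schema.get("fields"), list):
        raise ValueError("expected a 'fields' list")
    for field in schema["fields"]:
        # Assumed required keys, taken from the schema in the question.
        missing = {"name", "type", "nullable"} - field.keys()
        if missing:
            raise ValueError(f"field {field!r} is missing {sorted(missing)}")
    return schema


schema_dict = load_schema_dict(SCHEMA_TEXT)
field_names = [f["name"] for f in schema_dict["fields"]]
print(field_names)
```

The resulting `schema_dict` is the kind of value that can be passed to `T.StructType.fromJson(...)` in place of `saved_schema` above.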
Upvotes: 1