Assaf Mendelson

Reputation: 13001

Create spark dataframe schema from json schema representation

Is there a way to serialize a dataframe schema to json and deserialize it later on?

The use case is simple: I have a json configuration file which contains the schema for dataframes I need to read. I want to be able to create the default configuration from an existing schema (in a dataframe) and I want to be able to generate the relevant schema to be used later on by reading it from the json string.

Upvotes: 40

Views: 72924

Answers (3)

Rishabh Sahrawat

Reputation: 2507

Adding to the answers above, I already had a custom PySpark Schema defined as follows:

from pyspark.sql.types import StringType, StructField, StructType

custom_schema = StructType(
    [
        StructField("ID", StringType(), True),
        StructField("Name", StringType(), True),
    ]
)

I converted it to JSON and saved it to a file as follows:

import json

with open("custom_schema.json", "w") as f:
    json.dump(custom_schema.jsonValue(), f)

Now you have a JSON file containing the schema definition, which you can read back as follows:

with open("custom_schema.json") as f:
    new_schema = StructType.fromJson(json.load(f))
    print(new_schema)
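
For reference, here is a rough sketch of what ends up in the file for the two-field schema above. It uses only the standard library so it runs without Spark; the dictionary literal is a hand-written stand-in for what `jsonValue()` returns, and the names `schema_dict`, `file_text`, and `restored` are mine, not part of any API.

```python
import json

# Hand-written stand-in for custom_schema.jsonValue(); no Spark is needed to run this.
schema_dict = {
    "type": "struct",
    "fields": [
        {"name": "ID", "type": "string", "nullable": True, "metadata": {}},
        {"name": "Name", "type": "string", "nullable": True, "metadata": {}},
    ],
}

# Serialize to text and parse it back, mirroring the write and read of the file.
file_text = json.dumps(schema_dict, indent=2)
restored = json.loads(file_text)

# The parsed dictionary is what would be handed to StructType.fromJson.
field_names = [field["name"] for field in restored["fields"]]
print(field_names)
```

Since the file is plain JSON, it can be edited by hand or checked into version control alongside the rest of the configuration.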

Inspired by: stefanthoss

Upvotes: 2

mishkin

Reputation: 6242

I am posting a PySpark version of the answer given by Assaf:

import json

from pyspark.sql.types import StructType

# Save the schema of the original DataFrame as a JSON string:
schema_json = df.schema.json()

# Restore the schema from the JSON string:
new_schema = StructType.fromJson(json.loads(schema_json))
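
For the configuration-file use case in the question, one possible layout is to keep several schemas under named keys in a single JSON document. The following is only a sketch under my own assumptions: the `"schemas"` key, the helper names `build_config` and `schema_dict_for`, and the `"users"` entry are all hypothetical, and the string literal stands in for the output of `df.schema.json()` so that the example runs without Spark.

```python
import json


def build_config(schemas):
    """Wrap a {name: schema JSON string} mapping into one JSON config document."""
    # Parse each schema string first so it is stored as nested JSON
    # rather than as an escaped string inside the config.
    parsed = {name: json.loads(text) for name, text in schemas.items()}
    return json.dumps({"schemas": parsed}, indent=2)


def schema_dict_for(config_text, name):
    """Return the schema dictionary stored under the given name."""
    return json.loads(config_text)["schemas"][name]


# Stand-in for df.schema.json() on a DataFrame with a single string column "ID".
users_schema_json = (
    '{"type":"struct","fields":'
    '[{"name":"ID","type":"string","nullable":true,"metadata":{}}]}'
)

config_text = build_config({"users": users_schema_json})
users_schema_dict = schema_dict_for(config_text, "users")
# With PySpark available, StructType.fromJson(users_schema_dict) would rebuild the schema.
```

Storing the parsed structure instead of the raw string keeps the config readable and avoids double-encoded JSON.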

Upvotes: 79

Assaf Mendelson

Reputation: 13001

There are two steps: creating the JSON string from an existing dataframe, and creating the schema from the previously saved JSON string.

Creating the string from an existing dataframe

    val schema = df.schema
    val jsonString = schema.json

Creating the schema from the JSON string

    import org.apache.spark.sql.types.{DataType, StructType}
    val newSchema = DataType.fromJson(jsonString).asInstanceOf[StructType]

Upvotes: 79
