Reputation: 19100
I just wrote my first JSON schemas, great! However, I am now looking at how to structure multiple JSON schemas, both during development and later how to host them. There seems to be very limited guidance on this.
Of course, I looked up the official json-schema.org documentation on how to structure a complex schema, and even got something to work. I now have three schema files, organized during development in a folder structure as follows:
json-schemas
- /common
  - /common-schemas.json
- /data
  - /data-schemas.json
- /DataStreamServiceRequest
  - /DataStreamServiceRequest-schema.json
Out of these three, only DataStreamServiceRequest-schema.json contains a single schema (it is a schema for all possible requests to an application service endpoint). It refers to types defined in data-schemas.json and common-schemas.json by using relative references. The intent here was to have all types for a single subsystem (e.g., data) available in one file.
I gave all three .json files an $id containing the absolute URI that corresponds to the directory they are in.
For example, here is common-schemas.json:
{
  "$schema": "https://json-schema.org/draft/2019-09/schema",
  "$id": "https://carp.cachet.dk/schemas/common",
  "NamespacedId": {
    "type": "string",
    "pattern": "^([a-z_0-9]+\\.?)+[a-z_0-9]$"
  }
}
And, from data-schemas.json, I refer to NamespacedId in common-schemas.json using:
"dataType": { "$ref": "common#/NamespacedId" }
I'm glad this works, but is this recommended? Am I overlooking something fundamental by structuring my schemas this way? Is there any particular reason to prefer "one schema per type"? Is there an idiomatic structure I am overlooking, e.g., an equivalent to Java's convention of file location corresponding to namespace?
If there isn't (I would argue in Java there is), that would also constitute an answer.
Perhaps additional relevant context: I'm using networknt/json-schema-validator, and for development/testing purposes it is very convenient not to have too many absolute URIs, because I need to map each URI to a local file when initializing the validator.
Upvotes: 4
Views: 3746
Reputation: 24399
Like @Ether said, there's no one right answer, but here are the guidelines I use.
The most important guidepost is to treat your schemas like you would any other code in your system. Generally, that means each "thing" that you are describing ("Person", "Product", "Address", etc.) should have its own schema.
Definitions ($defs) should only be referenced ($ref) from within the same schema. Use definitions to improve readability or reduce duplication within a schema. Any reference to an external schema should not have a JSON Pointer URI fragment (e.g., the #/$defs/NamespacedId fragment you use). If you need to reference a definition in an external schema, it's probably a sign it should be in its own schema.
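As a minimal sketch of that rule, assume a hypothetical person schema hosted at https://example.com/schemas/person, with an address schema hosted next to it. The property names and the nonEmptyString helper are made up for illustration. The helper is only referenced from inside the same schema, while the external reference points at the whole address schema and carries no JSON Pointer fragment:

```json
{
  "$schema": "https://json-schema.org/draft/2019-09/schema",
  "$id": "https://example.com/schemas/person",
  "$comment": "Hypothetical example; property names are illustrative only.",
  "type": "object",
  "properties": {
    "firstName": { "$ref": "#/$defs/nonEmptyString" },
    "lastName": { "$ref": "#/$defs/nonEmptyString" },
    "address": { "$ref": "address" }
  },
  "$defs": {
    "nonEmptyString": {
      "$comment": "Private helper: only referenced from within this schema.",
      "type": "string",
      "minLength": 1
    }
  }
}
```

Here the relative reference address resolves against the $id to https://example.com/schemas/address.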
An exception to this might be if you have a bunch of tiny schemas you want to put in a "common.schema.json" schema. In this case, each of those schemas should be defined as a definition, but also have an anchor ($anchor). It needs to be a definition so you can use standard tools to verify that your schemas are valid. Using an anchor makes it easier to reference, but more importantly, it's a signal that some external schema might be depending on it. Definitions without an anchor are effectively private to the schema, and definitions with an anchor are effectively public. You can freely refactor unanchored definitions, but you have to be wary of breaking other schemas if you refactor anchored definitions.
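Applied to the common schema from the question, this could look roughly like the sketch below. Moving NamespacedId under $defs and naming the anchor NamespacedId are assumptions made for illustration, not something the question's schema already does:

```json
{
  "$schema": "https://json-schema.org/draft/2019-09/schema",
  "$id": "https://carp.cachet.dk/schemas/common",
  "$defs": {
    "NamespacedId": {
      "$anchor": "NamespacedId",
      "$comment": "Anchored, so other schemas may depend on it.",
      "type": "string",
      "pattern": "^([a-z_0-9]+\\.?)+[a-z_0-9]$"
    }
  }
}
```

A schema whose base URI sits in the same directory (for example https://carp.cachet.dk/schemas/data, assumed here) could then use "$ref": "common#NamespacedId", which resolves to https://carp.cachet.dk/schemas/common#NamespacedId and does not depend on where the definition lives inside the file.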
Don't use $ids. When a schema doesn't have an $id, it's identified by the URI that was used to retrieve the schema. That can be a file URI (file:///path/to/schemas/person.schema.json). Then all of your references can be relative, meaning there are no absolute URIs in your schemas and you don't need configuration to map https URIs to file locations. Unfortunately, not all implementations support file-based retrieval or even retrieval URI identification, so this isn't always possible.
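A minimal sketch of a schema without an $id, assuming it is stored as person.schema.json and that a hypothetical address.schema.json sits in the same directory:

```json
{
  "$schema": "https://json-schema.org/draft/2019-09/schema",
  "$comment": "No $id: identified by its retrieval URI, e.g. file:///path/to/schemas/person.schema.json",
  "type": "object",
  "properties": {
    "address": { "$ref": "address.schema.json" }
  }
}
```

In an implementation that identifies schemas by retrieval URI, the relative reference would resolve to file:///path/to/schemas/address.schema.json, so moving the whole directory does not require touching the schemas.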
Organize your schemas however makes sense. Follow the same instincts and guidelines you would for any other code. If you have to use $ids, make sure the path matches the path on the file system (e.g., https://example.com/schemas/person => file:///local/path/to/app/schemas/person.schema.json). This just makes it easier to find the schemas you're looking for.
Upvotes: 6
Reputation: 53966
Like code organization, there is no one right answer. If you are going to reuse a particular schema in a number of different environments, putting it in a separate file can make sense. But if you're only going to use it in one place, it may be more convenient to put it under a $defs in the same file. You could even give that schema a nicer name to reference it, via $anchor.
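For the single-file case, a sketch might look like this. The surrounding object schema is hypothetical, and NamespacedId is borrowed from the question:

```json
{
  "$schema": "https://json-schema.org/draft/2019-09/schema",
  "$comment": "Hypothetical single-file layout; the containing schema is illustrative.",
  "type": "object",
  "properties": {
    "dataType": { "$ref": "#NamespacedId" }
  },
  "$defs": {
    "NamespacedId": {
      "$anchor": "NamespacedId",
      "type": "string",
      "pattern": "^([a-z_0-9]+\\.?)+[a-z_0-9]$"
    }
  }
}
```

The reference #NamespacedId uses the anchor name rather than the JSON Pointer #/$defs/NamespacedId, so it keeps working if the definition is later moved within the file.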
Upvotes: 1