Reputation: 844
I have a bunch of JSON files with thousands of different schemas. Using GenSON (the Python JSON schema generator), I managed to create a schema file for each of the input files. Now I'd like to standardize all these different files to one defined schema. Here's an example:
Input
{
  "name": "Bob Odenkirk",
  "title": "Software Engineer",
  "location": {
    "locality": "San Francisco",
    "region": "CA",
    "country": "United States"
  },
  "age": 62,
  "status": "Active"
}
Output
{
  "names": ["Bob Odenkirk"],
  "occupations": ["Software Engineer"],
  "locations": ["San Francisco, CA"]
}
Essentially, I am looking for a language-agnostic method (i.e., I don't care what programming language is used) of defining how an input JSON file should be transformed into an output JSON file.
Upvotes: 7
Views: 12120
Reputation: 2787
I think the best, fastest, and easiest way to process many JSON files together is with Python.
I was working on something similar to your project and ran into the same problem.
I found a site that teaches how to use Python to parse JSON files. It turns out Python has a module called json that handles JSON processing; it is part of the standard library, so there is nothing to install with pip. If you already have a Python editor, this method would be easier and faster than using Jolt.
See this website for more info: https://code.tutsplus.com/tutorials/how-to-work-with-json-data-using-python--cms-25758.
You can also use JavaScript, which is again faster than Jolt; see https://learn.microsoft.com/en-us/scripting/javascript/reference/json-parse-function-javascript . It is very easy because you can use the JSON.parse() function.
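As a rough illustration of the Python route, here is a minimal sketch that assumes the input shape from the question. The function name standardize is made up for this example, and joining locality and region with ", " is my reading of the desired output, not a general rule.

```python
import json


def standardize(record):
    """Map one input record onto the target schema from the question.

    `standardize` is a hypothetical helper name, not part of any library.
    """
    output = {"names": [], "occupations": [], "locations": []}
    if "name" in record:
        output["names"].append(record["name"])
    if "title" in record:
        output["occupations"].append(record["title"])
    location = record.get("location") or {}
    # Join whichever of locality/region are present, e.g. "San Francisco, CA".
    parts = [location[key] for key in ("locality", "region") if location.get(key)]
    if parts:
        output["locations"].append(", ".join(parts))
    return output


raw = """
{
  "name": "Bob Odenkirk",
  "title": "Software Engineer",
  "location": {
    "locality": "San Francisco",
    "region": "CA",
    "country": "United States"
  },
  "age": 62,
  "status": "Active"
}
"""

result = standardize(json.loads(raw))
print(json.dumps(result, indent=2))
```

With thousands of input schemas you would probably need one such mapping per schema (or per family of similar schemas), so this only shows the mechanics for a single shape.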
Upvotes: 0
Reputation: 4586
Jolt Spec
[
  // First build the "city, state" string for location
  {
    "operation": "modify-default-beta",
    "spec": {
      "location": {
        "locConcat": "=concat(@(1,locality),', ',@(1,region))"
      }
    }
  },
  // Then map the fields as needed to positions in an output json
  {
    "operation": "shift",
    "spec": {
      "name": "names[0]",
      "title": "occupations[0]",
      "location": {
        "locConcat": "locations[0]"
      }
    }
  }
]
Upvotes: 2
Reputation: 564
I am not sure whether this is what you are expecting. A while back I wrote something that flattens the input object and takes an output-format object describing which input fields go where. It returns an object in the output format with the data filled in.
var input = {
  "name": "Bob Odenkirk",
  "title": "Software Engineer",
  "location": {
    "locality": "San Francisco",
    "region": "CA",
    "country": "United States"
  },
  "age": 62,
  "status": "Active"
};
// Maps each output key to one or more dot-separated input paths.
var outputFormat = {
  "names": "name",
  "occupations": "title",
  "locations": "location.locality, location.region"
};
// Flatten nested objects into dot-separated keys, e.g. "location.locality".
function generateFlatInput(input, parent, flatInput) {
  flatInput = flatInput || {};
  for (var prop in input) {
    if (!input.hasOwnProperty(prop)) continue;
    if (input[prop] !== null && typeof input[prop] === 'object')
      generateFlatInput(input[prop], parent + prop + '.', flatInput);
    else
      flatInput[parent + prop] = input[prop];
  }
  return flatInput;
}
// Build a new object in the shape of outputFormat, filled with input values.
function generateOutput(input, outputFormat, delimiter) {
  var flat = generateFlatInput(input, '');
  var output = {};
  for (var prop in outputFormat) {
    var fields = outputFormat[prop].split(delimiter);
    var fieldValue = [];
    for (var i = 0; i < fields.length; i++) {
      var key = fields[i].trim();
      if (!flat.hasOwnProperty(key)) continue;
      fieldValue.push(flat[key]);
    }
    output[prop] = fieldValue.join(delimiter);
  }
  return output;
}
console.log(generateOutput(input, outputFormat, ', '));
https://jsfiddle.net/u2yyuguk/1/
Upvotes: 1
Reputation: 3895
Jolt (https://github.com/bazaarvoice/jolt#jolt) may be what you're looking for. From its README:
Jolt
JSON to JSON transformation library written in Java where the "specification" for the transform is itself a JSON document.
Useful For
Transforming JSON data from ElasticSearch, MongoDb, Cassandra, etc before sending it off to the world
Extracting data from large JSON documents for your own consumption
Upvotes: 6