Reputation: 33900
What would be a simple way to extract integers from a long JSON string?
It is very complex and long (a few KB), so scanning it by eye seems impossible. Just extracting digits is not good enough either, because there are many floats that I need to skip. I know I can convert the JSON to an object and enumerate it, but this seems like overkill, because it may contain objects, arrays, arrays inside objects, etc.
It would be nice to filter the integers; for example, I need numbers between 2000 and 20000. I need this to analyze the intermediate state of a running program. My code is not the only code writing this data, so I can only scan the existing structure.
Upvotes: 0
Views: 3709
Reputation: 882
I assume that parsing the JSON string into an object is fast enough for your case.
I also assume that you pass the parsed object to the function iterateAndExtractInt below:
// Only genuine numbers count; numeric strings such as '33' are skipped
var isInt = function(input) {
    return typeof input === 'number' && isFinite(input) && Math.floor(input) === input;
};
var isArray = function(input) {
    return Object.prototype.toString.call(input) === '[object Array]';
};
var isObject = function(input) {
    return Object.prototype.toString.call(input) === '[object Object]';
};
var iterateAndExtractInt = function(obj) {
    var out = []; // fresh result list per call, so repeated calls do not accumulate
    var process = function(value) {
        if (isInt(value)) { // Add your custom validation here to allow select values
            out.push(value);
        } else if (isArray(value)) {
            value.forEach(process);
        } else if (isObject(value)) {
            for (var key in value) {
                process(value[key]);
            }
        }
    };
    process(obj);
    return out;
};
// our test input
var inp = {a: 1, b: 1.001, c: {a: 'str', b: '33'},
    d: [2,3,4], e: [{a:[5], b:{a:[{s:6},
    {c:[[[[[[7,8,9,{a:[10]}]]]]],11]}]}}]};
console.log(iterateAndExtractInt(inp)); // logs the integers 1 through 11
P.S.: Performance is yet to be tested! I am not sure what your input string looks like or how big it is.
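If the string is valid JSON, the recursive walk can also be left to the parser itself. The following is only a minimal sketch, assuming the input parses with JSON.parse; the helper name extractIntegers and the sample string are made up for illustration. It relies on the optional reviver argument of JSON.parse, which is called for every value at every nesting level, and keeps only integers inside the requested range.

```javascript
// Hypothetical helper: collect every integer in [min, max] found anywhere in a JSON string.
function extractIntegers(jsonText, min, max) {
  var found = [];
  // The reviver is invoked for every parsed value, however deeply it is nested.
  JSON.parse(jsonText, function (key, value) {
    if (typeof value === 'number' && Number.isInteger(value) &&
        value >= min && value <= max) {
      found.push(value);
    }
    return value; // return the value unchanged so parsing continues normally
  });
  return found;
}

// Made-up sample data: floats, numeric strings and out-of-range values are skipped.
var sample = '{"a": 2500, "b": 2500.5, "c": {"d": [1, 19999, "3000"]}, "e": 20001}';
console.log(extractIntegers(sample, 2000, 20000)); // logs [ 2500, 19999 ]
```

Note that a value written as 2500.0 in the JSON text becomes the number 2500 after parsing, so it would be counted as an integer here.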
Upvotes: 0
Reputation: 11
It's not easy to give an answer without first seeing an example of the data.
The string could have been produced in one of two formats: "fixed column format" or "comma separated" (the separator could actually be any recurring character that marks where one piece of data ends and the next begins).
In both cases, if you know who created the data for the JSON, you can ask them which field represents which piece of data within the JSON string. At least you will then know what each part of the string represents.
Otherwise, the hard work starts: looking through the string to find common features that delimit the numbers you seek, such as separators (commas, semicolons, or any other character). If these do not exist, you will need to find patterns within the JSON string and then build your extraction around them.
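An example using jQuery for comma-separated data (or any other common separator character) is:
var numValue = jsonResults.split(",");
$.each(numValue, function (index, value) {
    var num = Number(value); // each piece is a string, so convert it before comparing
    if (Number.isInteger(num) && num >= 2000 && num <= 20000) {
        //Do whatever you want with the data
    }
});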
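When no single separator fits, the pattern-based extraction described above can be sketched with a regular expression. This is only an illustration under assumptions: the function name scanIntegers and the sample text are hypothetical, and the scan treats the input as raw text, so digits that appear inside quoted strings or key names would be picked up as well. It matches whole number tokens first, so that a float such as 3000.5 is not split into 3000 and 5, and then keeps only the tokens that have no fraction or exponent.

```javascript
// Hypothetical helper: scan raw text for integer tokens in [min, max] without parsing it.
function scanIntegers(text, min, max) {
  // Match complete number tokens (optional sign, fraction and exponent).
  var tokens = text.match(/-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?/g) || [];
  return tokens
    .filter(function (t) { return /^-?\d+$/.test(t); }) // drop floats and exponent forms
    .map(Number)
    .filter(function (n) { return n >= min && n <= max; });
}

// Made-up sample text.
var raw = '{"a": 2500, "b": 3000.5, "c": [19999, 1e4, 42]}';
console.log(scanIntegers(raw, 2000, 20000)); // logs [ 2500, 19999 ]
```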
Fixed column format is tough, because you will need to count the row and column positions where the data starts and ends within the string, and build your model around that. Copying and pasting the contents into a text file and looking through it would be the beginning of that exercise.
Good luck.
Upvotes: 1