Reputation: 1204
In my app I often call an external API that returns a JSON string.
$url = 'api.example.com/xyz';
$blah = json_decode( file_get_contents( $url ) );
But in some cases I get
PHP Fatal error: Allowed memory size of xxx bytes exhausted (tried to allocate 32 bytes) in ...
I cannot control the external API, and of course I could increase the memory limit for PHP, but that has some drawbacks:
1. Whatever size I set could still be too little.
2. If I set the memory limit to 'infinite', I run the risk of killing my server.
Ideally I would like to check, before I call json_decode( ... ), whether the string would result in memory exhaustion.
Is that possible?
Upvotes: 11
Views: 25953
Reputation: 3422
Rather than simply quitting when the JSON file is too large, you can process arbitrarily large JSON files by using an event-based (streaming) JSON parser such as https://github.com/salsify/jsonstreamingparser. Only a small chunk of the object/array is loaded into memory at a time.
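For illustration, usage roughly along the lines of the library's README looks like the sketch below; treat the exact class names as an assumption to verify against the version you install, and note that the bundled InMemoryListener still builds the full result in memory, so for truly huge inputs you would write your own listener that handles each record as it is parsed:
<?php
require 'vendor/autoload.php';

// Sketch based on the library's README; verify class names against your version.
$listener = new \JsonStreamingParser\Listener\InMemoryListener();
$stream   = fopen('big-response.json', 'r');
try {
    $parser = new \JsonStreamingParser\Parser($stream, $listener);
    $parser->parse();
    fclose($stream);
} catch (\Exception $e) {
    fclose($stream);
    throw $e;
}
// For large files, replace InMemoryListener with a custom listener so each
// record can be processed and discarded as soon as it has been parsed.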
If you have any influence over the JSON file, request or change it to be reformatted in JSON Lines format so it can be processed line-by-line with any ordinary JSON parser.
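For example, a JSON Lines response (one JSON document per line) can be handled with nothing but core PHP, never holding more than one record in memory; 'users.jsonl' and the 'name' field below are just placeholders:
<?php
// Process a JSON Lines file one record at a time.
$handle = fopen('users.jsonl', 'r');
if ($handle === false) {
    throw new RuntimeException('Could not open users.jsonl');
}
while (($line = fgets($handle)) !== false) {
    $record = json_decode($line, true);
    if ($record === null) {
        continue; // skip blank or malformed lines
    }
    echo $record['name'] ?? '', PHP_EOL; // handle one record at a time
}
fclose($handle);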
Upvotes: 7
Reputation: 778
If the only thing you need is to iterate over the items in a JSON document of unpredictable size, try halaxa/json-machine. Because it parses incrementally, memory usage stays roughly constant no matter how large the document is, and you consume it with a plain foreach, no rocket science. There is no need to check the size for "safety" beforehand or to increase the PHP memory limit. It works like this:
<?php
use JsonMachine\JsonMachine; // newer releases expose this as JsonMachine\Items

// Iterates the top-level items of users.json without loading the whole file.
foreach (JsonMachine::fromFile('users.json') as $user) {
    echo $user['name'];
}
Upvotes: 0
Reputation: 4889
My other answer below is purely about avoiding the memory limit. But how can you deal with the data if you don't want to discard any of it, yet it occasionally stays bulky beyond your memory limit?
Presuming that you don't need the response parsed in one shot and in absolute real time, you could split the response into suitably sized chunks, for example with explode() or preg_split(), save them into a temporary directory, and process them later in a batch operation.
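A very rough sketch of that idea (the function name is mine, and the naive '},{' split assumes flat records with no nested objects and no '},{' inside string values; a real splitter would have to track brace nesting):
<?php
// Hypothetical: split a raw JSON response of the form [{...},{...},...]
// into smaller chunk files for later batch processing.
function spoolJsonChunks(string $rawJson, string $tmpDir, int $recordsPerChunk = 500): array
{
    $records = explode('},{', trim($rawJson, "[{}] \r\n"));
    $files   = [];
    foreach (array_chunk($records, $recordsPerChunk) as $i => $chunk) {
        // Re-wrap each chunk so that it is itself a valid JSON array.
        $json = '[{' . implode('},{', $chunk) . '}]';
        $file = rtrim($tmpDir, '/') . "/chunk_{$i}.json";
        file_put_contents($file, $json);
        $files[] = $file;
    }
    return $files; // json_decode() each chunk file separately in the batch run
}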
I presume the large API responses return multiple data-sets at once; if not, you could also splice a single multi-dimensional entry into more manageable chunks that are later rejoined, although that would require much more surgical precision in crafting your JSON-string splitter function.
If the multiple data-sets need to be associated in later processing (such as database entry), you would also want an aggregator file containing the metadata for the batch operation. (Or otherwise stick it all into a database.) You would of course have to ensure that the chunked data is well-formed. It's not ideal, but not having gigs of memory isn't ideal either. Batching is one way of dealing with it.
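For instance, a tiny manifest for the batch could look like this; the field names are illustrative only, and $url, $files and $tmpDir come from the question and the splitter sketch above:
<?php
// Hypothetical batch manifest so later processing can re-associate the chunks.
$manifest = [
    'source'     => $url,
    'fetched_at' => date('c'),
    'chunks'     => $files,
    'total'      => count($files),
];
file_put_contents(rtrim($tmpDir, '/') . '/manifest.json', json_encode($manifest));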
Upvotes: 2
Reputation: 4889
You must be getting some massive JSON responses if they manage to exhaust your server's memory. Here are some metrics from a 1 MB file containing a multidimensional associative array (containing data prepared for entry into three MySQL tables with diverse data types).
When I include() that file so it is loaded into memory as an array, my memory usage goes up to 9 MB. If I fetch the raw data with file_get_contents(), it takes 1 MB of memory, as expected. So the strlen() of the data (originally output with var_export()) relates to its in-memory PHP array at a ratio of roughly 1:9.
When I run json_encode(), peak memory usage doesn't increase. (PHP allocates memory in blocks, so there's often a bit of overhead, in this case enough to include the string data of the JSON; but it could bump you up one block more.) The resulting JSON data as a string takes 670 KB.
When I load the JSON data with file_get_contents() into a string, it takes an expected 0.75 MB of memory. When I run json_decode() on it, the usage jumps to 7 MB. I would therefore factor in a minimum ratio of 1:10 of JSON-data byte size to decoded native PHP array-or-object when estimating the RAM requirement.
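You can reproduce that measurement on your own payload with something like this ('response.json' is just a placeholder for a saved API response):
<?php
// Rough measurement of the string-to-decoded-array memory ratio.
$json   = file_get_contents('response.json');
$before = memory_get_usage();
$data   = json_decode($json, true);
$delta  = memory_get_usage() - $before;
printf("string: %d bytes, decoded: ~%d bytes, ratio ~1:%.1f\n",
    strlen($json), $delta, $delta / strlen($json));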
To run a test on your JSON data before decoding it, you could then do something like this:
// Heuristic pre-check using the ~1:10 ratio measured above.
if (strlen($my_json) * 10 > ($my_mb_memory * 1024 * 1024)) {
    die('Decoding this would exhaust the server memory. Sorry!');
}
...where $my_json is the raw JSON response and $my_mb_memory is your allocated RAM in megabytes, converted into bytes for comparison with the incoming data. (You can of course also use intval(ini_get('memory_limit')) to get your memory limit as an integer; for a typical value like '128M' that gives the number of megabytes.)
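If you want something a bit more self-contained, here is a sketch (the function names are mine) that converts the memory_limit string into bytes, including the K/M/G suffixes that intval() ignores, and applies the same 1:10 heuristic against the memory still actually available:
<?php
// Convert the memory_limit ini string (e.g. "128M", "1G", "-1") into bytes.
function memoryLimitBytes(): int
{
    $limit = ini_get('memory_limit');
    if ($limit === '-1') {
        return PHP_INT_MAX; // "unlimited"
    }
    $value = (int) $limit; // numeric part, e.g. 128
    switch (strtoupper(substr($limit, -1))) {
        case 'G': $value *= 1024; // fall through
        case 'M': $value *= 1024; // fall through
        case 'K': $value *= 1024;
    }
    return $value;
}

// Decode only if the ~1:10 heuristic fits into the remaining headroom.
function safeJsonDecode(string $json)
{
    $headroom = memoryLimitBytes() - memory_get_usage();
    if (strlen($json) * 10 > $headroom) {
        throw new RuntimeException('Decoding this JSON would likely exhaust memory.');
    }
    return json_decode($json, true);
}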
The RAM usage will also depend on your data structure, as a few more quick test cases (run out of curiosity) confirmed.
So your actual RAM mileage may vary a good deal. Also be aware that if you pass that bulk of data around and do a bit of this and that with it, your memory usage may climb much higher (or even exponentially higher, depending on your code economy) than what json_decode() alone causes.
To debug memory usage, you can call memory_get_usage() and/or memory_get_peak_usage() at major intervals to log or output how much memory the different parts of your code consume.
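For example (the helper name is mine, and $url is the endpoint from the question):
<?php
// Small helper to drop memory checkpoints into the error log.
function logMemory(string $label): void
{
    error_log(sprintf(
        '[%s] current: %.2f MB, peak: %.2f MB',
        $label,
        memory_get_usage(true) / 1048576,
        memory_get_peak_usage(true) / 1048576
    ));
}

logMemory('before fetch');
$raw = file_get_contents($url);
logMemory('after fetch');
$data = json_decode($raw, true);
logMemory('after decode');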
Upvotes: 11