Reputation: 21
I have a programming problem that I need help solving, and that I'm hoping I can get some assistance with here. Basically, I need to find a way to merge array data that's being returned from 2 different RESTful APIs, sort it, and then paginate through it.
Note that I'm dealing with MLS data (ie: Real Estate listings), and that I'm using PHP for this project (exclusively). Also, these are the 2 different APIs that I'm using:
Spark API https://sparkplatform.com/docs/overview/api
Bridge RESO Web API https://bridgedataoutput.com/docs/platform/
The problem that I'm having, specifically, is that each of these APIs have different limits as to how many records can be retrieved per request. For example, the Spark API allows me to retrieve as many as 1000 records at a time, whereas the Bridge API only allows me to retrieve no more than 200. I understand why these limits are in place, and it really hasn't been a problem until now. I say this because I've been asked to try to find a way to retrieve Real Estate listings from both of these APIs, to then merge all of the returned data from both of them into a single array, .. and to then sort them based on list price (from highest to lowest price), .. and then paginate through them (50 listings per page).
This wouldn't be a problem if I was dealing with just one of the 2 different API, as both of them have features that make it quite easy to both sort and paginate through the data. And if I was working with queries that retrieve only small amounts of data (ie: less than 50 records total from both APIs combined), I've already implemented working methods that allow for me to merge the 2 different data sets and then sort them based on list price:
$mlsdata = array_merge($mlsdatamiami,$mlsdataftlauderdale);
function price_compare($a, $b) {
$t2 = $a['StandardFields']['ListPrice'];
$t1 = $b['StandardFields']['ListPrice'];
return $t1 - $t2;
}
usort($mlsdata, 'price_compare');
However, I am unfortunately NOT dealing with small data sets, and could potentially be returning as many as tens of thousands of listings from both APIs combined.
Although I've succeeded at writing some code that allows for me to paginate through my new "merged" data set, this obviously only works when I'm dealing with small amounts of data (less than 1200 records).
$finalarray = array_slice($newarray,$startposition,$perpagelimit);
foreach($finalarray as $item) {
...
}
But again, I'm not dealing with result sets of less than 1200 records. So although it might appear that I'm displaying listings from highest price to lowest price on page #1 of a paginated result set, everything starts to fall apart on page #2, where the list prices are suddenly all over the place.
I've tried running multiple different queries in a for loop, pushing the data to a new array and then merging it with the initial result set ..
$miamimlsdataraw = file_get_contents($apiurl);
$miamimlsdata = json_decode($miamimlsdataraw, true);
$number_of_miami_listings = $miamimlsdata['@odata.count'];
$miamilistingsarray = array();
if ($miamimlsdata['@odata.count'] > 200) {
$number_of_miami_queries = floor($number_of_miami_listings / 200);
$miami_listings_start_number = 200;
for ($x = 1; $x <= $number_of_miami_queries; $x++) {
$paramsextra = $params . "&\$skip=" . $miami_listings_start_number * $x;
$apiurl = $baseurl . '/' . $dataset . '/' . $endpoint . '?access_token=' . $accesstoken . $paramsextra;
$miamimlsdataraw = file_get_contents($apiurl);
$miamimlsdata_extra = json_decode($miamimlsdataraw, true);
array_push($miamilistingsarray,$miamimlsdata_extra);
}
}
$miamimlsdata = array_merge($miamilistingsarray,$miamimlsdata);
With this particular experiment, I was only dealing with about 2,700 listings (from only 1 of the APIs) .. and the performance was horrendous. And when I tried writing all of the returned data to a text file on the server (rather then trying to display it in the page), it came in at a whopping 25mb in size. Needless to say, I don't think that I can reliably use this approach at all.
I've considered perhaps setting this up as a cronjob, storing the array data in our database (the site is WordPress based), and then retrieving and paginating through it at runtime .. rather than querying the APIs in realtime. But I now strongly suspect that this would be just as inefficient.
So .. I realize that this question was rather long winded, but I honestly didn't know where else to turn. Is what I'm trying to do simply not possible? Or am I perhaps missing something obvious? I welcome all and any suggestions.
-- Yvan
Upvotes: 0
Views: 567
Reputation: 1
There is no need to merge and sort all the listings of both MLS. Since you only need 50 listings for each page and both API's are using RESO, you can have the API return only sorted results that you need. For example, to get listings for page 1, you only need:
https://api.bridgedataoutput.com/api/v2/OData/Property?$orderby=ListPrice desc&$top=50
By looping through both arrays simultaneously in a while loop, you can pick out and stop as soon as you get 50 of the top highest price listings from both arrays.
Upvotes: 0