CloudSurf
CloudSurf

Reputation: 21

How to combine, paginate, and sort an external API with MySQL result set

I am looking for an efficient way to combine results from a REST API with results from a query to a hosted MySQL database. I need the ability to paginate and sort through the results, but am having trouble determining the best approach such that the order of records is maintained. I am using PHP and Curl to send a rest request to the service.

The external API I am using is eventful (http://api.eventful.com/), and my MySQL database stores basically the same event information. I'd like to pull from both of these sources, sort, and enable pagination.

i.e in this example, we have a LIMIT of 4, but some records may come from eventful and others from MySQL depending on the date. I would think you need to keep an offset of both the API and MySQL results, and decrement them by the number of records that are used in each page. However, I have no idea how to sort unless I pull all the records from both sources.

event name / date / source (E = eventful, M = MySQL)

| page_num 1  | page_num 2  | page_num 3   |
|:-----------:|------------:|:------------:|
|  A|7/27|E   |  E|7/31|E   |  I|8/04|M    |
|  B|7/28|M   |  F|8/01|M   |  J|8/05|M    |
|  C|7/29|E   |  G|8/02|M   |  K|8/06|E    |
|  D|7/30|M   |  H|8/03|E   |  L|8/07|E    |

Upvotes: 2

Views: 2885

Answers (2)

Alix Axel
Alix Axel

Reputation: 154643

If storing the data on your DB is not an option you have to merge and sort within PHP.

Here is an example (CodePad), using two unordered arrays - each one describing your data sources:

$mysql = array
(
    array('name' => 'I', 'date' => '20110804'),
    array('name' => 'J', 'date' => '20110805'),
    array('name' => 'F', 'date' => '20110801'),
    array('name' => 'D', 'date' => '20110730'),
    array('name' => 'B', 'date' => '20110728'),
    array('name' => 'G', 'date' => '20110802'),
);

$eventful = array
(
    array('name' => 'L', 'date' => '20110807'),
    array('name' => 'A', 'date' => '20110727'),
    array('name' => 'E', 'date' => '20110731'),
    array('name' => 'H', 'date' => '20110803'),
    array('name' => 'K', 'date' => '20110806'),
    array('name' => 'C', 'date' => '20110729'),
);

You'll want to select an equal amount of maximum events to display on each data source, and merge both arrays into a single multi-dimensional array, which we will then sort using array_multisort():

$dates = array();
$result = array_merge($mysql, $eventful);

foreach ($result as $key => $value)
{
    $dates[$key] = $value['date'];
}

array_multisort($dates, SORT_ASC, $result);

The $result variable will then be sorted in ascending order:

Array
(
    [0] => Array
        (
            [name] => A
            [date] => 20110727
        )

    [1] => Array
        (
            [name] => B
            [date] => 20110728
        )

    [2] => Array
        (
            [name] => C
            [date] => 20110729
        )

    [3] => Array
        (
            [name] => D
            [date] => 20110730
        )

    [4] => Array
        (
            [name] => E
            [date] => 20110731
        )

    [5] => Array
        (
            [name] => F
            [date] => 20110801
        )

    [6] => Array
        (
            [name] => G
            [date] => 20110802
        )

    [7] => Array
        (
            [name] => H
            [date] => 20110803
        )

    [8] => Array
        (
            [name] => I
            [date] => 20110804
        )

    [9] => Array
        (
            [name] => J
            [date] => 20110805
        )

    [10] => Array
        (
            [name] => K
            [date] => 20110806
        )

    [11] => Array
        (
            [name] => L
            [date] => 20110807
        )
)

If you need descending order you can use the SORT_DESC constant instead.

To paginate this array you can just use array_chunk():

$result = array_chunk($result, 4); // 4 events per "page"

Which, in turn, produces an array with an added dimension:

Array
(
    [0] => Array
        (
            [0] => Array
                (
                    [name] => A
                    [date] => 20110727
                )

            [1] => Array
                (
                    [name] => B
                    [date] => 20110728
                )

            [2] => Array
                (
                    [name] => C
                    [date] => 20110729
                )

            [3] => Array
                (
                    [name] => D
                    [date] => 20110730
                )
        )

    [1] => Array
        (
            [0] => Array
                (
                    [name] => E
                    [date] => 20110731
                )

            [1] => Array
                (
                    [name] => F
                    [date] => 20110801
                )

            [2] => Array
                (
                    [name] => G
                    [date] => 20110802
                )

            [3] => Array
                (
                    [name] => H
                    [date] => 20110803
                )
        )

    [2] => Array
        (
            [0] => Array
                (
                    [name] => I
                    [date] => 20110804
                )

            [1] => Array
                (
                    [name] => J
                    [date] => 20110805
                )

            [2] => Array
                (
                    [name] => K
                    [date] => 20110806
                )

            [3] => Array
                (
                    [name] => L
                    [date] => 20110807
                )
        )
)

Bare in mind that this is a lot less efficient than the typical database sort and limit, mostly because you have the load the whole data into memory, even those that you don't need for the active "page".

I'm not sure what amount of data you're expecting to process but array_multisort() performs reasonably well up to a couple thousand records (I've actually never used it for anything larger), if you're looking for further optimizations you can destroy the unneeded indexes of the chunked array or, use array_slice() and some arithmetic to fetch the four specific records that belong to a certain page.

Nonetheless, if you can optimize your selects (maybe with some business / date logic, not sure) you will probably make the whole sorting + chunk (or slice) process more CPU and memory friendly.

Upvotes: 1

marcin_koss
marcin_koss

Reputation: 5882

How about pulling data from both sources, storing all in the database and then query that database using some php to paginate results.

Upvotes: 1

Related Questions