Scott
Scott

Reputation: 5379

Bayesian algorithm returning 0

I'm trying to get the top rated photos within the last week through MySQL and PHP. I've found that the Bayesian formula may be what I need, but I've been messing with it to no avail.

The following code doesn't return any errors, it only returns a single '0'. Why that is I haven't the slightest.

$bayesian_algo = "SELECT
            photo_id,
            (SELECT count(photo_id) FROM photo_ratings) / 
(SELECT count(DISTINCT photo_id) FROM photo_ratings) AS avg_num_votes,
            (SELECT avg(rating) FROM photo_ratings) AS avg_rating,
            count(photo_id) as this_num_votes,
            avg(rating) as this_rating
            FROM photo_ratings
            WHERE `date` > '$timeframe'
            GROUP BY photo_id";

$bayesian_info = $mysqli->query($bayesian_algo);

$all_bayesian_info = array();
while($row=$bayesian_info->fetch_assoc()) array_push($all_bayesian_info,$row);

list($photo_id,$avg_num_votes,$avg_rating,$this_num_votes,$this_rating) = $all_bayesian_info;
$photo_id = intval($photo_id);
$avg_num_votes = intval($avg_num_votes);
$avg_rating = intval($avg_rating);
$this_num_votes = intval($this_num_votes);
$this_rating = intval($this_rating);

$bayesian_result = (($avg_num_votes * $avg_rating) + ($this_num_votes * $this_rating)) / ($avg_num_votes + $this_num_votes);

echo $bayesian_result;  // 0??

My database looks like this:

photo_id | user_id | rating | date

Where all fields are stored as INTs (I'm storing date as a UNIX timestamp).

I'm tired and coding recklessly, normally I could at least get a little further if there were error messages (or anything!), but there's no way the data I get if I var_dump($all_bayesian_info) would ever return 0.

Upvotes: 2

Views: 308

Answers (1)

Jirilmon
Jirilmon

Reputation: 1944

Lets do the complex Bayesian calcuation in mysql query itself!.

The code can be rewritten like this:

$bayesian_algo_result = "SELECT *, 
(((resultdata.avg_num_votes * resultdata.avg_rating) + (resultdata.this_num_votes * resultdata.this_rating)) / (resultdata.avg_num_votes + resultdata.this_num_votes)) AS bayesian_result
FROM 
(    
SELECT
    photo_id,
    (SELECT count(photo_id) FROM photo_ratings) / 
      (SELECT count(DISTINCT photo_id) FROM photo_ratings) AS avg_num_votes,
    (SELECT avg(rating) FROM photo_ratings) AS avg_rating,
    count(photo_id) as this_num_votes,
    avg(rating) as this_rating    
FROM photo_ratings
WHERE `date` > '$timeframe'
GROUP BY photo_id
) AS resultdata;
";

$bayesian_result_info = $mysqli->query($bayesian_algo_result);

//loop through the rows.
while($row = $bayesian_result_info->fetch_assoc()) {
    list(
        $photo_id,
        $avg_num_votes,
        $avg_rating,
        $this_num_votes,
        $this_rating, 
        $bayesian_result
    ) = $row;

    echo 'Balesian rating for photo' . $photo_id . ' is: ' . $bayesian_result;
}

Note:

  • Here is a working sql fiddle: http://sqlfiddle.com/#!2/d4a71/1/0

  • I didnot make any logic change to your formula. So please make sure your formula is correct.

  • If/when UNIX timestamps go to a 64bit data type, then you'll have to use a MySQL "bigint" to store them ( for your 'date' column).

Upvotes: 2

Related Questions