Reputation: 87
I hope I phrased the question right, I Have been on this for quite sometime now looking for different approaches, I have an app that polls users for prices they think are right for a commodity then from there the app is to come up with an average price based on the most common
to near common prices
and averaging them issue is coming up with a query that can find those prices, so a sample of my table:
|id |commodityFk|price |dateCreated |
--------------------------------------------
|1 |1 |1200 |2016-12-24 22:30:30|
|2 |1 |500000|2016-12-24 22:30:30|
|3 |1 |500000|2016-12-24 22:30:30|
|4 |1 |450000|2016-12-24 22:30:30|
|5 |1 |506980|2016-12-24 23:15:12|
|6 |1 |2000 |2016-12-25 23:57:06|
so from this table the most common price
is 500000
but we also have prices like 506980
and 450000
which are near the common price
so we expect it to average 500000, 500000, 450000 and 506980
personally I am not conversant with MySQL any help solving this will be appreciated.
Upvotes: 0
Views: 43
Reputation: 2362
Are you looking for an entirely MySQL based solution? As mentioned in a comment, you should really define "near" more explicitly. In the below example, I've called any value within 1 Std Deviation of the average a "near" value.
Additionally, what do you do if there is more than 1 most common price? Without knowing the specifics of your requirements I might suggest taking an approach that bypasses that problem entirely by using the average as a starting point rather than the mode. Alternatively, you could use a COALESCE()
function to attempt to get a mode and then use the average if that fails.
Here is an example of something that will output a value based on the clustering of values and avoids having to deal with weird mode related edge cases.
SELECT AVG(price) FROM prices
JOIN (SELECT AVG(price) as rawAverage, STD(price) as deviation FROM prices) stats
WHERE commodityFk = 1
AND price BETWEEN
(rawAverage - deviation) AND (rawAverage + deviation);
This is obviously just a starting point but it is fairly scalable. You could easily change the expression in the JOIN
clause to change how the boundaries of "near" values are defined.
Upvotes: 2
Reputation: 14750
You might find it helpful to return the full price list from SQL and create a PHP function that analyzes the list of prices to determine the "prices near the common price".
This would make it easy to tweak the criteria.
Perhaps something like this would be a start for you:
function findPricesNearCommonPrice($data)
{
$pricesNearCommonPrice = Array();
// find most common price
$countOfEachValue = array_count_values($data);
$mostCommonPrice = array_search(max($countOfEachValue), $countOfEachValue); // doesn't account for items that occur the same number of times, but you could make it do that :-)
echo "Most Common Price: " . $mostCommonPrice . "<br><br>";
$tolerance = .15; // 15%
$minNearPrice = $mostCommonPrice * (1 - $tolerance);
$maxNearPrice = $mostCommonPrice * (1 + $tolerance);
foreach ($data as $p) {
if ($p > $minNearPrice && $p < $maxNearPrice) {
$pricesNearCommonPrice[] = $p;
}
}
return $pricesNearCommonPrice;
}
Then if you do:
$data = Array(500000, 500000, 450000, 506980, 2000);
$values = findPricesNearCommonPrice($data);
$average = array_sum($values) / count($values);
echo "Prices near the most common price:<br>";
echo implode(", ",$values);
echo "<br><br>";
echo "Average: " . $average;
You get:
Most Common Price: 500000
Prices near the most common price:
500000, 500000, 450000, 506980
Average: 489245
Of course you'll need to modify it to fit your exact needs and format of your data, but hopefully it's a start.
Upvotes: 1