sylvian
sylvian

Reputation: 749

postgresql percentage comparison of different rows in a separate column

I have a table in POSTGRESQL that is actually a VIEW generated from a bunch of JOINS which eventually looks like this:

 test_type |  brand  | model  | band | firmware_version | avg_throughput
-----------+---------+--------+------+------------------+----------------
 1client   | Linksys | N600   | 5ghz | 1.5              |          66.94
 1client   | Linksys | N600   | 5ghz | 2.0              |          94.98
 1client   | Linksys | N600   | 5ghz | 2.11             |         132.40
 1client   | Linksys | EA6500 | 5ghz | 1.5              |         216.46
 1client   | Linksys | EA6500 | 5ghz | 2.0              |         176.79
 1client   | Linksys | EA6500 | 5ghz | 2.11             |         191.44

What I'd like to accomplish is creating another column which will compare and show a percentage difference of throughput between different firmware versions for each model.

More specifically the query will take the throughput of the lowest firmware version and save it as a base of comparison to the throughput for all the other firmware versions.

So if we take Linksys N600, the lowest firmware version is 1.5 and the throughput is 66.94 we will save that as a baseline and compare the other throughputs to that number and show the difference in percentage.

The final result of the table would look like this:

 test_type |  brand  | model  | band | firmware_version | avg_throughput | comparison
-----------+---------+--------+------+------------------+----------------+------------
 1client   | Linksys | N600   | 5ghz | 1.5              |          66.94 | 0% (or empty)
 1client   | Linksys | N600   | 5ghz | 2.0              |          94.98 | +41.61%
 1client   | Linksys | N600   | 5ghz | 2.11             |         132.40 | +97.78%
 1client   | Linksys | EA6500 | 5ghz | 0.5              |         216.46 | 0% (or empty)
 1client   | Linksys | EA6500 | 5ghz | 1.2              |         176.79 | -18.32%
 1client   | Linksys | EA6500 | 5ghz | 2.5              |         191.44 | -11.55%

Any ideas on how to do this?

I like to keep things logical and separated and for now am not considering doing this calculation in my code, I'd rather have this done on my database and then just show the result but am open to suggestion if this doesn't make sense.

Upvotes: 2

Views: 156

Answers (3)

Bohemian
Bohemian

Reputation: 424983

Use a subquery on the view to return the base throughput from the earliest firmware versions for each model via a window function, then join your view to that:

select
  v.test_type, v.brand, v.model, v.band, v.firmware_version, v.avg_throughput,
  (100 * v.avg_throughput / b.avg_throughput)::decimal(8,2) - 100 as percent_gain
  from myview v
join (select test_type, brand, model, band,
      avg_throughput, rank() OVER (PARTITION BY test_type, brand, model, band
      order by firmware_version) as rank
      from myview) b
on v.test_type = b.test_type
and v.brand = b.brand
and v.model = b.model
and v.band = b.band
and rank = 1

See SQLFiddle using your sample data and producing your expected output.

You could do it using a correlated subquery instead of a join, but performance would be terrible, because such queries must be executed once for every row. By using a join like this, the query to fetch the minimums is executed just once.

Upvotes: 2

user330315
user330315

Reputation:

This can easily be solved using a window function:

select test_type, 
       brand, 
       model, 
       band, 
       firmware_version, 
       avg_throughput,
       ((avg_throughput / first_value(avg_throughput) over (partition by brand, model order by firmware_version)) - 1) * 100 as diff_to_first_version
from temp_table
order by model desc, firmware_version;

You can also add the difference to the previous version not just the first version by using lag() instead of first_value()

select test_type, 
       brand, 
       model, 
       band, 
       firmware_version, 
       avg_throughput,
       ((avg_throughput / first_value(avg_throughput) over (partition by brand, model order by firmware_version)) - 1) * 100 as diff_to_first_version,
       ((avg_throughput / lag(avg_throughput) over (partition by brand, model order by firmware_version)) - 1) * 100 as diff_to_prev_version
from temp_table
order by model desc, firmware_version;

SQLFiddle example: http://sqlfiddle.com/#!15/9746f/1

This is going to be faster than the solution using a self-join on the table.

Upvotes: 2

sylvian
sylvian

Reputation: 749

After some help here I came up with this query that does what I needed.

 SELECT
   v.test_type, v.brand, v.model, v.band, v.firmware_version, v.avg_throughput,
   ROUND((100 * v.avg_throughput / (CASE b.min_avg WHEN 0 THEN NULL ELSE b.min_avg END)) - 100::numeric, 2) AS percentage
FROM temp_table v
JOIN (SELECT DISTINCT ON (test_type, model) 
test_type, brand, model, band, firmware_version, avg_throughput AS min_avg
FROM temp_table
ORDER BY test_type, model, firmware_version) b
    ON v.test_type = b.test_type
    AND v.brand = b.brand
    AND v.model = b.model
    AND v.band = b.band;

Thanks everyone for all the help!

Upvotes: 0

Related Questions