Will Scott
Will Scott

Reputation: 77

Writing subqueries in SQL

I'm learning PostgreSQL and I have a database log that has these columns: path text, ip inet, method text, status text, time timestamptz, id int

Using this database my goal is to find any days that have a error percent (status != '200 OK' / status) of over 2.5 percent. I have the query, seen below, that gets to the error percent as a decimal but I can't figure out how to multiply them by 10 to make them a 'percent' nor how to check if they are over 2.5.

I have tried both HAVING and WHERE in multiple spots but I either get a Programming error at group by or a syntax error wherever I enter the HAVING or WHERE. What's the correct way to change it to a percent and check if its over 2.5? Thanks in advance

Mock data:

CREATE TABLE log (
    path text,
    ip inet,
    method text,
    status text,
    "time" timestamp with time zone DEFAULT now(),
    id integer NOT NULL
);

INSERT INTO log VALUES ('/article/bears-love-berries', '198.51.100.76' , 'GET', '200 OK', '2016-07-1 12:54:22+00', 3355597);
INSERT INTO log VALUES ('/article/candidate-is-jerk', '198.51.100.76' , 'GET', '200 OK', '2016-07-1 19:54:22+00', 3355598);
INSERT INTO log VALUES ('/article/bears-love-berries', '198.51.100.76' , 'GET', '404 NOT FOUND', '2016-07-1 22:54:22+00', 3355599);
INSERT INTO log VALUES ('/article/goats-eat-googles', '203.0.113.42' , 'GET', '200 OK', '2016-07-11 12:54:22+00', 3355600);
INSERT INTO log VALUES ('/article/candidate-is-jerk', '203.0.113.42' , 'GET', '200 OK', '2016-07-11 17:54:22+00', 3355601);
INSERT INTO log VALUES ('/article/bears-love-berries', '203.0.113.42' , 'GET', '200 OK', '2016-07-11 19:54:22+00', 3355602);
INSERT INTO log VALUES ('/article/goats-eat-googles', '203.0.113.42' , 'GET', '404 NOT FOUND', '2016-07-15 13:54:22+00', 3355603);
INSERT INTO log VALUES ('/article/bears-love-berries', '198.51.100.34' , 'GET', '200 OK', '2016-07-15 20:54:22+00', 3355604);
INSERT INTO log VALUES ('/article/candidate-is-jerk', '198.51.100.34' , 'GET', '200 OK', '2016-07-15 23:54:22+00', 3355605);
INSERT INTO log VALUES ('/article/candidate-is-jerk', '203.0.113.12' , 'GET', '200 OK', '2016-07-21 19:54:22+00', 3355606);
INSERT INTO log VALUES ('/article/bears-love-berries', '203.0.113.12' , 'GET', '200 OK', '2016-07-21 12:54:22+00', 3355607);
INSERT INTO log VALUES ('/article/goats-eat-googles', '198.51.100.76' , 'GET', '200 OK', '2016-07-21 01:54:22+00', 3355608);
INSERT INTO log VALUES ('/article/goats-eat-googles', '198.51.100.76' , 'GET', '404 NOT FOUND', '2016-07-29 07:54:22+00', 3355609);
INSERT INTO log VALUES ('/article/candidate-is-jerk', '198.51.100.76' , 'GET', '404 NOT FOUND', '2016-07-29 20:54:22+00', 3355610);
INSERT INTO log VALUES ('/article/goats-eat-googles', '198.51.100.34' , 'GET', '200 OK', '2016-07-29 19:54:22+00', 3355611);

Query:

SELECT DATE(time) AS day,
(
  (SELECT CAST(COUNT(status) AS FLOAT)
  FROM log WHERE status != '200 OK') /
  (SELECT CAST(COUNT(status) AS FLOAT))
) AS views
FROM log GROUP BY day
ORDER BY views

Upvotes: 1

Views: 84

Answers (1)

Avenger
Avenger

Reputation: 365

After sharing a test case in sqlfiddle, we could get down to a running code that would do it.

The fiddle is here: http://sqlfiddle.com/#!15/b3ed9/39

Instead of multiplicating the result by 100 you can just check against the already-multiplied 0.025, no? I mean, if you have values between 0-1, the equivalent to 2.5% would be just 0.025.

Thus that command should do:

select views.day, views.cnt from
 (select datelog.day,
  (
   cast(count(case when datelog.status != '200 OK' then 1 else null end) as float)
   /
   cast(count(*) as float)
  ) as cnt
  from (select DATE(time) as day, status from log) as datelog
  group by datelog.day
 ) as views
 where views.cnt > 0.025

The last revision was deeply founded in the reply here: SQL use column from subselect in where clause

Upvotes: 1

Related Questions