neil_ruaro
neil_ruaro

Reputation: 506

How to get latest date in distinct city in PostgreSQL

I'm currently trying to do a task on PostgreSQL right now where I need to prepare a list of cities with their date of last reservation which also must include a hotel ID, and a photo.

Here's my SQL query thus far:

SELECT city.name, booking.booking_date as last_booking_date, hotel.id as hotel_id, hotel.photos ->> 0 as hotel_photo
FROM city 
INNER JOIN hotel
    ON city.id = hotel.city_id 
INNER JOIN booking 
    ON booking.hotel_id = hotel.id
WHERE 
    booking.booking_date = (SELECT MAX(booking_date) FROM booking WHERE hotel_id = hotel.id)
ORDER BY city.name ASC;

Now what this does is return the cities (sorted in alphabetical order, and can duplicate) which also contains their booking dates (also sorted from earliest to latest). If many hotels have the same amount of bookings, then I'll have to sort them by ID

Thanks to the answer below, and with some minor tweaks, I was able to return the distinct cities in alphabetical order, but not by their latest date for some cities.

SELECT distinct on (city.name) 
  city.name, 
  booking.booking_date as last_booking_date, 
  hotel.id as hotel_id, 
  hotel.photos ->> 0 as hotel_photo
FROM city 
INNER JOIN hotel
    ON city.id = hotel.city_id 
INNER JOIN booking 
    ON booking.hotel_id = hotel.id
WHERE 
    booking.booking_date = (SELECT MAX(booking_date) 
                            FROM booking
                            WHERE hotel_id = hotel.id)
    AND 
    booking.id = (SELECT MAX(id) FROM booking WHERE hotel_id = hotel.id)
ORDER BY city.name, hotel.id ASC, booking.booking_date ASC;

This has the following result, with the expected result shown as well. enter image description here

As you can see, some cities are not sorted by their latest date of booking. I've tried variations of the SQL query above such as:

SELECT distinct on (city.name) 
  city.name, 
  booking.booking_date as last_booking_date, 
  hotel.id as hotel_id, 
  hotel.photos ->> 0 as hotel_photo
FROM city 
INNER JOIN hotel
    ON city.id = hotel.city_id 
INNER JOIN booking 
    ON booking.hotel_id = hotel.id
WHERE 
    booking.booking_date = (SELECT MAX(booking_date) 
                            FROM booking
                            WHERE hotel_id = hotel.id)
    AND 
    booking.hotel_id = (SELECT MAX(hotel_id) FROM booking WHERE hotel_id = hotel.id)
ORDER BY city.name, hotel.id ASC, booking.booking_date ASC;

and

SELECT distinct on (city.name) 
  city.name, 
  booking.booking_date as last_booking_date, 
  hotel.id as hotel_id, 
  hotel.photos ->> 0 as hotel_photo
FROM city 
INNER JOIN hotel
    ON city.id = hotel.city_id 
INNER JOIN booking 
    ON booking.hotel_id = hotel.id
WHERE 
    booking.booking_date = (SELECT MAX(booking_date) 
                            FROM booking
                            WHERE hotel_id = hotel.id)
    AND 
    booking.id = (SELECT MAX(id) FROM booking WHERE hotel_id = hotel.id)
    AND 
    booking.hotel_id = (SELECT MAX(hotel_id) FROM booking WHERE hotel_id = hotel.id)
ORDER BY city.name, hotel.id ASC, booking.booking_date ASC;

But they all show the same result as the one earlier.

With that, how do I get the row with the latest date? For context, this is what the first query returns. enter image description here

Upvotes: 0

Views: 349

Answers (3)

Jay
Jay

Reputation: 1

I saw a similar problem the other day something like this

Write a query to retrieve the city name, the date of the most recent reservation in that city, the hotel ID of the most popular hotel (the one with the highest number of bookings), and a photo of that hotel for each city. The results should be sorted in ascending order by the city name. If two hotels have the same number of bookings, they should be sorted in ascending order by hotel ID.

Problem Statement

Database Schema

CREATE TABLE city
(
    id INT PRIMARY KEY,
    name VARCHAR(50) NOT NULL
);

CREATE TABLE hotel
(
    id INT PRIMARY KEY,
    city_id INT NOT NULL REFERENCES cityl,
    name VARCHAR(50) NOT NULL,
    day_price NUMERIC(8, 2) NOT NULL,
    photos JSONB DEFAULT '[]'
);


CREATE TABLE booking
(
    id int PRIMARY KEY,
    hotel_id INT NOT NULL REFERENCES hotel,
    booking_date DATE NOT NULL,
    start_date DATE NOT NULL,
    end_date DATE NOT NULL
);

DESCRIPTION

Your task is to prepare a list of cities with the date of last reservation made in the city and a main photo (photos[0]) of the most popular (by number of bookings) hotel in this city.

Sort results in ascending order by city. If many hotels have the same amount of bookings sort them by ID (ascending order). Remember that the query will also be run of different datasets.

Example Output

ex_output

Sample Data

Table: city

city_table

Table: hotel

hotel_table

Table: booking

booking_table

Expected output

output_table

Solution


with bookingCount as (
    select
        h.id hotel_id,
        h.city_id,
        h.photos ->> 0 AS hotel_photo,
        count(b.id) b_count
    from hotel h
    join booking b
    on h.id = b.hotel_id
    group by h.id
)
select
    sb1.name city_name,
    sb1.last_booking_date,
    sb2.hotel_id,
    hotel_photo
from
(
    select
        c.id city_id,
        c.name ,
        max(b.booking_date) last_booking_date
    from city c
    join hotel h
    on h.city_id = c.id
    join booking b
    on b.hotel_id = h.id
    group by c.id
) sb1
join
(
    select
        bc1.city_id,
        min(bc1.hotel_id) hotel_id,
        hotel_photo
    from
    bookingCount bc1
    where bc1.b_count = (
        SELECT MAX(b_count)
        FROM BookingCount bc2
        WHERE bc1.city_id = bc2.city_id
    )
    group by
        bc1.city_id,
        hotel_photo
) sb2
on sb2.city_id = sb1.city_id
order by
    city_name,
    hotel_id
;

Upvotes: 0

Maruskya
Maruskya

Reputation: 1

I believe, you do not need the WHEN subquery. Your ORDER BY statement is slightly off: it first sorts by city, then by hotel and only then choses a date. So, if you modify ORDER BY to be by city.name, date DESC and then hotel_id it should make the work.
Find the code below:

SELECT distinct on (city.name) 
  city.name, 
  booking.booking_date as last_booking_date, 
  hotel.id as hotel_id, 
  hotel.photos ->> 0 as hotel_photo
FROM city 
INNER JOIN hotel
    ON city.id = hotel.city_id 
INNER JOIN booking 
    ON booking.hotel_id = hotel.id
ORDER BY city.name, booking.booking_date DESC, hotel.id ASC;

Upvotes: 0

The Impaler
The Impaler

Reputation: 48810

Use DISTINCT ON. It will pick the first element for the subset, according to the ordering in the ORDER BY clause. For example:

SELECT distinct on (city.name) -- changed here
  city.name, 
  booking.booking_date as last_booking_date, 
  hotel.id as hotel_id, 
  hotel.photos ->> 0 as hotel_photo
FROM city 
INNER JOIN hotel
    ON city.id = hotel.city_id 
INNER JOIN booking 
    ON booking.hotel_id = hotel.id
WHERE 
    booking.booking_date = (SELECT MAX(booking_date) 
                            FROM booking
                            WHERE hotel_id = hotel.id)
ORDER BY city.name, booking.booking_date DESC; -- changed here

Upvotes: 1

Related Questions