Sergej Kosov
Sergej Kosov

Reputation: 128

How to find neighboring records in the SQL table in terms of month and year?

Please help me to optimize my SQL query.

I have a table with the fields: date, commodity_id, exp_month_id, exp_year, price, where the first 4 fields are the primary key. The months are designated with the alphabet-ordered letters: e.g. F (for Jan), G (for Feb.), H (for March), etc. Thus the letter of more distant from Jan. month will be larger than the letter of the less distant month (F < G < H < ...). Some commodity_ids have all 12 months in the table, some only 5 or 3, which are constant for all years.

I need to calculate the difference between prices (gradient) of the neighboring records in terms of exp_month_id, exp_year. As the first step, I want to define for every couple (exp_month_id, exp_year) the valid couple (next_month_id, next_year). The main problem here, that if the current exp_month_id is the last in the year, then next_year = exp_year + 1 and next_month_id should be the first one in the year.

I have written the following query to do the job:

WITH trading_months AS (
    SELECT DISTINCT commodity_id,
                    exp_month_id
      FROM futures
     ORDER BY exp_month_id
)
SELECT DISTINCT f.commodity_id,
                f.exp_month_id,
                f.exp_year,
                (
                WITH [temp] AS (
                        SELECT exp_month_id
                          FROM trading_months
                         WHERE commodity_id = f.commodity_id
                    )
                    SELECT exp_month_id
                      FROM [temp]
                     WHERE exp_month_id > f.exp_month_id
                    UNION ALL
                    SELECT exp_month_id
                      FROM [temp]
                     LIMIT 1
                )
                AS next_month_id,
                (
                    SELECT CASE WHEN EXISTS (
                                   SELECT commodity_id,
                                          exp_month_id
                                     FROM trading_months
                                    WHERE commodity_id = f.commodity_id AND 
                                          exp_month_id > f.exp_month_id
                                    LIMIT 1
                               )
                           THEN f.exp_year ELSE f.exp_year + 1 END
                )
                AS next_year
  FROM futures AS f

This query serves as a base for a dynamic table (view) which is subsequently used for calculating the gradient. However, the execution of this query takes more than one second and thus the whole process takes minutes. I wonder if you could help me optimizing the query.

Upvotes: 0

Views: 66

Answers (2)

Sergej Kosov
Sergej Kosov

Reputation: 128

Thanks to the hint of @Shawn to use window functions I could rewrite the query in much shorter form:

CREATE VIEW "futures_nextmonths_win" AS
WITH trading_months AS (
    SELECT DISTINCT commodity_id,
                    exp_month_id,
                    exp_year
    FROM futures)
SELECT commodity_id,
       exp_month_id,
       exp_year,
       lead(exp_month_id) OVER w AS next_month_id,
       lead(exp_year) OVER w AS next_year
FROM trading_months
WINDOW w AS (PARTITION BY commodity_id ORDER BY exp_year, exp_month_id);

which is also slightly faster then the original one.

Upvotes: 0

Shawn
Shawn

Reputation: 52569

Note: The following requires Sqlite 3.25 or newer for window function support:

Lack of sample data (Preferably as a CREATE TABLE and INSERT statements for easy importing) and expected results makes it hard to test, but if your end goal is computing the difference in prices between expiration dates (Making your question a bit of an XY problem, maybe something like:

SELECT date, commodity_id, price, exp_year, exp_month_id
     , price - lag(price, 1) OVER (PARTITION BY commodity_id ORDER BY exp_year, exp_month_id) AS "change from last price"
FROM futures;

Upvotes: 1

Related Questions