Marian
Marian

Reputation: 4079

MySQL recursive cte base case

I have a table of events for a bunch of aggregates (order is one of them), that I can query like

SELECT * FROM events WHERE aggregate_id = :order_id AND aggregate_type = :order_type

This works fine for the simple case, but it gets a bit more involved with duplicate created event. One of the business requirements is that an order can be duplicated (conceptually, think move, not copy). This will close the original order, and then the staff can continue to work on the new order. In order to show the full history of an order, it would make sense to show not only the events for the current order, but also the events on any original orders.

Here's a simplified example:

Order 3

Order 2

Order 1

I have whipped up a rather simple query that works well for this case:

WITH RECURSIVE original_orders(order_id) AS (
  select aggregate_id
  from events
  where aggregate_type = 'order'
    and aggregate_id = :order_id
    and event_name = 'duplicate created'
  UNION ALL
  select body->'$.duplicatedFromOrderId'
  from original_orders
    inner join events on (aggregate_type = 'order' and aggregate_id = order_id and event_name = 'duplicate created')
)
SELECT events.*
FROM events
INNER JOIN original_orders ON aggregate_id = order_id;
-- extra context from the real (not simplified) use-case:
-- I actually use the CTE twice, because "notes" are stored separately
-- INNER JOIN users ON event.user_id - to get more info
-- UNION ALL SELECT * FROM notes JOIN original_orders ON order_id = notes.object_id

The recursive CTE ends up selecting order ids starting from the latest order (the base case part) and then recurses by getting the original order id from the event body. The result for the example would be a single column order_id with 3 rows: (3, 2, 1).

The problem is that if we look at the original order (Order 1 in this case), no events will be shown at all, because it doesn't have the duplicate created event. Thus, any order that isn't a duplicate selects nothing in the base case, so the CTE will return an empty result set.

I feel like my logic is somewhat flawed, and I'm missing the [obvious] simple way to get what I need. I've thought that in the base case I could instead search for ANY event name and GROUP BY id, thus getting an order_id in the base case query even if the order is not a duplicate (in a similar fashion I could use SELECT DISTINCT and ditch the GROUP BY but it's really the same thing). Like so:

WITH RECURSIVE original_orders(order_id) AS (
  select aggregate_id
  from events
  where aggregate_type = 'order'
    and aggregate_id = :order_id
  group by aggregate_id
  UNION ALL
  -- ...
)

This feels like a workaround, so I'd love to fix my logic. What am I missing?


Reproducible example:

create table events
(
    id             int auto_increment,
    aggregate_type varchar(100) not null,
    aggregate_id   varchar(255) not null,
    event_name     varchar(255) not null,
    occurred_on    datetime     not null,
    body           json         not null,
    constraint events_pk
        primary key (id)
);

insert into events (aggregate_type, aggregate_id, event_name, occurred_on, body)
values ('order', 1, 'created', '2020-05-05 09:00:00', json_object()),
       ('order', 1, 'closed', '2020-05-06 09:00:00', json_object()),
       ('order', 2, 'duplicate created', '2020-05-06 09:00:00', json_object('duplicatedFromOrderId', 1)),
       ('order', 2, 'received', '2020-05-07 09:00:00', json_object()),
       ('order', 2, 'closed', '2020-05-08 09:00:00', json_object()),
       ('order', 3, 'duplicate created', '2020-05-08 09:00:00', json_object('duplicatedFromOrderId', 2)),
       ('order', 3, 'shipped', '2020-05-09 09:00:00', json_object());

Running the query with :order_id = 3 should return all 7 events.

Running the query with :order_id = 2 should return 5 events (should ignore events for order 3).

Runnning the query with :order_id = 1 should return 2 events.

Upvotes: 0

Views: 160

Answers (1)

Marko
Marko

Reputation: 1000

WITH RECURSIVE cte as (
    select * from events
    where aggregate_id = :order_id
    and aggregate_type = 'order'
    union all
    select ev.* from cte 
    inner join events ev on ev.aggregate_id = cast(cte.body->'$.duplicatedFromOrderId' as UNSIGNED)
    where ev.aggregate_type = 'order'
)
select * from cte;

Upvotes: 2

Related Questions