Reputation: 25
We ran some load tests on a Saga with an In-Memory outbox. During those tests we simulated different types of failures: application restarts, infrastructure restarts, message broker restarts, etc.
We noticed that some saga instances did not finish, and we had a bunch of errors: Automatonymous.NotAcceptedStateMachineException: ... {SomeEvent}: Not accepted in state {SomeState}
After some debugging we isolated the problem. I'll try to describe it using this sample code:
public class OrderStateMachine : MassTransitStateMachine<Order>
{
    public OrderStateMachine()
    {
        InstanceState(x => x.CurrentState);

        During(Initial,
            When(Create).TransitionTo(New));

        During(New,
            When(AddItem)
                .Then(x => x.Instance.Items.Add(x.Data.Name)),
            When(Submit)
                .ThenAsync(async x =>
                {
                    // do something
                    await x.Publish(new SendEmail {Text = $"Order submitted. {x.Instance.Summary}"});
                })
                .TransitionTo(Submitted));

        During(Submitted,
            When(Accept)
                .ThenAsync(async x =>
                {
                    // do something
                    await x.Publish(new SendEmail {Text = $"Order accepted. {x.Instance.Summary}"});
                })
                .Finalize());

        SetCompletedWhenFinalized();
    }

    public State New { get; private set; }
    public State Submitted { get; private set; }
    public Event<Create> Create { get; private set; }
    public Event<AddItem> AddItem { get; private set; }
    public Event<Submit> Submit { get; private set; }
    public Event<Accept> Accept { get; private set; }
}
public class Order : SagaStateMachineInstance
{
    public Guid CorrelationId { get; set; }
    public string CurrentState { get; set; }
    public IList<string> Items { get; set; } = new List<string>();
    public string Summary => $"Items: {string.Join(", ", Items)}";
}

public class Create : CorrelatedBy<Guid>
{
    public Guid CorrelationId { get; set; }
}

public class AddItem : CorrelatedBy<Guid>
{
    public Guid CorrelationId { get; set; }
    public string Name { get; set; }
}

public class Submit : CorrelatedBy<Guid>
{
    public Guid CorrelationId { get; set; }
}

public class Accept : CorrelatedBy<Guid>
{
    public Guid CorrelationId { get; set; }
}

public class SendEmail
{
    public string Text { get; set; }
}
This is what happens:
What if this happens in state Submitted during Accept event handling? My assumption:
What is the best way to handle situations like this? I've read Chris's great article about the In-Memory Outbox, but I don't understand how the message can be handled during redelivery when the Saga is in a state where it no longer handles that message. Of course we can handle the redelivered event in the next state with some tricky logic, but that seems pretty cumbersome. Our Saga is much more complex than the provided sample.
Maybe a transaction that commits after all messages from the outbox have been sent would be a solution? Can the Transaction Outbox be somehow configured with a Saga?
Upvotes: 2
Views: 1476
Reputation: 33268
Since you've read the article on using the outbox, and you realize that you need to add a handler for Submit to the Submitted state, that's really the answer. However, unlike the original handler, which updated the saga state and was persisted, you only need to regenerate the events that were sent/published. That handles the first part of the problem, Submitted.
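A minimal sketch of that first part, assuming the same Automatonymous-style API and the same Order, Create, Submit and SendEmail types as in the question. The class name OrderStateMachineWithRedelivery and the SubmittedEmail helper are hypothetical, and the AddItem handling is left out for brevity. The extra When(Submit) in Submitted does not change the saga; it only publishes the same SendEmail again:

```csharp
using Automatonymous;
using MassTransit;

// Sketch only: Order, Create, Submit and SendEmail are the types from the question.
public class OrderStateMachineWithRedelivery : MassTransitStateMachine<Order>
{
    public OrderStateMachineWithRedelivery()
    {
        InstanceState(x => x.CurrentState);

        During(Initial,
            When(Create).TransitionTo(New));

        During(New,
            // First delivery: do the business logic, publish, then transition.
            When(Submit)
                .ThenAsync(async x =>
                {
                    // do something
                    await x.Publish(SubmittedEmail(x.Instance));
                })
                .TransitionTo(Submitted));

        During(Submitted,
            // Redelivered Submit: the state was already persisted, so no
            // business logic and no transition here, only the publish again.
            When(Submit)
                .ThenAsync(async x => await x.Publish(SubmittedEmail(x.Instance))));
    }

    // Hypothetical helper so both handlers build the message the same way.
    static SendEmail SubmittedEmail(Order order) =>
        new SendEmail { Text = $"Order submitted. {order.Summary}" };

    public State New { get; private set; }
    public State Submitted { get; private set; }
    public Event<Create> Create { get; private set; }
    public Event<Submit> Submit { get; private set; }
}
```

Because the second handler cannot tell whether the first email actually went out, whatever consumes SendEmail would presumably need to tolerate an occasional duplicate.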
The second part is a different answer, and it is pretty simple actually. You don't finalize the order in Accept. You create an additional state, Accepted, that the order transitions to after being accepted. And you remove the order instances after a period of time (a week, a month, whatever). That way, when the Accept message is delivered to the Accepted instance, you can regenerate the events that were published.
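The Accepted state could look roughly like the sketch below. It assumes the same Automatonymous-style API and the question's Order, Create, Submit, Accept and SendEmail types; the class name OrderStateMachineWithAcceptedState and the AcceptedEmail helper are hypothetical, and the AddItem handling and the submit email are omitted to keep it short. Nothing finalizes the instance here, so removing old instances is assumed to happen outside the state machine:

```csharp
using Automatonymous;
using MassTransit;

// Sketch only: Order, Create, Submit, Accept and SendEmail are the types from the question.
public class OrderStateMachineWithAcceptedState : MassTransitStateMachine<Order>
{
    public OrderStateMachineWithAcceptedState()
    {
        InstanceState(x => x.CurrentState);

        During(Initial,
            When(Create).TransitionTo(New));

        During(New,
            When(Submit).TransitionTo(Submitted));

        During(Submitted,
            // First delivery: do the business logic, publish, and move to
            // Accepted instead of calling Finalize().
            When(Accept)
                .ThenAsync(async x =>
                {
                    // do something
                    await x.Publish(AcceptedEmail(x.Instance));
                })
                .TransitionTo(Accepted));

        During(Accepted,
            // Redelivered Accept: the instance still exists, so the
            // published message can be regenerated without other changes.
            When(Accept)
                .ThenAsync(async x => await x.Publish(AcceptedEmail(x.Instance))));
    }

    // Hypothetical helper so both handlers build the message the same way.
    static SendEmail AcceptedEmail(Order order) =>
        new SendEmail { Text = $"Order accepted. {order.Summary}" };

    public State New { get; private set; }
    public State Submitted { get; private set; }
    public State Accepted { get; private set; }
    public Event<Create> Create { get; private set; }
    public Event<Submit> Submit { get; private set; }
    public Event<Accept> Accept { get; private set; }
}
```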
Now, you could use Quartz to schedule a future message to finalize the saga, which doesn't do any business logic, but only removes the saga instance. And you could set up an Initially(When(RemoveOrder).Ignore()) handler that would discard the remove order message if the saga doesn't exist. And that makes it automatic. But in past systems we've just archived a date range partition of the file group (in SQL Server) or deleted the older records after 30 or 90 days or whatever.
Upvotes: 2