Reputation: 5660

How to save state of a very complex and huge data processing?

Consider an implementation of the A* algorithm (for example: A* implementation). Assume the input graph is huge and a run takes long enough that I started thinking about failure recovery in case the code crashes partway through. Failures could be of any kind: software, hardware, etc. I am not looking for code, just a few pointers to common solutions for this kind of recovery problem.

Upvotes: 3

Answers (2)

Alexey Andreev

Reputation: 2085

There are several options:

You can rewrite your algorithm to support error recovery. For example, you can split it into tasks and submit these tasks to a queue. The main part of the algorithm then just takes tasks from the queue and executes them. During execution, tasks may submit additional tasks. To recover, you just need to re-execute the failed task.
Perform bytecode manipulation. Take a look at the Javaflow approach. You can suspend your code's execution at a certain point and resume it later. If something goes wrong, you just try resuming again from the last suspension point.
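
To make the first option concrete, here is a minimal sketch of a task queue that re-runs a failed task. It is only an illustration under assumptions: the names `RetryingTaskQueue`, `Task`, `submit`, `runAll` and the `maxAttempts` parameter are hypothetical and do not come from any library, and the queue lives in memory, so it only covers failures that leave the JVM running.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch: a work queue whose tasks are re-run when they fail. */
public class RetryingTaskQueue {

    /** A unit of work; it may submit follow-up tasks while it runs. */
    public interface Task {
        void run(RetryingTaskQueue queue) throws Exception;
    }

    private final Deque<Task> pending = new ArrayDeque<>();
    private final int maxAttempts;

    public RetryingTaskQueue(int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        this.maxAttempts = maxAttempts;
    }

    /** Adds a task to the end of the queue. */
    public void submit(Task task) {
        pending.addLast(task);
    }

    /** Runs tasks in FIFO order until the queue is empty. */
    public void runAll() throws Exception {
        while (!pending.isEmpty()) {
            runWithRetry(pending.pollFirst());
        }
    }

    /** Repeats a failing task up to maxAttempts times, then rethrows the last error. */
    private void runWithRetry(Task task) throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                task.run(this);
                return;
            } catch (Exception e) {
                last = e; // remember the failure and try the same task again
            }
        }
        throw last;
    }
}
```

Because a task may be executed more than once, it should be safe to repeat: for instance, it should submit its follow-up tasks only as its last step, otherwise a retry could enqueue them twice. To survive a process crash, the pending tasks would also have to be stored somewhere durable.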

Note that in some cases the problem lies in the algorithm implementation itself, so recovery is simply impossible. But when something goes wrong with an external component (for example, you store something in a database), retrying may help. For example, the database may be down, or there may be a write conflict with another transaction.

Upvotes: 3

Peter Lawrey

Reputation: 533740

When you have to guard against failure while working with a large dataset, the normal thing to use is a redundant database. If you have graph data, you might like to use neo4j, which now has a pretty interface but also supports redundancy and can be used embedded to minimise latency.

If you just need high-throughput persisted replication, Java Chronicle supports 5-20 million messages per second over TCP replication (up to the limit of your network bandwidth).

If none of the 150+ NoSQL databases suits your needs, you would still need to implement something like them: http://nosql-database.org/
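
As a rough idea of what the smallest home-grown version of that could look like, the sketch below periodically writes the algorithm's state to a checkpoint file and reads it back on restart. It is an assumption-laden illustration, not a replacement for a replicated database: the class `Checkpoints` and its methods `save` and `loadOrNull` are hypothetical names, it uses plain Java serialization, and it only protects against a process crash, not against losing the disk.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical sketch: file-based checkpoints using Java serialization. */
public class Checkpoints {

    /**
     * Writes the state to a temporary sibling file, then moves it over the
     * target, so a crash in the middle of a write does not leave a
     * half-written checkpoint behind.
     */
    public static void save(Path target, Serializable state) throws IOException {
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(tmp))) {
            out.writeObject(state);
        }
        // With ATOMIC_MOVE, whether an existing target is replaced is
        // implementation specific; on common platforms it is replaced.
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    }

    /** Returns the last saved state, or null if no checkpoint exists yet. */
    public static Object loadOrNull(Path target) throws IOException, ClassNotFoundException {
        if (!Files.exists(target)) {
            return null;
        }
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(target))) {
            return in.readObject();
        }
    }
}
```

For A*, the saved state would be whatever is needed to continue the search, for example the open set, the closed set and the best known costs, written every so many iterations. How often to checkpoint is a trade-off between write overhead and the amount of work lost after a crash.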

Upvotes: 2
