Reputation: 4624
We are working on an algorithm that calculates the optimal way to move resources from multiple points to a point X through variable routes. The process goes like this:
1) Get all the possible routes (a DB hit to fetch all the routes involved in the solution).
2) Get all the possible starting points.
3) Build a bidirectional graph combining all the routes.
-----foreach starting point----
4) Compute the k-shortest paths using the Hoffman-Pavley algorithm (we limit this to a certain number of paths, e.g. the first 10 shortest paths).
-----foreach path for the actual starting point-----
5) Evaluate the path, calculating how many resources we can carry from every route node to the destination.
6) Assign a score depending on the number of resources moved from each point and on the number of moves and trans-shipments (moving resources from one means of transport to another) involved in this possible solution.
-----END foreach path for the actual starting point-----
-----END foreach starting point----
7) Return the possible solutions, ordered by score.
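The steps above can be sketched roughly as follows, assuming the QuickGraph library (its `BidirectionalGraph` and `HoffmanPavleyRankedShortestPathAlgorithm` types); `Route`, `ScoredPath`, and `ScorePath` are hypothetical stand-ins for our domain types, not the real code:

```csharp
using System.Collections.Generic;
using System.Linq;
using QuickGraph;
using QuickGraph.Algorithms.RankedShortestPath;

public class Solver
{
    public IEnumerable<ScoredPath> Solve(IEnumerable<Route> routes,
                                         IEnumerable<int> startingPoints,
                                         int destination)
    {
        // 3) Build one bidirectional graph combining all the routes.
        var graph = new BidirectionalGraph<int, TaggedEdge<int, double>>();
        foreach (var r in routes)
            graph.AddVerticesAndEdge(new TaggedEdge<int, double>(r.From, r.To, r.Cost));

        var solutions = new List<ScoredPath>();
        foreach (var start in startingPoints)            // 4) per starting point
        {
            var kShortest =
                new HoffmanPavleyRankedShortestPathAlgorithm<int, TaggedEdge<int, double>>(
                    graph, e => e.Tag) { ShortestPathCount = 10 }; // first 10 paths only
            kShortest.Compute(start, destination);

            foreach (var path in kShortest.ComputedShortestPaths)  // 5) + 6)
                solutions.Add(ScorePath(path)); // resources moved, moves, trans-shipments
        }
        return solutions.OrderByDescending(s => s.Score);          // 7)
    }
}
```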
The first version of this logic took ~1 min to calculate solutions. In a second revision we found we had a lot of SELECT N+1 problems, so we optimized the queries (not all of them), and now each run takes ~3-10 sec, depending on the number of variables.
But now someone has suggested moving all that logic to Transact-SQL and letting SQL Server handle the calculation. He said that since all the data is already on the SQL Server, it will take the database less time to do the calculation, avoiding all the SELECT N+1 and lazy-loading problems. He is also concerned about concurrency: multiple users running this logic would bring down the app server, but he said SQL Server handles this kind of load very well.
My opinion: maybe we should try to optimize all the queries before porting 1500 lines of C# logic to Transact-SQL. Not to mention that for some calculations we use third-party libraries for the bidirectional graph and the Hoffman-Pavley algorithm that are not available in T-SQL; we would either have to find something already written in T-SQL or implement all that logic ourselves.
NOTE: we are using NHibernate as our ORM.
Upvotes: 1
Views: 542
Reputation: 611
I agree with "I would only consider moving the logic to the database as a last resort," written above.
Third-party libraries can be included in Transact-SQL if you use CLR assemblies, so that is not a problem.
From a resources perspective it is usually easier to scale out your application servers than your database server (replication?). So if tomorrow those calls grow to 10x or 50x today's volume, are we sure your database server will still handle the calculations, and everything else, in acceptable time?
From a performance perspective, you went from 1 min down to a few seconds just by optimizing the SQL. Even if you ran the non-optimized SQL in an SQL-only engine, it would still be slower than your optimized SQL running in that same engine.
I would suggest focusing on optimizing the SQL and the engine in C#. Those N+1 cases, I guess, are the backbone: you cannot fetch a record before you have finalized the previous one. Still, anything you can select in advance is a performance gain: you are better off getting your 10 records with 3 selects that return 1,000 records in total (filtering down to the 10 within C#) than with 10 selects returning a total of 10 records.
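A minimal illustration of that "select in advance" idea, assuming an NHibernate `ISession` and a hypothetical `Route` entity (names are mine, not the asker's): one broad query over-fetches in a single round trip, and the precise filtering moves into C#.

```csharp
using System.Collections.Generic;
using System.Linq;
using NHibernate;
using NHibernate.Linq;

public static class RouteLoader
{
    public static IList<Route> LoadCandidateRoutes(ISession session,
                                                   ICollection<int> regionIds)
    {
        // One round trip that may return ~1,000 rows ...
        var all = session.Query<Route>()
                         .Where(r => regionIds.Contains(r.RegionId))
                         .ToList();

        // ... then the fine-grained filtering happens in memory,
        // instead of issuing one SELECT per candidate path.
        return all.Where(r => r.IsActive).ToList();
    }
}
```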
Upvotes: 0
Reputation: 416059
Here's the deal:
Shifting logic to the database can often improve performance on complex reporting requirements such as yours. This is accomplished through better indexing of the data, such that much of the work (namely, sorting) is done for you at insert time.
Since the sorting work for the indexes you need is done at insert time, you end up with slower inserts and other write operations. This can be detrimental on a system that needs to do more than just your reports.
Additionally, at some point you'll want to think about how your app scales. When you do that, consider that your database server is likely already your most expensive server, as well as the most expensive server to upgrade. Licensing costs alone will make upgrading your database server less palatable to your budget manager. Databases are also typically harder to get working in a cluster. Compared to the database, adding web or application servers and getting them to work in a farm is a walk in the park. For these reasons, anything you can do to release performance pressure from your database is likely to improve the way your app will scale.
Upvotes: 1
Reputation: 24344
It's hard to provide insight on an optimization problem that is so general, but the statement:
"as all the data is already on the SQL Server it will take less time for the database to do all the calculation"
is not necessarily true. A straight port of your C# code to T-SQL will still run just as many queries, and they will take just as long to run if you don't change the logic at all. You will save the time it takes to transfer data between the SQL Server and the machine running the app, but is that the bottleneck, or is it the time it takes SQL Server to actually run all these queries? How big are the results of each of these queries?
The other question is whether T-SQL would be any faster at doing the calculations involved here, to the extent that they involve iterating through data in tables and doing something with that data. I doubt it. Depending on how much of the time is actual processing (rather than waiting for the database), it could even be slower.
Bottom line: it sounds like translating it would be a huge effort. If you are even remotely considering this approach, you should run a lot of tests to determine exactly where the time is going and see what you stand to gain, if anything.
Upvotes: 1
Reputation: 28728
I would only consider moving the logic to the database as a last resort.
A good guide is to keep set-based processing in the database and iterative processing in the application. You've got a number of foreach statements, and unless they can be flattened into set operations you are really going to suffer in the database world.
If this is the application of a business rule then it should be in the application layer, unless there are reasons to put it in the database.
It will take a lot of time to port your 1500 lines of code to T-SQL. You can use the .NET CLR if it's a recent version of MSSQL, but in my experience that's significantly slower than .NET on a Windows server.
It should be relatively simple to pull all your required data up front to avoid the N+1 selects; get everything you'll need and join it all into an appropriate object graph.
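One hedged way to pull the data up front with NHibernate's LINQ provider (entity and property names here are hypothetical): eager-fetch the child collections so that walking the object graph later triggers no lazy N+1 SELECTs.

```csharp
using System.Collections.Generic;
using System.Linq;
using NHibernate;
using NHibernate.Linq;

public static class EagerRouteLoader
{
    public static IList<Route> LoadRoutesWithSegments(ISession session,
                                                      int destinationId)
    {
        return session.Query<Route>()
                      .Where(r => r.DestinationId == destinationId)
                      .FetchMany(r => r.Segments)       // JOINed in, not lazy-loaded
                      .ThenFetch(s => s.TransportType)  // grandchild fetched too
                      .ToList();
    }
}
```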
Finally, it seems the first 4 steps are repeated for every request. It might make sense to select all the data and run those first four steps once, then keep the graph in memory, avoiding the significant upfront cost of retrieving and preprocessing everything for each request. This may not be possible, but it would remove the data-retrieval problem altogether.
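Keeping the preprocessed graph in memory could be as simple as the sketch below (assuming QuickGraph; cache invalidation when routes change is deliberately left out, and `LoadRoutesAndBuild` is a hypothetical placeholder for steps 1-3):

```csharp
using System;
using QuickGraph;

public static class GraphCache
{
    // Lazy<T> gives thread-safe, once-only initialization (available since .NET 4).
    private static readonly Lazy<BidirectionalGraph<int, Edge<int>>> Cached =
        new Lazy<BidirectionalGraph<int, Edge<int>>>(LoadRoutesAndBuild);

    public static BidirectionalGraph<int, Edge<int>> Get()
    {
        return Cached.Value; // steps 1-3 run once per process, not per request
    }

    private static BidirectionalGraph<int, Edge<int>> LoadRoutesAndBuild()
    {
        var graph = new BidirectionalGraph<int, Edge<int>>();
        // TODO: fetch all routes from the DB and AddVerticesAndEdge(...) here.
        return graph;
    }
}
```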
Upvotes: 2
Reputation: 4165
Moving logic to SQL may help, but it has its costs.
So my opinion is that you should try to optimize your queries before migrating all the logic to the database.
Upvotes: 2