Corssica
Corssica

Reputation: 1

How to query multiple, unrelated tables in parallel in SQL Server?

I am given a list of table names and would like to get the distinct values of a specific column from each of these tables. For example:

Input = TableA, TableD, TableX

Output =

Table Name   |    Value

TableA       |    1

TableA       |    2

TableD       |    1

TableX       |    3

Since these tables are unrelated, it would be great if I could query them in parallel to save some time. I think querying them in one large UNION statement will let the SQL Server engine automatically parallelize the query, but I'd like to avoid this. Some of these tables are large, and it's possible querying them might fail on occasion due to memory/resource issues. So if I did a UNION, one table failing will result in the entire query failing.

Is my only option to handle this on the application side? (i.e make parallel calls to the same sproc for each table in the input list)

Thanks for any help and please let me know if the goal is unclear.

Upvotes: 0

Views: 451

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269703

This is too long for a comment.

You are worried about SQL Server running out of memory but not your application? You seem to have a limited understanding of databases and resources. Or, you have a very strange environment, with very few resources on the database server and lots of resources on the application side.

Technically, you seem to want:

select distinct 'tableA' as tablename, value from tableA union all
select distinct 'tableB' as tablename, value from tableB union all
select distinct 'tableC' as tablename, value from tableC . . .;

It is not possible for different subqueries to have conflicting rows, because the table name distinguishes among them. Hence, don't incur the overhead of duplicate removal by using union.

This version of the query can also take advantage of indexes . . . on tableA(value), and so on. That might be very convenient.

As far as I know, SQL Server does not run union all queries in parallel. This is unfortunate. So, it is possible that a multi-threaded application that sends each query independently to the database would have better performance. However, constructing such an application can be tricky, when you have to wait for all threads to complete in order to combine results.

Upvotes: 1

Related Questions