Brian Heward
Brian Heward

Reputation: 556

SQL slow outer apply trying to read from 2 sharded tables

I've got a list of contacts in 2 tables. My query will only ever return 1 row from 1 of the 2 tables, and based on that, the other data I want will be coming from specific other tables.

I've got a query that works, but it's taking over 30 seconds to run and I think it's because SQL is trying to load way more data than it needs to.

SELECT fe.Id as SubmitionId, 
       fe.Created, 
       case when ci1.contactId is null then ci0.contactId else ci1.contactId end as [ExperienceProfileContactId], 
       case when ci1.contactId is null then cfCustom0.FacetData else cfCustom1.FacetData end as CustomFacetData,
       case when ci1.contactId is null then cf0.FacetData else cf1.FacetData end as PersonalFacetData
from [sitecore_forms_storage].[FormEntries] fe
     outer apply (select top 1 contactId from [Sitecore_Xdb.Collection.Shard0].[xdb_collection].[ContactIdentifiers] cii0 where replace(fe.contactid,'-', '') = cii0.[Identifier]) as ci0
     outer apply (select top 1 contactId from [Sitecore_Xdb.Collection.Shard1].[xdb_collection].[ContactIdentifiers] cii1 where replace(fe.contactid,'-', '') = cii1.[Identifier]) as ci1
     left outer join [Sitecore_Xdb.Collection.Shard0].[xdb_collection].[ContactFacets] cf0 on ci0.ContactId = cf0.ContactId and cf0.FacetKey = 'Personal'
     left outer join [Sitecore_Xdb.Collection.Shard1].[xdb_collection].[ContactFacets] cf1 on ci1.ContactId = cf1.ContactId and cf1.FacetKey = 'Personal'
     left outer join [Sitecore_Xdb.Collection.Shard0].[xdb_collection].[ContactFacets] cfCustom0 on ci0.ContactId = cfCustom0.ContactId and cfCustom0.FacetKey = 'CustomData'
     left outer join [Sitecore_Xdb.Collection.Shard1].[xdb_collection].[ContactFacets] cfCustom1 on ci1.ContactId = cfCustom1.ContactId and cfCustom1.FacetKey = 'CustomData'
WHERE FormDefinitionId = '19847d65-971d-4aa9-bc11-7c0c97fc4e0f'

The FormEntries table will return about 10 rows based on the where clause at the end. The following query returns in less than a second:

SELECT * 
FROM [sitecore_forms_storage].[FormEntries] fe
WHERE FormDefinitionId = '19847d65-971d-4aa9-bc11-7c0c97fc4e0f'

The following SQL returns in less than a second and is an example of what the outer apply should be doing:

SELECT TOP 1 contactId 
FROM [Sitecore_Xdb.Collection.Shard1].[xdb_collection].[ContactIdentifiers] cii0 
WHERE '03CB66D16C3842FD94A2E75C4B0A26BA' = cii0.[Identifier]

The identifier will only be in shard0 or shard1's ContactIdentifiers table.

If the identifier is in shard1's ContactIdentifiers table, then I need to look at shard1's ContactFacets table. Otherwise look at shard0's ContactFacets table. This logic means I don't really need to outer join the other shard's ContactFacets table... but I'm not sure how to conditionally include it.

Retrieving the facets is also fast as the following query should be run once per row of the overall query and a similar query for the other facet, (i.e. about 20 times.) It also returns basically instantly:

SELECT FacetData
FROM [Sitecore_Xdb.Collection.Shard1].[xdb_collection].[ContactFacets]
WHERE ContactId = '526104FA-A5CF-0000-0000-06EC10C433D8'
  AND FacetKey = 'Personal'

It's also possible that the contact isn't in either shard. In that case I just want nulls in the last 3 columns and don't need to look at either ContactFacets table.

Is there a way to speed up this query?Query Plan

Upvotes: 0

Views: 45

Answers (1)

Brian Heward
Brian Heward

Reputation: 556

This is not the solution I was looking for, but based off the comments it was a massive improvement in speed.

I'd still be interested in finding a way to only join tables if another outer join/apply returned a value though.

I changed the first lines of my FROM clause to the following to force the REPLACE function to run on the smaller data set as [FormEntries] only had 10 rows and both [ContactIdentifiers] tables had thousands of rows:

from (select Id, FormDefinitionId, Created, replace(contactid,'-', '') as contactid
      from [sitecore_forms_storage].[FormEntries]) fe
     outer apply (select top 1 contactId from [Sitecore_Xdb.Collection.Shard0].[xdb_collection].[ContactIdentifiers] cii0 where fe.contactid = cii0.[Identifier]) as ci0
     outer apply (select top 1 contactId from [Sitecore_Xdb.Collection.Shard1].[xdb_collection].[ContactIdentifiers] cii1 where fe.contactid = cii1.[Identifier]) as ci1

Upvotes: 0

Related Questions