Jamie Marshall

Reputation: 2304

Get specific columns from join without join syntax?

Is there another way to write this?

SELECT src.ID, factDeviceBuild.ID
    FROM #factDeviceBuild as src
    INNER JOIN AppsFlyer.FactDeviceBuild AS factDeviceBuild
    ON src.[DimDevice_Id] = factDeviceBuild.[DimDevice_Id] AND
        src.[DimDeviceModel_Id] = factDeviceBuild.[DimDeviceModel_Id] AND
        src.[DimPlatform_Id] = factDeviceBuild.[DimPlatform_Id] AND
        src.[DimOSVersion_Id] = factDeviceBuild.[DimOSVersion_Id] AND
        src.[DimSDKVersion_Id] = factDeviceBuild.[DimSDKVersion_Id] AND
        src.[DimCarrier_Id] = factDeviceBuild.[DimCarrier_Id] AND
        src.[DimOperator_Id] = factDeviceBuild.[DimOperator_Id]

I've been trying some different things that don't work, like this:

SELECT *, factDeviceBuild.ID
    FROM #factDeviceBuild
    WHERE EXISTS (
        SELECT [DimDevice_Id], [DimDeviceModel_Id], [DimPlatform_Id],
            [DimOSVersion_Id], [DimSDKVersion_Id], [DimCarrier_Id],
            [DimOperator_Id]
        FROM AppsFlyer.FactDeviceBuild AS factDeviceBuild
        )

or like this (also doesn't work):

SELECT factDeviceBuild.ID, 
        factDeviceBuild.[ID]
    FROM (
        SELECT [DimDevice_Id], [DimDeviceModel_Id], [DimPlatform_Id],
            [DimOSVersion_Id], [DimSDKVersion_Id], [DimCarrier_Id],
            [DimOperator_Id]
        FROM AppsFlyer.FactDeviceBuild AS factDeviceBuild
        INTERSECT
        SELECT [DimDevice_Id], [DimDeviceModel_Id], [DimPlatform_Id],
            [DimOSVersion_Id], [DimSDKVersion_Id], [DimCarrier_Id],
            [DimOperator_Id]
        FROM AppsFlyer.#factDeviceBuild AS factDeviceBuild
    ) AS A

I'm just playing around with some query tuning. EXCEPT and INTERSECT are particularly interesting because of the way they treat NULLs.

Obviously I could use a CROSS JOIN or OUTER JOIN to construct my INNER JOIN from scratch, but I don't see a particular gain there.

Upvotes: 0

Views: 58

Answers (3)

Paul Maxwell

Reputation: 35603

Without either data or a visualization of the expected result, my guess is that you need to "unpivot" the 7 id types into fewer columns, which reduces the complexity of the join syntax, e.g.:

select
     src.id, f.fact_id, ca.id_type, ca.id_value
from #factDeviceBuild as src
cross apply (
    values
       ('DimDevice_Id',src.[DimDevice_Id])
      ,('DimDeviceModel_Id',src.[DimDeviceModel_Id])
      ,('DimPlatform_Id',src.[DimPlatform_Id])
      ,('DimOSVersion_Id',src.[DimOSVersion_Id])
      ,('DimSDKVersion_Id',src.[DimSDKVersion_Id])
      ,('DimCarrier_Id',src.[DimCarrier_Id])
      ,('DimOperator_Id',src.[DimOperator_Id])
    ) ca (id_type, id_value)
inner join (
    select
         fact.id fact_id, ca.id_type, ca.id_value
    from AppsFlyer.FactDeviceBuild AS fact
    cross apply (
        values
           ('DimDevice_Id',fact.[DimDevice_Id])
          ,('DimDeviceModel_Id',fact.[DimDeviceModel_Id])
          ,('DimPlatform_Id',fact.[DimPlatform_Id])
          ,('DimOSVersion_Id',fact.[DimOSVersion_Id])
          ,('DimSDKVersion_Id',fact.[DimSDKVersion_Id])
          ,('DimCarrier_Id',fact.[DimCarrier_Id])
          ,('DimOperator_Id',fact.[DimOperator_Id])
        ) ca (id_type, id_value)
    where ca.id_value IS NOT NULL
    ) as f on ca.id_type = f.id_type and ca.id_value = f.id_value

Note that I have not used the UNPIVOT feature of T-SQL, as I prefer the syntax you see above. There is NO additional performance disadvantage when using this apply/values syntax.

NB: all 7 of those id type columns must be "compatible" data types for the "unpivot" to work without error. All 7 as integer for example, which would make the id_value a column of integers.
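
As a minimal sketch of the data type point above: when the columns being unpivoted do not share a type, one option is to cast them to a common type inside the VALUES list. The table variable @t and the column DimCarrier_Code are hypothetical and only there to make the example self-contained; they are not part of the original schema.

```sql
-- Hypothetical sample: one int id column and one varchar column.
DECLARE @t TABLE (ID int, DimDevice_Id int, DimCarrier_Code varchar(20));
INSERT INTO @t VALUES (1, 42, 'ABC');

-- Without the CAST, id_value would be typed as int (the higher-precedence
-- type) and converting 'ABC' to int would raise an error.
SELECT t.ID, ca.id_type, ca.id_value
FROM @t AS t
CROSS APPLY (
    VALUES
        ('DimDevice_Id',    CAST(t.DimDevice_Id AS varchar(20))),
        ('DimCarrier_Code', t.DimCarrier_Code)
) AS ca (id_type, id_value);
```

If the values are compared across tables afterwards, both sides would need the same cast so that the comparison happens on the same type.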

Upvotes: 0

Razvan Socol

Reputation: 5694

I believe you are looking for something like this:

SELECT src.ID, fact.ID
FROM #factDeviceBuild as src
INNER JOIN AppsFlyer.FactDeviceBuild AS fact
ON EXISTS (
    SELECT src.DimDevice_Id, src.DimDeviceModel_Id, src.DimPlatform_Id,
        src.DimOSVersion_Id, src.DimSDKVersion_Id, src.DimCarrier_Id,
        src.DimOperator_Id
    INTERSECT
    SELECT fact.DimDevice_Id, fact.DimDeviceModel_Id, fact.DimPlatform_Id,
        fact.DimOSVersion_Id, fact.DimSDKVersion_Id, fact.DimCarrier_Id,
        fact.DimOperator_Id
)

Using this INTERSECT syntax (instead of the usual equality conditions) has the advantage of treating NULLs as equal to each other. For example, if only the DimCarrier_Id and DimOperator_Id columns allowed NULLs, the equivalent condition would need to be:

SELECT src.ID, fact.ID
FROM #factDeviceBuild as src
INNER JOIN AppsFlyer.FactDeviceBuild AS fact
ON src.DimDevice_Id = fact.DimDevice_Id AND
    src.DimDeviceModel_Id = fact.DimDeviceModel_Id AND
    src.DimPlatform_Id = fact.DimPlatform_Id AND
    src.DimOSVersion_Id = fact.DimOSVersion_Id AND
    src.DimSDKVersion_Id = fact.DimSDKVersion_Id AND
    (src.DimCarrier_Id = fact.DimCarrier_Id OR (src.DimCarrier_Id IS NULL AND fact.DimCarrier_Id IS NULL)) AND
    (src.DimOperator_Id = fact.DimOperator_Id OR (src.DimOperator_Id IS NULL AND fact.DimOperator_Id IS NULL))
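
To see the difference in NULL handling in isolation, here is a small sketch that can be run on its own. The table variables @src and @fact and the columns a and b are made up for illustration; they stand in for the temp table, the fact table and two of the dimension columns.

```sql
-- Hypothetical two-row samples; b is NULL in the first row of each table.
DECLARE @src  TABLE (ID int, a int, b int);
DECLARE @fact TABLE (ID int, a int, b int);
INSERT INTO @src  VALUES (1, 10, NULL), (2, 20, 5);
INSERT INTO @fact VALUES (100, 10, NULL), (200, 20, 5);

-- Plain equality: NULL = NULL evaluates to UNKNOWN,
-- so only the (20, 5) rows are paired up.
SELECT s.ID AS src_id, f.ID AS fact_id
FROM @src AS s
INNER JOIN @fact AS f
    ON s.a = f.a AND s.b = f.b;

-- INTERSECT comparison: NULLs are treated as equal,
-- so the (10, NULL) rows are paired up as well.
SELECT s.ID AS src_id, f.ID AS fact_id
FROM @src AS s
INNER JOIN @fact AS f
    ON EXISTS (SELECT s.a, s.b INTERSECT SELECT f.a, f.b);
```

The first query should return only the pair (2, 200), while the second should return both (1, 100) and (2, 200).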

Upvotes: 2

Eralper

Reputation: 6622

The following is equivalent; it uses the old-style comma join with the conditions moved to the WHERE clause:

SELECT src.ID, factDeviceBuild.ID
    FROM #factDeviceBuild as src, AppsFlyer.FactDeviceBuild AS factDeviceBuild
    WHERE
        src.[DimDevice_Id] = factDeviceBuild.[DimDevice_Id] AND
        src.[DimDeviceModel_Id] = factDeviceBuild.[DimDeviceModel_Id] AND
        src.[DimPlatform_Id] = factDeviceBuild.[DimPlatform_Id] AND
        src.[DimOSVersion_Id] = factDeviceBuild.[DimOSVersion_Id] AND
        src.[DimSDKVersion_Id] = factDeviceBuild.[DimSDKVersion_Id] AND
        src.[DimCarrier_Id] = factDeviceBuild.[DimCarrier_Id] AND
        src.[DimOperator_Id] = factDeviceBuild.[DimOperator_Id]

Upvotes: 0
