Mr doubt
Mr doubt

Reputation: 65

SQL query to obtain counts from two tables based on values and field names

I want to count the alerts of the candidates based on district.

Below is the district-wise alert lookup table

Table_LKP_AlertMastInfo

DistrictID             FieldName              AlertOptionValue  
  71                    AreYouMarried                 Yes
  71                      Gender                      Female
  72                    AreYouMarried                 Yes

The above Table_LKP_AlertMastInfo FieldName should compare with table_RegistrationInfo fields to check the AlertOptionValue to get counts.

Below is the candidate details table:

Table_RegistrationInfo

CandidateId    DistrictID     AreYouMarried     Gender  
 Can001            71             Yes            Female
 Can002            71             No             Female
 Can003            72             Yes            Man  
 Can004            72             No             Man    

I want output like below:

Can001   2
Can002   1
Can003   1

Explanation of the above output counts:

Can001 have selected AreYouMarried:Yes and Gender:Female then count value 2
Can002 have selected  Gender:Female then count value   1
Can003 have selected AreYouMarried:Yes then count value   1
Can004 have not alerts 

Upvotes: 5

Views: 676

Answers (6)

user10223713
user10223713

Reputation:

Not tested but this should do the trick:

SELECT      CandidateId,
  ( CASE 
    WHEN AreYouMarried = "Yes" AND Gender = 'Female' THEN 2 
    WHEN Gender = 'Female' THEN 1
    WHEN AreYouMarried = "Yes" THEN 1
    ELSE 0 END 
  ) as CandidateValue

  FROM 
    (SELECT * FROM Table_LKP_AlertMastInfo) as Alert
  LEFT JOIN 
    (SELECT * FROM Table_RegistrationInfo) as Registration
  ON (Alert.DistrictID = Registration.DistrictID);

This should give you a list with candidateId matching the condition count

Upvotes: 0

MarcinJ
MarcinJ

Reputation: 3639

This won't be possible without dynamic SQL if your data is modeled like it is, i.e. key-value pairs in Table_LKP_AlertMastInfo and columns in Table_RegistrationInfo. So with that out of our way, let's do it. Full code to the stored procedure providing the exact results you need is at the end, I'll follow with the explanation on what it does.

Because the alerts are specified as key-value pairs (field name - field value), we'll first need to get the candidate data in the same format. UNPIVOT can fix this right up, if we can get it the list of the fields. Had we only had only the two fields you mention in the question, it would be rather easy, something like:

SELECT CandidateId, DistrictID
     , FieldName
     , FieldValue
  FROM Table_RegistrationInfo t
  UNPIVOT (FieldValue FOR FieldName IN (AreYouMarried, Gender)) upvt

Of course that's not the case, so we'll need to dynamically select the list of the fields we're interested in and provide that. Since you're on 2008 R2, STRING_AGG is not yet available, so we'll use the XML trick to aggregate all the fields into a single string and provide it to the query above.

DECLARE @sql NVARCHAR(MAX)
SELECT @sql = CONCAT('SELECT CandidateId, DistrictID
     , FieldName
     , FieldValue
  FROM Table_RegistrationInfo t
  UNPIVOT (FieldValue FOR FieldName IN (',
    STUFF((
          SELECT DISTINCT ',' + ami.FieldName
          FROM Table_LKP_AlertMastInfo ami
          FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 1, ''), ')) upvt')

PRINT @sql

This produces almost the exact output as the query I wrote. Next, we need to store this data somewhere. Temporary tables to the rescue. Let's create one and insert into it using this dynamic SQL.

CREATE TABLE #candidateFields
(
    CandidateID VARCHAR(50),
    DistrictID  INT,
    FieldName   NVARCHAR(200),
    FieldValue  NVARCHAR(1000)
);

INSERT INTO #candidateFields
EXEC sp_executesql @sql

-- (8 rows affected)
-- We could index this for good measure
CREATE UNIQUE CLUSTERED INDEX uxc#candidateFields on #candidateFields
(
     CandidateId, DistrictId, FieldName, FieldValue
);

Great, with that out of the way, we now have both data sets - alerts and candidate data - in the same format. It's a matter of joining to find matches between both:

SELECT cf.CandidateID, COUNT(*) AS matches
  FROM #candidateFields cf
 INNER
  JOIN Table_LKP_AlertMastInfo alerts
    ON alerts.DistrictID = cf.DistrictID
   AND alerts.FieldName = cf.FieldName
   AND alerts.AlertOptionValue = cf.FieldValue
 GROUP BY cf.CandidateID

Provides the desired output for the sample data:

CandidateID                                        matches
-------------------------------------------------- -----------
Can001                                             2
Can002                                             1
Can003                                             1

(3 rows affected)

So we can stitch all that together now to form a reusable stored procedure:

CREATE PROCEDURE dbo.findMatches
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @sql NVARCHAR(MAX)
    SELECT @sql = CONCAT('SELECT CandidateId, DistrictID
         , FieldName
         , FieldValue
      FROM Table_RegistrationInfo t
      UNPIVOT (FieldValue FOR FieldName IN (',
        STUFF((
              SELECT DISTINCT ',' + ami.FieldName
              FROM Table_LKP_AlertMastInfo ami
              FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 1, ''), ')) upvt')

    CREATE TABLE #candidateFields
    (
        CandidateID VARCHAR(50),
        DistrictID  INT,
        FieldName   NVARCHAR(200),
        FieldValue  NVARCHAR(1000)
    );

    INSERT INTO #candidateFields
    EXEC sp_executesql @sql


    CREATE UNIQUE CLUSTERED INDEX uxc#candidateFields on #candidateFields
    (
         CandidateId, DistrictId, FieldName
    );

    SELECT cf.CandidateID, COUNT(*) AS matches
      FROM #candidateFields cf
      JOIN Table_LKP_AlertMastInfo alerts
        ON alerts.DistrictID = cf.DistrictID
       AND alerts.FieldName = cf.FieldName
       AND alerts.AlertOptionValue = cf.FieldValue
     GROUP BY cf.CandidateID

END;

Execute with

EXEC dbo.findMatches

You'd of course need to adjust types and probably add a bunch of other things here, like error handling, but this should get you started on the right path. You'll want a covering index on that alert table and it should be pretty fast even with a lot of records.

Upvotes: 6

Nagashree Hs
Nagashree Hs

Reputation: 853

I am not sure if this can be completely done using SQL. If you are using some backend technology such as ADO.NET, then you can store the results in Datatables. Loop through the column names and do the comparison.

Dynamic SQL can be used to make Table_LKP_AlertMastInfo look like Table_RegistrationInfo. This script can be used in a stored procedure and results can be retrieved in a Datatable.

DECLARE @SQL NVARCHAR(MAX)

DECLARE @PivotFieldNameList nvarchar(MAX)

SET @SQL = ''
SET @PivotFieldNameList = ''
SELECT @PivotFieldNameList = @PivotFieldNameList + FieldName + ', '
FROM (SELECT DISTINCT FieldName FROM Table_LKP_AlertMastInfo) S

SET @PivotFieldNameList = SUBSTRING(@PivotFieldNameList, 1, LEN(@PivotFieldNameList) - 1) 
--SELECT @PivotFieldNameList


SET @SQL = '  SELECT DistrictId, ' + @PivotFieldNameList + ' FROM 
 Table_LKP_AlertMastInfo 
 PIVOT
 (   MAX(AlertOptionValue) 
    FOR FieldName IN (' + @PivotFieldNameList + '
  ) ) AS p  '
PRINT @SQL
EXEC(@SQL)

Above query results like below

DistrictId  AreYouMarried   Gender
 71         Yes             Female
 72         Yes             NULL

If you get results from Table_RegistrationInfo into another Datatable, then both can be used for comparison.

Upvotes: 1

Saad Ahmad
Saad Ahmad

Reputation: 403

I asusme that with 100 fields you have a set of alerts which are a combinatioin of values. Further I assume that you can have a select list in a proper order all the time. So

select candidateid,
       AreyouMarried || '|' || Gender all_responses_in_one_string
from ....

is psssible. So above will return

candidateid  all_responses_in_one_string
can001       Yes|Female
can002       No|Female

So now your alert can be a regular expression for the concatenated string. And your alert is based on how much you matched.

Upvotes: 2

Steve Chambers
Steve Chambers

Reputation: 39464

Here is one simple way of doing this:

SELECT subq.*
FROM
(SELECT CandidateId,
        (SELECT COUNT(*)
         FROM Table_LKP_AlertMastInfo ami
         WHERE ami.DistrictID = ri.DistrictID
           AND ami.FieldName ='AreYouMarried'
           AND ami.AlertOptionValue = ri.AreYouMarried) +
        (SELECT COUNT(*)
         FROM Table_LKP_AlertMastInfo ami
         WHERE ami.DistrictID = ri.DistrictID
           AND ami.FieldName ='Gender'
           AND ami.AlertOptionValue = ri.Gender) AS [count]
 FROM Table_RegistrationInfo ri) subq
WHERE subq.[count] > 0;

See SQL Fiddle demo.

Upvotes: 1

Pratik Bhavsar
Pratik Bhavsar

Reputation: 858

I managed to get the expected result without using dynamic queries. Not sure if this is what you are looking for:

 SELECT DISTINCT 
    c.CandidateId, SUM(a.AreYouMarriedAlert + a.GenderAlter) AS AlterCount
FROM 
    Table_RegistrationInfo c
OUTER APPLY
(
    SELECT 
        CASE 
            WHEN a.FieldName = 'AreYouMarried' AND c.AreYouMarried = a.AlertOptionValue THEN 1 
            ELSE 0 
        END AS AreYouMarriedAlert,
        CASE 
            WHEN a.FieldName = 'Gender' AND c.Gender = a.AlertOptionValue THEN 1 
            ELSE 0 
        END AS GenderAlter
    FROM 
        Table_LKP_AlertMastInfo a 
    WHERE 
        a.DistrictID = c.DistrictID
) a
GROUP BY c.CandidateId
HAVING SUM(a.AreYouMarriedAlert + a.GenderAlter) > 0 

Results:

enter image description here

Upvotes: 2

Related Questions