Reputation: 99
I am trying to create a sql query on databricks to retrieve and tag records using some conditions.
I have a Policy table that looks like this:
PolicyId | ClientId | AgentId |
---|---|---|
P1234 | C1234 | A1234 |
P1235 | C1234 | A1235 |
P1236 | C1234 | A1236 |
P1237 | C1235 | A1234 |
P1238 | C1235 | A1234 |
P1239 | C1236 | A1237 |
P1230 | C1236 | A1238 |
P1241 | C1244 | A1238 |
P1242 | C1255 | A1238 |
I also have a Agent & Client lookup tables:
AgentId | AgentAddress | AgentFirstName | AgentLastName |
---|---|---|---|
P1234 | Test Address1 | John | Anderson |
P1235 | Test Address2 | Mike | Smith |
P1236 | Test Address3 | Peter | Smith |
P1237 | Test Address4 | Allan | Test |
P1238 | Test Address5 | Roger | Test2 |
ClientId | ClientAddress | ClientFirstName | ClientLastName |
---|---|---|---|
C1234 | Test Address1 | Client1 | LastName1 |
C1235 | Test Address2 | Client2 | LastName2 |
C1236 | Test Address3 | Client3 | Smith |
C1244 | Test Address4 | Client4 | Test |
C1255 | Test Address5 | Client5 | TEst2 |
I need to write a query in databricks sql that would determine Policies that has a Client and Agent that share the same address and flag them as orange. Also, to determine Policies that has its Client and Agent share the same address and have the same Surname and mark then as green.
A possible result set could be like this:
PolicyId | flag |
---|---|
P1234 | orange |
P1235 | orange |
P1236 | green |
P1237 | green |
I hope I explained this well, I can give more information if needed.
Thank you for your help!
Upvotes: 0
Views: 1887
Reputation: 5255
I have never used databricks but the SQL below is very standard, so I would expect it to work.
SELECT p.PolicyId,
CASE WHEN c.ClientLastName = a.AgentLastName
THEN 'green' ELSE 'orange' END AS flag
FROM Policy p
INNER JOIN Client c ON p.ClientId = c.ClientId
INNER JOIN Agent a ON p.AgentId = a.AgentId
AND a.AgentAddress = c.ClientAddress
Given your sample data this yields:
PolicyId | Flag |
---|---|
P1234 | orange |
P1242 | green |
Please note I am assuming that in your test data the AgentIds should begin with an A not a P. Also Policy P1235 does not get included, because although the client is the same as P1234 the agent is not, so the address is also not matching. Same logic applies for P1236 and P1237. P1242 is however included, as ClientId C1255 shares the same address (and LastName hence green not orange) as the AgentId A1238 (assuming case insensitive comparison).
Upvotes: 1