Alberto De Caro

Reputation: 5213

Databricks/Spark SQL - how to fetch data in a map type field

In a CDP datalake the profiles table features the identityMap field:

[IdentityType -> IdentityValue]

A sample value looks like the following:

["crmID" -> "{(NULL,crm123,NULL)}", 
 "loyaltyID" -> "{(NULL,loy456,NULL)}", 
 "accountID" -> "{(NULL,act789,NULL)}"]

To validate the ingestion and merging process of the IDs, the client would like to run some checks on that map field.

I used to be proficient in T-SQL, but these map structures are new to me and I am not sure how to explore this data.

Which kinds of functions or syntax can be used to fetch the identityMap field? Feel free to share just references to public documentation.

Upvotes: 2

Views: 3542

Answers (1)

Alex Ott

Reputation: 87249

If you need to do that using SQL only, then you can use square brackets [] to access values (see docs). They work with both arrays and maps: for arrays you provide an index, while for maps you provide a key, e.g. column_name['crmID'].
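For instance, a minimal sketch against the question's profiles table and identityMap column (the alias names are illustrative; element_at is the standard Spark SQL equivalent of bracket access for maps):

```sql
-- Bracket syntax: look up a map entry by key (returns NULL if the key is absent)
SELECT identityMap['crmID'] AS crm_identity
FROM profiles;

-- element_at(map, key) is equivalent and can be clearer in longer expressions
SELECT element_at(identityMap, 'loyaltyID') AS loyalty_identity
FROM profiles;
```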

P.S. You may also need functions such as map_keys, map_values, etc.; just follow the links at the bottom of the linked page.
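As a hedged sketch of the kind of validation checks the question mentions (again assuming the profiles table and identityMap column from the question; the aliases are made up for illustration):

```sql
-- List which identity types each profile carries, and how many
SELECT map_keys(identityMap)    AS identity_types,
       cardinality(identityMap) AS id_count
FROM profiles;

-- Flatten the map into one row per (key, value) pair for ad-hoc inspection
SELECT ids.identity_type, ids.identity_value
FROM profiles
LATERAL VIEW explode(identityMap) ids AS identity_type, identity_value;

-- Example check: count profiles that are missing a crmID entry
SELECT count(*) AS missing_crm_id
FROM profiles
WHERE NOT array_contains(map_keys(identityMap), 'crmID');
```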

Upvotes: 2
