Ehsan
Ehsan

Reputation: 711

Dynamic Creation of External Tables in Synapse

I have a json string which is the result of Get Metadata activity in data factory in the following format: {"structure": [{"name": "Id","type": "String"},{"name": "IsDeleted","type": "Boolean"},{"name": "RowVersion","type": "Byte[]"}, {"name": "ModelId","type": "String"}]

This json is showing the schema of parquet file. I pass this json along with the parquet filename and uctnow() to a stored procedure to create a table of schema for that file with the following code

CREATE OR ALTER PROCEDURE [CreateExternalTables] (
@schema NVARCHAR (MAX), @tableName NVARCHAR(MAX), @ExecTime NVARCHAR(MAX)
) AS

BEGIN
IF OBJECT_ID('tempdb..#tables_to_create', 'U') IS NOT NULL
        DROP TABLE #tables_to_create
CREATE TABLE #tables_to_create (
    tableName NVARCHAR (MAX),
    fieldOrder NVARCHAR (MAX),
    fieldName NVARCHAR (MAX),
    fieldType NVARCHAR (MAX),
    translatedType NVARCHAR (MAX),
    executeTime NVARCHAR (MAX)
)

BEGIN

WITH Fields (fieldOrder, fieldName, fieldType) AS (
  SELECT
    [key] AS fieldOrder,
    JSON_VALUE([value], '$.name') AS fieldName,
    JSON_VALUE([value], '$.type') AS fieldType
  FROM
    OPENJSON(@schema)
)
INSERT INTO
  #tables_to_create(
    tableName,
    fieldOrder,
    fieldName,
    fieldType,
    translatedType,
    executeTime
  )
SELECT
  @tableName,
  fieldOrder,
  fieldName,
  fieldType,
   CASE
    WHEN fieldType = 'Single' THEN 'real'
    WHEN fieldType = 'Boolean' THEN 'bit'
    WHEN fieldType = 'Double' THEN 'float'
    WHEN fieldType = 'Int64' THEN 'bigint'
    ELSE NULL  
  END AS translatedType,
  @ExecTime
FROM
  Fields
END
END;

The overall setup is like below: enter image description here

But I receive the following error which I'm not sure why: enter image description here

What I would like to do is to create this temo table in stored procedure in order to create external tables in a automatic way.

To be precise I'm following this link to do it: create-external-dataset

Any help is highly appreciated.

UPDATE After solving the first problem with @wBob 's help I follow the rest of the link to create the external tables with the following which I inserted into my stored procedure:

FROM
  Fields
END
Declare @sqlCommand nvarchar(max);
Declare @folderPath nvarchar(max);

SET
  @sqlCommand = 'IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N''[dbo].[' + @tableName + ']'') AND type in (N''U''))
  CREATE EXTERNAL TABLE [dbo].[' + @tableName + '] (' 

WHILE((SELECT COUNT(*) FROM #tables_to_create) > 0)
BEGIN
  DECLARE @key int
  SELECT
    @key = MIN(fieldOrder)
  FROM
    #tables_to_create
  WHERE
    executeTime = @ExecTime
     
  DECLARE @fieldName VARCHAR(50)  
  DECLARE @translatedType VARCHAR(50)
 
  SELECT
    @fieldName = fieldName,
    @translatedType = translatedType
  FROM
    #tables_to_create
  WHERE
    fieldOrder = @key
    AND executeTime = @ExecTime
 
  SET
    @sqlCommand = @sqlCommand + '
          [' + @fieldName + '] [' + @translatedType + '] NULL'
 
  DELETE FROM
    #tables_to_create
  WHERE
    fieldOrder = @key
    AND executeTime = @ExecTime
 
  IF((SELECT COUNT(*) FROM #tables_to_create WHERE executeTime = @ExecTime) > 0)
    SET
      @sqlCommand = @sqlCommand + ', '
END
 
SET
  @sqlCommand = @sqlCommand + '
  )
  WITH
  (
    LOCATION = ''/' + /main/json/ + ''',
    DATA_SOURCE = DataLakeStaged,
    FILE_FORMAT = StagedParquet
  )'
   
EXEC(@sqlCommand)
END;

However I receive the following error: enter image description here

Upvotes: 0

Views: 3208

Answers (1)

wBob
wBob

Reputation: 14379

As per the error message, queries are not supported in the conditions of WHILE statement in Azure Synapse Analytics. Dedicated SQL pools are MPP systems that have some slight differences from box-product SQL Server and Azure SQL DB and more generally, loops are a pattern that don't translate that well. You should consider creating a stored procedure that does not loop and leave any looping required to the For Each activity in Azure Data Factory (ADF) / Synapse Pipelines which can loop in parallel.

If you do require a loop, there are multiple examples of doing loops in dedicated SQL pools online, but there is a good example in the official documentation, reproduced here in the event the link moves:

-- First, create a temporary table containing a unique row number used to identify the individual statements:
CREATE TABLE #tbl
WITH
( DISTRIBUTION = ROUND_ROBIN
)
AS
SELECT  ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS Sequence
,       [name]
,       'UPDATE STATISTICS '+QUOTENAME([name]) AS sql_code
FROM    sys.tables
;

-- Second, initialize the variables required to perform the loop:
DECLARE @nbr_statements INT = (SELECT COUNT(*) FROM #tbl)
,       @i INT = 1
;


-- Now loop over statements executing them one at a time:
WHILE   @i <= @nbr_statements
BEGIN
    DECLARE @sql_code NVARCHAR(4000) = (SELECT sql_code FROM #tbl WHERE Sequence = @i);
    EXEC    sp_executesql @sql_code;
    SET     @i +=1;
END

Upvotes: 3

Related Questions