Oki Kuswanda
Oki Kuswanda

Reputation: 73

Get Metadata Multiple Source File System Into Azure SQL Table

I have Multiple folder and files which from FileSystem (linked service) on Azure Data Factory. and my activity is references on link: https://www.sqlservercentral.com/articles/working-with-get-metadata-activity-in-azure-data-factory

for now I'm using process metadata FileName and LastModified per file like this:

enter image description here

and then i'm using stored-procedure on ADF like this:

    ALTER PROCEDURE [dbo].[SP_FileSystemMonitoring]
    (
        -- Add the parameters for the stored procedure here
        @FLAG int,
        @FILE_NAME nvarchar(100),
        @LAST_MODIFIED datetime
    )
    AS

BEGIN
        -- SET NOCOUNT ON added to prevent extra result sets from
        -- interfering with SELECT statements.
        SET NOCOUNT ON
    
        -- Insert statements for procedure here
              IF ( @FILE_NAME IS NOT NULL )
            BEGIN
                UPDATE [dwh].[FileSystemMonitoring]
                SET    STATUS = @FLAG,
                PROCESS_DATE = DATEADD(HH, 7, Getdate()),
                REPORT_DATE = DATEADD(hh,7,(DATEADD(dd,-1,GETDATE()))),
                LAST_MODIFIED = @LAST_MODIFIED
                WHERE  FILE_NAME = @FILE_NAME

but, I want on 1 activity can get metadata on 1 folder and then then file that folder insert to Azure SQL Database, for example

folderA/file1.txt
folderA/file2.txt

On that Azure SQL Table like this:

--------------------------
File_Name | Last_Modified 
--------------------------
file1.txt | 2021-12-19 13:45:56
file2.txt | 2021-12-18 10:23:32

I have no idea, because I'm confuse how to mapping on that sink on Azure SQL Table. Thanks before ...

Upvotes: 0

Views: 502

Answers (1)

Nicholas
Nicholas

Reputation: 632

Confused by your question, is it that you want to get the details of the file or folder from the get metadata activity? Or do you want to enumerate/store the child items of a root folder?

If you simply want to reference the items from Get Metadata, add a dynamic expression that navigates the output value to the JSON property you seek. For example:

@activity('Get Metadata Activity Name').output.lastModified
@activity('Get Metadata Activity Name').output.itemName

You can pass each of the above expressions as values to your stored procedure parameters. NOTE: 'Get Metadata Activity Name' should be renamed to the name of your activity.

The output JSON of this activity is like so and will grow depending on what you select to return in the Get Metadata activity. In my example I'm also including childItems.

{
    "exists": true,
    "lastModified": "2021-03-04T14:00:01Z",
    "itemName": "some-container-name",
    "itemType": "Folder",
    "childItems": [{
            "name": "someFilePrefix_1640264640062_24_12_2021_1640264640.csv",
            "type": "File"
        }, {
            "name": "someFilePrefix_1640286000083_24_12_2021_1640286000.csv",
            "type": "File"
        }
    ],
    "effectiveIntegrationRuntime": "DefaultIntegrationRuntime (Australia Southeast)",
    "executionDuration": 0,
    "durationInQueue": {
        "integrationRuntimeQueue": 0
    },
    "billingReference": {
        "activityType": "PipelineActivity",
        "billableDuration": [{
                "meterType": "AzureIR",
                "duration": 0.016666666666666666,
                "unit": "Hours"
            }
        ]
    }
}

if you want to store the child files, then you can either parse childItems as an nvarchar JSON value into your stored procedure and then enumerate the JSON array in SQL.

You could also use ADF an enumerate the same childItems property using a ForEach activity for each file. You simply enumerate over:

@activity('Get Metadata Activity Name').output.childItems

You can then call the SP for each file referencing the nested item as:

@item().name

You'll also still be able to reference any of the root parameters from the original get metadata activity within the ForEach activity.

Upvotes: 1

Related Questions