m1nkeh

Reputation: 1397

ADF Dataset Availability

I've worked with Azure Data Factory since it was in preview, but some of the combinations of flags still really confuse me.

Situation: I have a pipeline with a daily slice interval and a series of activities (4 in total) that are chained off two external data sources. Currently it all runs fine, but it runs right at the end of the slice, i.e. midnight.

The data is actually available @ 7pm on the day of the slice, so we don't need to wait.

Solution: So, if I set all external data sources to have:

        "external": true,
        "policy": {
            "externalData": {
                "dataDelay": "-05:00:00" // i.e. 24:00 - 5:00 = 19:00
            }
        }

will this work?!

Thoughts I have:

The reason I am asking here is that, without the ability to travel through time, this is a bit of a pain to debug via trial and error, so I want to sense check with someone :)

Cheers!

Upvotes: 1

Views: 813

Answers (2)

Ritesh

Reputation: 1034

You can achieve this by using 2 additional attributes (offset and style) in the availability section of the output dataset:

    "availability": {
        "frequency": "Day",
        "interval": 1,
        "offset": "20:00:00",
        "style": "StartOfInterval"
    }

The above setting will trigger the pipeline @ 8PM (20:00:00) daily.

Then in the pipeline you need to set the start date as [WhateverDate]T20:00:00Z
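
To sanity check the timing without waiting a day, here is a minimal Python sketch of the arithmetic only. It assumes the offset simply shifts the start and end of each daily slice by the given amount; `daily_slice_window` is a made-up helper for illustration, not part of any ADF SDK.

```python
from datetime import date, datetime, time, timedelta


def daily_slice_window(day, offset):
    """Return (start, end) of a one-day slice shifted by `offset`.

    Assumption: the offset moves both boundaries of the slice by the
    same amount, so a 20:00:00 offset gives a 20:00 to 20:00 window.
    """
    midnight = datetime.combine(day, time.min)  # 00:00 on the given day
    start = midnight + offset                   # shifted slice start
    end = start + timedelta(days=1)             # daily interval of 1
    return start, end


# Hypothetical date, used only to show the shape of the result.
start, end = daily_slice_window(date(2017, 1, 1), timedelta(hours=20))
print(start.isoformat(), end.isoformat())
```

Under that assumption, the pipeline start date of [WhateverDate]T20:00:00Z lines up with the first slice boundary.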

Upvotes: 0

Brian Golden

Reputation: 158

You should be able to set the data to be ready at the start of the interval. The article on scheduling in ADF should answer your questions and call out the relevant properties you can set on the dataset.

Upvotes: 0
