Naveen D

Reputation: 13

Azure Data Factory: schedule daily at 6:00 AM PST

How do I schedule my pipeline and output dataset to run every day at 6:00 AM PST? I tried the approach below:

},
            "scheduler": {
                "frequency": "Day",
                "interval": 1
            },
            "name": "CopyActivity-0"
        }
    ],
    "start": "2016-10-14T14:00:00Z",
    "end": "2099-12-31T08:00:00Z",

But it executes only once, at 12:00 AM, whereas I want it to execute daily at 06:00 PST.

Regards, Navin

Upvotes: 0

Views: 443

Answers (2)

chcook

Reputation: 111

The downside of using the anchorDateTime property on the scheduler is that you must also put it on the dataset, and if you wish to change it later, you need to delete and recreate your dataset (the same as if you decide to change the dataset's frequency from daily to hourly).

A more flexible way to achieve this (so you can change it easily if your scheduling requirements change) is as follows:

  1. Make sure the "style" property in your input dataset's availability section is set to "StartOfInterval". If you don't do this, it will wait until the day is over before running your slice (e.g. the 2016-10-14 slice will run just after midnight UTC on 2016-10-15).

    "availability":
    {
        "frequency": "Day",
        "interval": 1,
        "style": "StartOfInterval"
    }
    
  2. In your pipeline, in the policy section of the activity that references your input dataset, use the "delay" property to set how long the activity should wait beyond its normal schedule (6 hours in this example).

    "policy": {
        "delay": "06:00:00"
    },
    
  3. This may be optional, but it is good for clarity: also in the pipeline activity, in the scheduler section, set the "style" property to "StartOfInterval".

    "scheduler": {
        "frequency": "Day",
        "interval": 1,
        "style": "StartOfInterval"
    },
    
  4. Also make sure the "style" property in your output dataset's availability section is set to "StartOfInterval". If you don't do this, the activity will probably still wait until the day is over, because the pipeline activity is also affected by the properties on the output dataset.

    "availability": {
        "frequency": "Day",
        "interval": 1,
        "style": "StartOfInterval"
    },
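
For the 6:00 AM PST target in the question, the activity settings from steps 2 and 3 might combine as in the sketch below. It assumes PST is a fixed UTC-8, so 06:00 PST corresponds to 14:00 UTC and the delay becomes 14 hours instead of the 6 hours used in step 2. The activity name is taken from the question, and the rest of the activity definition (type, inputs, outputs) is omitted.

```json
{
    "name": "CopyActivity-0",
    "policy": {
        "delay": "14:00:00"
    },
    "scheduler": {
        "frequency": "Day",
        "interval": 1,
        "style": "StartOfInterval"
    }
}
```

With "StartOfInterval" also set on both datasets, the 2016-10-14 slice would be expected to start at about 2016-10-14T14:00:00Z. Because slices are scheduled in UTC, a fixed delay does not follow daylight saving time: while PDT (UTC-7) is in effect, the same delay lands at 7:00 AM local time.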
    

I find that in most cases you want to have the "style": "StartOfInterval" property on all daily datasets that don't require the slice start / end as part of a query (e.g. copy a file, select from an entire reference table, run a stored proc with no date parameters, etc).

In other cases, where the dataset involves a query based on slice start / end, you would likely want to keep the default value of "EndOfInterval" so it waits for the day to be over before selecting the day's data.

Upvotes: 1

pollirrata

Reputation: 5286

Try using anchorDateTime

    "scheduler":
    {
        "frequency": "Day",
        "interval": 1,
        "anchorDateTime": "your value"
    }

According to the docs:

The scheduler property supports the same subproperties as the availability property in a dataset

You can find an example here
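
As an illustration of what the placeholder might hold for the question's 6:00 AM PST target, the sketch below uses a UTC timestamp, assuming PST is a fixed UTC-8 (06:00 PST = 14:00 UTC). The date is only an example borrowed from the question's start value, and whether the time-of-day part is honored for a "Day" frequency should be checked against the documentation.

```json
{
    "scheduler": {
        "frequency": "Day",
        "interval": 1,
        "anchorDateTime": "2016-10-14T14:00:00Z"
    }
}
```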

Upvotes: 0
