burnersk

Reputation: 3480

How to schedule a GitHub Actions nightly build, but run it only when there were recent code changes?

I want to have automatic nightly builds (daily snapshots from the development branch) with GitHub Actions.

To reduce billing costs, I want the GitHub Actions workflow to trigger (or do anything) only when there were new commits since the last run of the nightly build workflow.

How can I schedule a GitHub Actions nightly build, but run it only when there were code changes since the last nightly run?

Be aware that there are also other GitHub Actions workflows, which must not interfere with this nightly build.

Upvotes: 36

Views: 15640

Answers (3)

Andrzej Szombierski

Reputation: 359

I'd like to suggest a different approach. I wanted to avoid explicitly checking the date of the latest commits for a few reasons:

  • it's possible to push new (as in, not yet seen) commits which have an old commit date
  • according to the GitHub docs, in extreme cases scheduled actions may not run at all, which throws off the 24-hour period calculation entirely
  • it's difficult to specify the decision threshold exactly, i.e. whether a commit is old enough or not. For example, if Monday's build ran at 00:01 and Tuesday's at 00:05 (due to some GitHub queuing), a new commit added on Monday at 00:03 could be incorrectly skipped. Of course it's possible to add a safety margin at the cost of false positives.
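
The last point can be checked with a little arithmetic. This is only an illustrative sketch: times are minutes since Monday 00:00, taken from the example above, and the helper name in_window is made up for the demonstration.

```shell
#!/bin/sh
# Times are minutes since Monday 00:00 (values from the example above).
monday_build=1                 # Monday 00:01
commit=3                       # Monday 00:03
tuesday_build=$((1440 + 5))    # Tuesday 00:05

# Hypothetical helper: is the commit inside the 24-hour window ending at "now"?
in_window() {
  commit_min=$1
  now_min=$2
  [ $((now_min - commit_min)) -le 1440 ]
}

if in_window "$commit" "$tuesday_build"; then
  verdict="build"
else
  verdict="skip"
fi

# The commit landed after Monday's build, yet on Tuesday it is already
# older than 24 hours, so a strict date check never builds it.
echo "commit age: $((tuesday_build - commit)) min, verdict: $verdict"
```

With these numbers the commit is 1442 minutes old when Tuesday's run starts, so a strict 1440-minute window reports "skip" although Monday's build at 00:01 could not have included it.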

So my approach is to explicitly check the SHA hash of the last revision built by the workflow. I call the GitHub API to get the last completed run of this specific workflow and compare its head SHA to the current github.sha. The check only takes a few seconds, and the rest of the workflow is skipped only if the values are equal.

Below is a sample using octokit/request-action for the API call.

Notes:

  • I've had an issue with using GITHUB_TOKEN for the API call, but others report that it works fine.
  • the workflow is named daily.yml and this name is used as an identifier in the API call
  • the example uses 'reusable workflows', but the same principle can be used with a regular 'inline' workflow
  • the same test logic will work regardless of schedule (daily/weekly/custom)
  • if a workflow run is cancelled, it's still counted as completed, so a new build will not be triggered until new commits are added
on:
  schedule:
    - cron: "30 01 * * *"

jobs:
  check:
    runs-on: 'ubuntu-latest'
    steps:
    - uses: octokit/request-action@v2.x
      id: check_last_run
      with:
        route: GET /repos/${{github.repository}}/actions/workflows/daily.yml/runs?per_page=1&status=completed
      env:
        GITHUB_TOKEN: ${{ secrets.MY_TOKEN }}

    - run: "echo Last daily build: ${{ fromJson(steps.check_last_run.outputs.data).workflow_runs[0].head_sha }}"

    outputs:
      last_sha: ${{ fromJson(steps.check_last_run.outputs.data).workflow_runs[0].head_sha }}

  build:
    needs: [check]
    if: needs.check.outputs.last_sha != github.sha
    uses: ./.github/workflows/dothebuild.yml

    secrets: inherit
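
For a regular 'inline' workflow, the same comparison could live in a single shell step. The following is only a sketch under my own assumptions: the function name should_build is hypothetical, and treating an empty or "null" SHA (no completed run yet) as "build" is a choice the workflow above does not make explicitly.

```shell
#!/bin/sh
# Hypothetical helper: decide whether the nightly build is needed.
#   $1 = head_sha of the last completed run (may be empty on the first run)
#   $2 = SHA the current run was started for (github.sha)
should_build() {
  last_sha=$1
  current_sha=$2
  # No previous run recorded, or a JSON tool printed "null" for a missing
  # value: build to be on the safe side (assumption, see lead-in).
  if [ -z "$last_sha" ] || [ "$last_sha" = "null" ]; then
    return 0
  fi
  [ "$last_sha" != "$current_sha" ]
}

# Illustrative values only; in a workflow they would come from the API
# response and from the GITHUB_SHA environment variable.
if should_build "1a2b3c" "1a2b3c"; then echo "build"; else echo "skip"; fi
if should_build "1a2b3c" "4d5e6f"; then echo "build"; else echo "skip"; fi
if should_build "" "4d5e6f"; then echo "build"; else echo "skip"; fi
```

The first call reports "skip" because nothing changed since the last completed run; the other two report "build".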

Upvotes: 5

Mehdi Chaouch

Reputation: 455

You can do something like this:

  1. first add this job:
  check_date:
    runs-on: ubuntu-latest
    name: Check latest commit
    outputs:
      should_run: ${{ steps.should_run.outputs.should_run }}
    steps:
      - uses: actions/checkout@v2
      - name: print latest_commit
        run: echo ${{ github.sha }}

      - id: should_run
        continue-on-error: true
        name: check latest commit is less than a day
        if: ${{ github.event_name == 'schedule' }}
        run: test -z "$(git rev-list --after="24 hours" ${{ github.sha }})" && echo "should_run=false" >> "$GITHUB_OUTPUT"
  2. add this to all other jobs in your workflow:
    needs: check_date
    if: ${{ needs.check_date.outputs.should_run != 'false' }}

for example:

  do_something:
    needs: check_date
    if: ${{ needs.check_date.outputs.should_run != 'false' }}
    runs-on: windows-latest
    name: do something.
    steps:
      - uses: actions/checkout@v2

source
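
A note on why the condition is != 'false' rather than == 'true': the should_run step only ever writes false. When the latest commit is recent the test command fails (and continue-on-error swallows the failure), and for non-schedule events the step is skipped, so in both cases the output stays empty and has to mean "run". A small sketch of that decision table, where the function jobs_run is hypothetical and merely mimics the if: expression:

```shell
#!/bin/sh
# Hypothetical stand-in for:
#   if: ${{ needs.check_date.outputs.should_run != 'false' }}
jobs_run() {
  [ "$1" != "false" ]
}

# An empty output (step skipped or test failed) and an explicit "false".
for output in "" "false"; do
  if jobs_run "$output"; then
    echo "should_run='$output' -> jobs run"
  else
    echo "should_run='$output' -> jobs skipped"
  fi
done
```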

Upvotes: 28

Samira

Reputation: 9761

I have a working solution which is slightly different from your case, but it shouldn't be hard to tweak. The main goal is exactly the same: do not waste CI time on daily runs if it's not required.

While it's not possible (AFAIK) to prevent a scheduled workflow from running at all, you can reduce its execution time by running a small shell script as the very first step, even before checking out the repository. The second part is to disable all other steps if the repository had no new commits (nothing to run).

Here is the full example; below I go through it piece by piece and then show how it could be applied to your use case.

TL;DR - bash, curl, jq.

  - name:  Activity check
    run:   |
           :
           curl -sL https://api.github.com/repos/$GITHUB_REPOSITORY/commits | jq -r '[.[] | select(.author.login != "${{ secrets.ANTALASKAYA_LOGIN }}")][0]' > $HOME/commit.json
           date="$(jq -r '.commit.author.date' $HOME/commit.json)"
           timestamp=$(date --utc -d "$date" +%s)
           author="$(jq -r '.commit.author.name' $HOME/commit.json)"
           url="$(jq -r '.html_url' $HOME/commit.json)"
           days=$(( ( $(date --utc +%s) - $timestamp ) / 86400 ))
           rm -f $HOME/commit.json
           echo "Repository activity : $timestamp $author $url"
           alive=0
           if [ "${{ github.event_name }}" == "repository_dispatch" ]; then
              echo "[WARNING] Ignoring activity limits : workflow triggered manually"
              alive=1
           else
              if [ $days -gt 2 ]; then
                 echo "[WARNING] Repository activity : $days days ago"
              fi
              if [ $days -lt 8 ]; then
                 echo Repository active : $days days
                 alive=1
              else
                 echo "[WARNING] Repository not updated : event<${{ github.event_name }}> not allowed to modify stale repository"
              fi
           fi
           if [ $alive -eq 1 ]; then
              echo "GHA_REPO_ALIVE=true" >> "$GITHUB_ENV"
           fi
    shell: bash

First, I use the GitHub API to get the last non-automated commit (and save the result to a .json file). In my case, all "nightly" builds commit their results back to the repository through a dedicated bot account, so those commits are easy to filter out.

curl -sL https://api.github.com/repos/$GITHUB_REPOSITORY/commits | jq -r '[.[] | select(.author.login != "${{ secrets.ANTALASKAYA_LOGIN }}")][0]' > $HOME/commit.json

Next, I extract the timestamp (and a few other things) of that commit and convert it to elapsed days. In your case you'll most likely want to use hours here instead.

date="$(jq -r '.commit.author.date' $HOME/commit.json)"
timestamp=$(date --utc -d "$date" +%s)
author="$(jq -r '.commit.author.name' $HOME/commit.json)"
url="$(jq -r '.html_url' $HOME/commit.json)"
days=$(( ( $(date --utc +%s) - $timestamp ) / 86400 ))
rm -f $HOME/commit.json
echo "Repository activity : $timestamp $author $url"

There are a few different scenarios in which the workflow can run (a push that changes the workflow file, repository_dispatch, schedule), so I keep the final result of the activity check in a local variable which is checked later. By default it assumes the repository doesn't need updates.

alive=0

Next comes the repository_dispatch handling, which allows triggering the scheduled job manually; this forces the workflow to run, ignoring any limits.

if [ "${{ github.event_name }}" == "repository_dispatch" ]; then
   echo "[WARNING] Ignoring activity limits : workflow triggered manually"
   alive=1
else

From the third day without non-automated commits, I add an entry to the log, just for fun.

if [ $days -gt 2 ]; then
   echo "[WARNING] Repository activity : $days days ago"
fi

If the last commit was within the last week, mark the repository as active; otherwise do nothing.

if [ $days -lt 8 ]; then
   echo Repository active : $days days
   alive=1
else
    echo "[WARNING] Repository not updated : event<${{ github.event_name }}> not allowed to modify stale repository"
fi

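Finally, save the local variable as a global one if there is work to be done. It is important to write it to the environment file ($GITHUB_ENV) or to a step output ($GITHUB_OUTPUT) here, so that the variable can be checked before a later step is executed (the older ::set-env workflow command is disabled on current runners).

if [ $alive -eq 1 ]; then
   echo "GHA_REPO_ALIVE=true" >> "$GITHUB_ENV"
fi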

All steps after the activity check should test this global variable before doing anything, to save time and/or money.

- name: Clone
  if:   env.GHA_REPO_ALIVE == 'true'
  uses: actions/checkout@v2

In wild:


Now about adapting this solution to your case:

If you are not committing results back, you can simplify the first part by grabbing the last commit (regardless of author) from the API and checking the elapsed hours. The repository should be marked as active if there was any commit in the last 24 hours.
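
A rough sketch of that hour-based variant follows. The helper is_active and its 24-hour default are my own naming and choice for illustration; the timestamps are epoch seconds, obtained the same way as timestamp in the script above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the last commit is less than $3 hours old.
#   $1 = commit time, $2 = current time (both epoch seconds)
#   $3 = limit in hours (defaults to 24)
is_active() {
  commit_ts=$1
  now_ts=$2
  limit_hours=${3:-24}
  hours=$(( (now_ts - commit_ts) / 3600 ))   # integer division rounds down
  [ "$hours" -lt "$limit_hours" ]
}

now=1700000000   # fixed value so the example is reproducible
if is_active $((now - 5 * 3600)) "$now"; then echo "5h old: active"; fi
if ! is_active $((now - 30 * 3600)) "$now"; then echo "30h old: stale"; fi
```

Since scheduled runs can start late, a limit slightly above 24 (for example is_active "$timestamp" "$now" 26) may be safer, at the cost of an occasional unnecessary build.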

If you just want to run a build, you can drop the parts that check the repository_dispatch or push scenarios. However, I found it pretty useful to have some non-schedule trigger for running a build without waiting; I'd highly recommend keeping that for future tweaks.

A few ms could be saved by skipping the author/url extraction and disabling logging ;)

There are probably actions around which provide the same functionality, but I feel a shell script plus the API will always be faster. There's also a chance they would do exactly the same thing, just "wasting" the extra time needed to download and execute the action.

Upvotes: 6
