Reputation: 4590
I have a web project which I deploy to an EC2 instance simply by pushing new commits. I use the post-receive git hook on the remote to execute a shell script which 'deploys' the project by checking it out into a production directory. The steps are: run npm install on the Express app, run npm install on the frontend (a create-react-app app), then run npm run build (which basically uses webpack to build an optimized distribution folder from my node source code).
These steps are expensive and in many cases not needed. E.g., if all I did was update a React component in src/components/, then npm run build should run, but npm install on the server and frontend shouldn't. If all I have done is add a comment to my Express app, no scripts should run.
My current server-side deploy script looks like this:
#!/usr/bin/env bash
GIT_WORK_TREE=/home/ec2-user/absiteProd git checkout -f
### TODO: conditional NPM work
pm2 restart index
My question is then: how can I use git (or grep, sed, awk, etc.) to reliably tell me when /home/ec2-user/absiteProd/frontend/package.json, /home/ec2-user/absiteProd/server/package.json, or anything in /home/ec2-user/absiteProd/frontend/src has changed?
Currently I'm having some success with:
if git log --stat -n 1 | grep --quiet 'frontend/src/'; then
cd /home/ec2-user/absiteProd/frontend
npm run build
fi
But since this seems like such a common requirement in app deployment, I feel like there must be a simpler way?
Upvotes: 1
Views: 1354
Reputation: 1328572
You can find a similar need in this thread:
How do I find a last commit for the given directory inside the repository?
I want to avoid rebuilding the specific part of the project if there were no changes in it since the last build, so I need to find the sha of the last time the directory was changed.
You can compare the last commit in which an element was modified, found with git rev-list:
git rev-list -1 HEAD -- frontend/package.json
git rev-list -1 HEAD -- server/package.json
git rev-list -1 HEAD -- frontend/src
with the current HEAD SHA1 (from git rev-parse; the --verify is optional):
git rev-parse --verify HEAD
That is:
h=$(git rev-parse --verify HEAD)
b=false
# b becomes true if HEAD is the last commit that touched any watched path
if [[ "$(git rev-list -1 HEAD -- frontend/package.json)" == "${h}" ]]; then b=true; fi
if [[ "$(git rev-list -1 HEAD -- server/package.json)" == "${h}" ]]; then b=true; fi
if [[ "$(git rev-list -1 HEAD -- frontend/src)" == "${h}" ]]; then b=true; fi
# nothing relevant changed in HEAD: skip the build
if [[ "${b}" == false ]]; then exit 0; fi
cd /home/ec2-user/absiteProd/frontend
npm run build
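The three checks can also be folded into one helper, since git rev-list accepts several paths after -- and reports the most recent commit that touched any of them. This is a minimal sketch, assuming it runs inside the repository; the function name touched_by_head is made up for illustration. Note that it only inspects the tip commit, so a push containing several commits can hide a change made in an earlier one.

```shell
#!/usr/bin/env bash
# touched_by_head is a hypothetical helper name, not a git command.
# Succeeds (exit 0) when HEAD itself is the latest commit touching any given path.
touched_by_head() {
    local head last
    head=$(git rev-parse --verify HEAD) || return 2
    last=$(git rev-list -1 HEAD -- "$@")
    [ "$last" = "$head" ]
}

# Example use in a deploy script (paths relative to the repository root):
#   touched_by_head frontend/package.json frontend/src || exit 0
```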
Upvotes: 1
Reputation: 489698
Git does not store directories in any useful way, so you must define what you mean by "anything in" yourself (which has its advantages since you can define what you mean rather than getting stuck with someone else's useless-to-you definition, but means you must do more work).
That said, Git stores each file as a path name within each commit. Your deployment script takes some work-tree (in this case, /home/ec2-user/absiteProd) from one state to another. Since it uses git checkout to do so, and git checkout does nothing special with time stamps, you now have many options with many different low-level details and subsequent consequences. Here are two obvious-ish and reasonably simple starting points:
Was /home/ec2-user/absiteProd exactly the same as some previous commit? If so, which commit? (Commits have unique hash IDs and these are generally the things to use in scripts.) You can then have Git compare the previous commit with the new commit, using git diff --name-status for instance. This is similar to what you are doing now, but more reliable, since it covers every commit between the two states rather than only the most recent one.
If your deployment script is a post-receive script, you already have both the old and new hash IDs of the reference, which you have read from standard input. Hence the set of files changed, with their statuses, between those two commits, is:
git diff-tree -r --name-status $oldhash $newhash
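As a rough illustration of this first option, the sketch below maps the changed paths to deploy steps. The function names and the step labels it prints are hypothetical, the frontend/ and server/ paths follow the layout described in the question, and treating a frontend/package.json change as also requiring a rebuild is an assumption.

```shell
#!/usr/bin/env bash
# Sketch only: function names and the printed step labels are hypothetical.

# Read changed paths (one per line) on stdin; print the deploy steps they call for.
# Assumption: a frontend dependency change also warrants a rebuild.
steps_for_paths() {
    local install_frontend=0 install_server=0 build_frontend=0 path
    while IFS= read -r path; do
        case "$path" in
            frontend/package.json) install_frontend=1; build_frontend=1 ;;
            server/package.json)   install_server=1 ;;
            frontend/src/*)        build_frontend=1 ;;
        esac
    done
    if [ "$install_frontend" -eq 1 ]; then echo install-frontend; fi
    if [ "$install_server" -eq 1 ]; then echo install-server; fi
    if [ "$build_frontend" -eq 1 ]; then echo build-frontend; fi
}

# post-receive receives one "<old> <new> <ref>" line per updated ref on stdin.
handle_push() {
    local old new ref
    while read -r old new ref; do
        # An all-zero hash means the ref was created or deleted; there is
        # nothing to diff against, so a real script would decide separately.
        case "$old" in *[!0]*) ;; *) continue ;; esac
        case "$new" in *[!0]*) ;; *) continue ;; esac
        git diff-tree -r --name-only "$old" "$new" | steps_for_paths
    done
}
```

The hook would end with a call to handle_push and act on each label it prints, for example running npm run build when it sees build-frontend.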
If git checkout wrote on any file(s), those files will have "now" as their modify-time time-stamps, since git checkout just lets the system's time apply to updated files. Can you use this? As long as you never deploy more than once in a single second, you could combine this with the make build-system, which builds files based on time-stamps.
If make is suitable here, it is probably the best choice, except for its maximum of one-per-second deployment (or whatever your underlying OS has for time stamp resolution on files). You can just declare that whatever the output file(s) is/are, they depend on the corresponding input file(s), and give the recipe to build the output(s) from the input(s) and run make.
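Where make is not available, the same time-stamp comparison can be approximated in plain shell. This is not make itself, only a sketch of the idea it relies on: rebuild when some input file is newer than a marker file written after the last successful build. The function name needs_rebuild and the stamp-file location are assumptions for illustration, and the same time-stamp resolution caveat applies.

```shell
#!/usr/bin/env bash
# needs_rebuild STAMP PATH...
# Succeeds (exit 0) when STAMP is missing or any file under PATH... is newer than it.
needs_rebuild() {
    local stamp="$1"
    shift
    [ -e "$stamp" ] || return 0
    [ -n "$(find "$@" -type f -newer "$stamp" | head -n 1)" ]
}

# Example use (hypothetical stamp location):
#   if needs_rebuild frontend/.build-stamp frontend/src; then
#       (cd frontend && npm run build) && touch frontend/.build-stamp
#   fi
```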
Upvotes: 1