Reputation: 379
On Compute Engine we can take snapshots, which are basically backups. Could you help me figure out a script to take automated snapshots every day and keep four of them, so that once there are four, the oldest one gets deleted? Not having scheduled backups of the server is my only concern with Google Cloud; otherwise I love Compute Engine. It's much easier to use than Amazon, and it's cheaper.
Upvotes: 21
Views: 24157
Reputation: 6201
Update: Google Cloud now has scheduled backups that can be configured per disk. See Creating scheduled snapshots for persistent disk in the Google Cloud documentation.
Original Answer
The documentation is pretty clear about how to do it:
gcloud compute disks snapshot DISK
Note that:
Snapshots are always created based on the last successful snapshot taken
Before you remove any of your snapshots, take a look at the snapshot deletion diagram in the documentation.
More information is available in the API documentation.
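For example, a minimal invocation with a dated snapshot name could look like this (the disk name and zone here are placeholders):
gcloud compute disks snapshot my-disk \
    --snapshot-names=my-disk-$(date +%Y%m%d) \
    --zone=us-central1-a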
Upvotes: 12
Reputation: 5920
In my example, I have a maintenance window in which to create a snapshot of a MySQL server's disk. It assumes that the service account has permission to run the gcloud snapshot commands. Hope it helps:
#!/bin/bash
days_to_keep=7
# look up the disk, zone and project from the instance metadata server
disk=$(curl -s "http://metadata.google.internal/computeMetadata/v1/instance/disks/1/device-name" -H "Metadata-Flavor: Google")
zone=$(curl -s "http://metadata.google.internal/computeMetadata/v1/instance/zone" -H "Metadata-Flavor: Google")
project=$(curl -s "http://metadata.google.internal/computeMetadata/v1/project/project-id" -H "Metadata-Flavor: Google")
zone=$(basename "${zone}")
# derive the storage location from the zone, e.g. us-central1-a -> us-central1
storage_location=$(echo "${zone}" | sed 's/-[a-z]$//')
# stop MySQL so the on-disk data is consistent
systemctl stop mysqld
sleep 5
# flush file system buffers
sync
# create snapshot
gcloud -q compute disks snapshot "${disk}" --project="${project}" --snapshot-names="${disk}-$(date +%s)" --zone="${zone}" --storage-location="${storage_location}"
systemctl start mysqld
delete_date=$(date -d "-${days_to_keep} days" "+%Y-%m-%d")
# get the list of snapshots to delete: named after this disk and older than the cut-off
to_del=$(gcloud compute snapshots list --filter="name ~ ^${disk} AND creationTimestamp<${delete_date}" --format="csv[no-heading](name)")
# delete the old snapshots in bulk (word splitting on ${to_del} is intentional)
if [[ -n ${to_del} ]]
then
    gcloud compute snapshots delete -q ${to_del}
fi
Upvotes: 1
Reputation: 180
There is now a feature called "Snapshot Schedule" available in GCP.
It still seems to be in beta, and there is not much documentation on this feature yet, but enabling it is a straightforward process: first you create a snapshot schedule, then you assign it to persistent disks.
See also the command-line reference for creating a schedule with the corresponding gcloud command:
gcloud beta compute resource-policies create-snapshot-schedule
To assign the schedule to a persistent disk you can use the command
gcloud beta compute disks add-resource-policies
https://cloud.google.com/sdk/gcloud/reference/beta/compute/disks/add-resource-policies
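As a rough sketch, creating a daily schedule that keeps each snapshot for four days and attaching it to a disk could look like this (the schedule name, disk name, region and zone are placeholders; check the reference above for the exact flags):
gcloud beta compute resource-policies create-snapshot-schedule daily-backup \
    --region=us-central1 \
    --max-retention-days=4 \
    --start-time=04:00 \
    --daily-schedule
gcloud beta compute disks add-resource-policies my-disk \
    --resource-policies=daily-backup \
    --zone=us-central1-a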
Update 2019-02-15: As of yesterday there is a blog announcement of the scheduled snapshots feature, as well as a page in the Compute Engine documentation for scheduled snapshots.
Upvotes: 5
Reputation: 374
Also, at the time of writing this, Windows instances support the Volume Shadow Copy Service (VSS), but Linux instances do not.
Hence, you can safely snapshot Windows drives while the instance is running by using the --guest-flush switch, but not Linux drives.
Prior to snapshotting Linux drives, you will need some other mechanism to prepare them for snapshotting, e.g. freezing the file system, detaching the drive, or powering off the instance (see the sketch below).
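As a minimal sketch of the freeze approach on Linux (the mount point, disk name and zone are placeholders; a production script should make sure the unfreeze runs even if the snapshot fails):
MOUNT_POINT=/mnt/data
sudo fsfreeze --freeze "$MOUNT_POINT"
gcloud compute disks snapshot my-data-disk --zone=us-central1-a
sudo fsfreeze --unfreeze "$MOUNT_POINT"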
Upvotes: 0
Reputation: 355
UPDATE:
The script has changed a lot since I first gave this answer - please see the GitHub repo for the latest code: https://github.com/jacksegal/google-compute-snapshot
ORIGINAL ANSWER:
I had the same problem, so I created a simple shell script to take a daily snapshot and delete all snapshots more than 7 days old: https://github.com/Forward-Action/google-compute-snapshot
#!/usr/bin/env bash
export PATH=$PATH:/usr/local/bin/:/usr/bin
#
# CREATE DAILY SNAPSHOT
#
# get the device name for this vm
DEVICE_NAME="$(curl -s "http://metadata.google.internal/computeMetadata/v1/instance/disks/0/device-name" -H "Metadata-Flavor: Google")"
# get the device id for this vm
DEVICE_ID="$(curl -s "http://metadata.google.internal/computeMetadata/v1/instance/id" -H "Metadata-Flavor: Google")"
# get the zone that this vm is in
INSTANCE_ZONE="$(curl -s "http://metadata.google.internal/computeMetadata/v1/instance/zone" -H "Metadata-Flavor: Google")"
# strip out the zone from the full URI that google returns
INSTANCE_ZONE="${INSTANCE_ZONE##*/}"
# create a datetime stamp for the snapshot name
DATE_TIME="$(date "+%s")"
# create the snapshot
gcloud compute disks snapshot "${DEVICE_NAME}" --snapshot-names "gcs-${DEVICE_NAME}-${DEVICE_ID}-${DATE_TIME}" --zone "${INSTANCE_ZONE}"
#
# DELETE OLD SNAPSHOTS (OLDER THAN 7 DAYS)
#
# get a list of existing snapshots that were created by this process (gcs-) for this vm disk (DEVICE_ID)
SNAPSHOT_LIST="$(gcloud compute snapshots list --regexp "gcs-${DEVICE_NAME}-${DEVICE_ID}-.*" --uri)"
# loop through the snapshots
echo "${SNAPSHOT_LIST}" | while read -r line ; do
    # get the snapshot name from the full URL that google returns
    SNAPSHOT_NAME="${line##*/}"
    # get the date that the snapshot was created
    SNAPSHOT_DATETIME="$(gcloud compute snapshots describe "${SNAPSHOT_NAME}" | grep "creationTimestamp" | cut -d " " -f 2 | tr -d \')"
    # format the date
    SNAPSHOT_DATETIME="$(date -d "${SNAPSHOT_DATETIME}" +%Y%m%d)"
    # get the expiry date for snapshot deletion (currently 7 days)
    SNAPSHOT_EXPIRY="$(date -d "-7 days" +"%Y%m%d")"
    # check if the snapshot is older than the expiry date
    if [ "${SNAPSHOT_EXPIRY}" -ge "${SNAPSHOT_DATETIME}" ]; then
        # delete the snapshot
        gcloud compute snapshots delete "${SNAPSHOT_NAME}" --quiet
    fi
done
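To take the daily snapshot, the script can be run from a root cron job, e.g. (the script path and log file here are assumptions):
0 4 * * * /opt/google-compute-snapshot/gcs-snapshot.sh >> /var/log/gcs-snapshot.log 2>&1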
Upvotes: 31
Reputation: 491
My solution is slightly simpler: I want to snapshot all disks, not just the primary disk.
By listing all disks in the project, one single script handles all servers, as long as it is run from within the gcloud project (it could be modified to run from outside the project too).
Tidying up older snapshots doesn't need complex date processing, because it can be handled from the gcloud command line using a filter:
https://gitlab.com/alan8/google-cloud-auto-snapshot
#!/bin/bash
# loop through all disks within this project and create a snapshot of each
gcloud compute disks list | tail -n +2 | while read -r DISK_NAME ZONE _; do
    gcloud compute disks snapshot "$DISK_NAME" --snapshot-names "auto-$DISK_NAME-$(date "+%s")" --zone "$ZONE"
done
#
# snapshots are incremental and don't need to be deleted: deleting a snapshot merges its
# data into the next one, so deleting doesn't lose anything
# having too many snapshots is unwieldy, though, so this script deletes them after 60 days
#
gcloud compute snapshots list --filter="creationTimestamp<$(date -d "-60 days" "+%Y-%m-%d") AND name~'^auto-.*'" --uri | while read -r SNAPSHOT_URI; do
    gcloud compute snapshots delete --quiet "$SNAPSHOT_URI"
done
Also note that for OSX users you have to use something like
$(date -j -v-60d "+%Y-%m-%d")
for the creationTimestamp filter.
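For example, on OSX the snapshot-listing command above becomes (only the date invocation changes):
gcloud compute snapshots list --filter="creationTimestamp<$(date -j -v-60d "+%Y-%m-%d") AND name~'^auto-.*'" --uri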
Upvotes: 11
Reputation: 110
There's also a third-party service called VMPower.io which can automate the capture, retention and restore of snapshots for Google Cloud. It isn't free, but it will do what you're looking for without having to code anything.
Upvotes: -1
Reputation: 1
If nothing else, I know that [--set-scheduling]
is a situational gcloud flag, and there's a wait [process]
construct that will prevent the current command from executing until that process is complete. Combine that with the &&
operator (which executes the next command in a statement only after the previous one completes), and stringing this sucker together shouldn't be too hard. Just run it at startup (when you create an instance it has a startup command option) and have it count time, or make one of the regular maintenance functions trigger the commands. But honestly, why mix syntax if you don't have to?
This could work (don't copy/paste):
gcloud config set compute/zone wait [datetime-function] && \
gcloud compute disks snapshot snap1 snap2 snap3 \
--snapshot-names ubuntu12 ubuntu14 debian8 \
--description=\
'--format="multi(\
info:format=list always-display-title compact,\
data:format=list always-display-title compact\
)"'
In theory gcloud will set the compute/zone but will have to wait until the time specified. Because of the double ampersand (&&), the next command will not execute until the first command is complete. I may have gone overboard on the description, but I did so to show the simplicity of it. I know it won't work as is, but I also know I'm not that far off.
Wow, after looking at all the code, one might believe we're attempting to solve the immortality sequence. I don't think working it out in a bash file is the best way: gcloud made the command line for people who don't know the command line. We've been taught (or learned... or haven't learned yet) to write code in a way that's proper for the environment. I say we apply that here and use the Cloud SDK to our advantage.
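For what it's worth, the && chaining itself is sound. A minimal corrected sketch of just that part, reusing the disk and snapshot names from the example above (the zone is a placeholder, and --snapshot-names takes a comma-separated list):
gcloud config set compute/zone us-central1-a && \
gcloud compute disks snapshot snap1 snap2 snap3 \
    --snapshot-names ubuntu12,ubuntu14,debian8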
Upvotes: 0
Reputation: 185
The script assumes $HOSTNAME is the same as the disk name (my primary system disk takes the same name as the VM instance, i.e. $HOSTNAME -- change to your liking). Ultimately, wherever it says $HOSTNAME, it needs to point to the system disk on your VM.
gcloud creates incremental diff snapshots. The oldest will contain the most information, so you do not need to worry about creating a full snapshot: deleting the oldest snapshot makes the new oldest snapshot the base that future incrementals build from. This is all done with Google-side logic, so it is automatic as far as gcloud is concerned.
We have this script set to run from a cron job every hour. It creates an incremental snapshot (about 1 to 2 GB) and deletes any snapshots older than the retention period. Google transparently resizes the oldest remaining snapshot (which was previously an incremental) into the base snapshot. You can test this by deleting the base snapshot and refreshing the snapshot list (console.cloud.google.com) -- the rebasing occurs in the background, and you may need to give it a minute or so. Afterwards, you'll notice the oldest snapshot is the base, and its size reflects the used portion of the disk being snapshotted.
#!/bin/bash
. ~/.bash_profile > /dev/null 2>&1 # source environment for cron jobs
retention=7 #days
zone=$(gcloud info | grep zone: | awk -F\[ '{print $2}' | awk -F\] '{print $1}')
date=$(date +"%Y%m%d%H%M")
expire=$(date -d "-${retention} days" +"%Y%m%d%H%M")
# Delete snapshots older than $expire
gcloud compute snapshots list --regexp "(${HOSTNAME}-.*)" --uri | while read -r line
do
    # the snapshot name is the last component of the URI
    snapshot="${line##*/}"
    # the date stamp is the last dash-separated field of the name
    snapdate="${snapshot##*-}"
    if (( snapdate <= expire )); then
        gcloud compute snapshots delete "$snapshot" --quiet
    fi
done
# Create New Snapshot
gcloud compute disks snapshot "$HOSTNAME" --snapshot-names "${HOSTNAME}-${date}" --zone "$zone" --description "$HOSTNAME disk snapshot ${date}"
Upvotes: 2
Reputation: 730
The following is a very rough Ruby script to accomplish this task. Please consider it just an example to take inspiration from.
Any feedback to improve it is welcome ;-)
require 'date'

ARCHIVE = 30 # days to keep snapshots
DISKS = [] # the names of the disks to snapshot
FORMAT = '%y%m%d'

today = Date.today
date = today.strftime(FORMAT).to_i
limit = (today - ARCHIVE).strftime(FORMAT).to_i

# Backup: snapshot all disks at once, naming each snapshot <disk>-<yymmdd>
`gcloud compute disks snapshot #{DISKS.join(' ')} --snapshot-names #{DISKS.map { |disk| "#{disk}-#{date}" }.join(',')}`

# Rotate: collect the snapshots whose trailing date stamp is older than the limit
snapshots = []
`gcloud compute snapshots list`.split("\n").each do |row|
  name = row.split.first
  next if name.nil? || name == 'NAME'
  snapshots << name if name[-6, 6].to_i < limit
end
`yes | gcloud compute snapshots delete #{snapshots.join(' ')}` unless snapshots.empty?
Upvotes: 1