Andrew

Reputation: 43123

Publish to S3 using Git?

Does anyone know how to do this? So far I haven't been able to find anything useful via Google.

I'd really like to set up a local repo and use git push to publish it to S3, the idea being to have local version control over assets but remote storage on S3.

Can this be done, and if so, how?

Upvotes: 101

Views: 64453

Answers (10)

b01

Reputation: 4384

You can also do this using the AWS CLI and Git (with hooks). Verified working on Windows 10; it should also work on Linux and Mac.

Set up sync to S3 on commit

For this example, we have an existing project in a directory called "myproject".

  1. Install AWS CLI.

  2. Set up IAM programmatic access credentials (you can limit access to S3, and even down to just the bucket).

  3. Configure AWS CLI with the credentials.

  4. Create the S3 bucket in the AWS console, or on the CLI (a CLI sketch covering steps 3-5 follows this list).

  5. Ensure the bucket is private.

  6. Make a new bare git repo of your existing git project:

    mkdir myproject.git
    cd myproject.git
    git init --bare
    

    NOTE: The bare repo serves as the upstream, and it contains only the committed changes you want to upload to the S3 bucket, not ignored files, local git configuration, etc.

  7. Install the following hook as hooks/post-update inside the bare myproject.git directory (and make it executable):

    #!/bin/sh
    # (On Windows, Git for Windows runs hooks with its bundled sh.exe.)
    # Sync the contents of the bare repo to an S3 bucket.
    aws s3 sync . s3://myproject/ --delete
    

    Note: The --delete option makes sure that files deleted locally are also deleted from the bucket. The --exact-timestamps option can optimize uploading.

    --exact-timestamps (boolean) When syncing from S3 to local, same-sized items will be ignored only when the timestamps match exactly. The default behavior is to ignore same-sized items unless the local version is newer than the S3 version.

    --delete (boolean) Files that exist in the destination but not in the source are deleted during sync.

    See the aws s3 sync documentation for more details and options.

  8. Update the hook with the correct S3 bucket name.

  9. Now add the bare repo as an upstream to the repo you want to sync to S3:

    cd myproject
    git remote add s3 path/to/bare/directory/myproject.git 
    

    NOTE: You can use a relative path for the path to the bare directory.

  10. Perform the initial push to the bare repo, which will copy all the committed files to the bare repo, then perform the initial S3 sync:

    git push -u s3 main
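
A minimal sketch of setup steps 3-5 as CLI commands (the bucket name "myproject" and the use of aws s3api to block public access are my assumptions; adjust to your account and region):

    # configure the AWS CLI with your IAM programmatic credentials
    aws configure
    # create the bucket and keep it private by blocking all public access
    aws s3 mb s3://myproject
    aws s3api put-public-access-block --bucket myproject \
        --public-access-block-configuration BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true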
    

Testing

  1. Make changes to your repo and commit.
  2. Push the changes to the s3 upstream.
  3. You should see the changes sync to the S3 bucket you specified.
  4. Verify it worked by:
    1. viewing the S3 bucket in the AWS Management Console, or
    2. downloading the files and checking that the changes are there.

Upvotes: 13

Yevgeny Streltsov

Reputation: 56

This can be done with IDrive e2 (idrive.com); the free tier is 10 GB as of 2023.

Mount the bucket somewhere like ~/s3-bucket, then create a bare repo in it:

git clone --bare . ~/s3-bucket/my_git_repo.git

Then you can git push to it; and once the S3 bucket is mounted on another host, git pull your files there.
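
A minimal sketch of the push/pull workflow, assuming the bucket is already mounted at ~/s3-bucket (see the mounting guides below) and that your branch is named main:

# in your working repo: point a remote at the bare repo on the mount
git remote add s3 ~/s3-bucket/my_git_repo.git
git push s3 main

# on another host with the same bucket mounted:
git clone ~/s3-bucket/my_git_repo.git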

Instructions for mounting an S3 bucket can be found in:

  1. "A Guide on How to Mount Amazon S3 as a Drive for Cloud File Sharing" by NAVIKO Team. Very useful info, even if you are using the second reference from below.
  2. Specifically IDrive instructions: "How do I use S3FS with IDrive"

This won't work in the Google Colab free tier, since sudo privileges are needed to set up S3FS; but if you can pay for the S3 bucket to be set public, you can access it via HTTP without setting up S3FS.

Watch your micropayments.

Upvotes: 0

Preston Frasch

Reputation: 173

Perhaps use s3 sync from the awscli.

If you want to ignore the same local files as you do when you push to a remote repository, use the --exclude flag. This approach was encouraged by some of the AWS internal training, and it works, but sync includes everything in your folder (including __pycache__ and any other files you want ignored) unless you list them with the --exclude flag. If you prefer this method, you can write a script with a .sh extension containing a huge series of --exclude flags covering all the files/directories you want to ignore.

aws s3 sync . s3://fraschp/mizzle/ \
    --exclude ".git/*" \
    --exclude "*__pycache__/*" \
    --exclude "*.csv"

More information about the syntax or rationale, especially about include/exclude, is available in the docs.

I like this vanilla approach because I don't have to install anything and it complies with any security considerations baked into s3 tooling.

Upvotes: 0

Sorter

Reputation: 10220

You need JGit for it.

Just save a .jgit file with your AWS credentials in your home directory, and you can use git with S3.

Here is what your git URL will look like:

amazon-s3://.jgit@mybucket/myproject.git

You can do everything with jgit that you can do with git.
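
For example, wiring it up might look like this (a sketch reusing the bucket and project names from the URL above):

git remote add origin amazon-s3://.jgit@mybucket/myproject.git
jgit push origin master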

Get a complete setup guide here.

https://metamug.com/article/jgit-host-git-repository-on-s3.html

Upvotes: -1

ChatGPT

Reputation: 5617

Do you version control your files with GitHub? This script (and its associated GitHub / AWS configurations) will take new commits to your repo and sync them into your S3 bucket.

https://github.com/nytlabs/github-s3-deploy

Upvotes: -1

Riceball LEE

Reputation: 1591

1. Use JGit, via http://blog.spearce.org/2008/07/using-jgit-to-publish-on-amazon-s3.html

Download jgit.sh, rename it to jgit and put it in your path (for example $HOME/bin).

Setup the .jgit config file and add the following (substituting your AWS keys):

$ vim ~/.jgit

accesskey: aws access key
secretkey: aws secret access key

Note, by not specifying acl: public in the .jgit file, the git files on S3 will be private (which is what we wanted). Next create an S3 bucket to store your repository in, let’s call it git-repos, and then create a git repository to upload:

s3cmd mb s3://git-repos
mkdir chef-recipes
cd chef-recipes
git init
touch README
git add README
git commit README
git remote add origin amazon-s3://.jgit@git-repos/chef-recipes.git

In the above I’m using the s3cmd command line tool to create the bucket but you can do it via the Amazon web interface as well. Now let’s push it up to S3 (notice how we use jgit whenever we interact with S3, and standard git otherwise):

jgit push origin master

Now go somewhere else (e.g. cd /tmp) and try cloning it:

jgit clone amazon-s3://.jgit@git-repos/chef-recipes.git

When it comes time to update it (because jgit doesn’t support merge or pull) you do it in 2 steps:

cd chef-recipes
jgit fetch
git merge origin/master

2. Use a FUSE-based file system backed by Amazon S3 (s3fs)

  1. Get an Amazon S3 account!

  2. Download, compile and install. (see InstallationNotes)

  3. Specify your Security Credentials (Access Key ID & Secret Access Key) by one of the following methods:

    • using the passwd_file command line option

    • setting the AWSACCESSKEYID and AWSSECRETACCESSKEY environment variables

    • using a .passwd-s3fs file in your home directory

    • using the system-wide /etc/passwd-s3fs file

  4. Mount the bucket:

    /usr/bin/s3fs mybucket /mnt

That's it! The contents of your Amazon bucket "mybucket" should now be accessible read/write in /mnt.
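
A minimal sketch of the credentials-file method (the key values are placeholders):

# ~/.passwd-s3fs holds ACCESS_KEY_ID:SECRET_ACCESS_KEY
echo 'AKIAXXXXXXXXXXXXXXXX:yourSecretAccessKey' > ~/.passwd-s3fs
chmod 600 ~/.passwd-s3fs   # s3fs refuses credential files with loose permissions
/usr/bin/s3fs mybucket /mnt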

Upvotes: 54

koolhead17

Reputation: 1964

You can use mc, aka the Minio client; it's written in Golang and available under the open-source Apache License. It is available for Mac, Linux, Windows, and FreeBSD. You can use the mc mirror command to achieve this.

mc GNU/Linux Download

64-bit Intel from https://dl.minio.io/client/mc/release/linux-amd64/mc
32-bit Intel from https://dl.minio.io/client/mc/release/linux-386/mc
32-bit ARM from https://dl.minio.io/client/mc/release/linux-arm/mc
$ chmod +x mc
$ ./mc --help

Configuring mc for Amazon S3

$ mc config host add mys3 https://s3.amazonaws.com BKIKJAA5BMMU2RHO6IBB V7f1CwQqAcwo80UEIJEjc5gVQUSSx5ohQ9GSrr12
  • Replace these with your own access/secret keys
  • By default, mc uses signature version 4 of Amazon S3.
  • mys3 is the Amazon S3 alias for the Minio client

Mirror your local GitHub repository/directory (say, named mygithub) to the Amazon S3 bucket named mygithubbkp:

$ ./mc mirror mygithub mys3/mygithubbkp
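
To keep the bucket continuously in sync as files change, mc also has a watch mode (a sketch, using the same alias and names as above):

$ ./mc mirror --watch mygithub mys3/mygithubbkp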

Hope it helps. Disclaimer: I work for Minio.

Upvotes: 3

Jayaprakash

Reputation: 1403

You can use the Deploybot (http://deploybot.com/) service, which is free for a single git repository.

You can automate the deployment by choosing "automatic" in the deployment mode section.

I am using it now. It is very easy and useful.

Upvotes: 1

schickling

Reputation: 4230

git-s3 - https://github.com/schickling/git-s3

You just have to run git-s3 deploy

It comes with all the benefits of a git repo and uploads/deletes just the files you've changed.
Note: deploys aren't implicit via git push, but you could achieve that via a git hook (see the sketch below).
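
A minimal sketch of such a hook (the choice of pre-push, and git-s3 being on your PATH, are assumptions):

#!/bin/sh
# .git/hooks/pre-push: deploy to S3 on every push
git-s3 deploy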

Upvotes: 10

scttnlsn

Reputation: 3026

Dandelion is another CLI tool that will keep Git repositories in sync with S3/FTP/SFTP: http://github.com/scttnlsn/dandelion
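
A minimal sketch of its use (the config keys follow the project README as I recall it; treat them as assumptions and check the repo):

# create dandelion.yml in the repo root
cat > dandelion.yml <<'EOF'
adapter: s3
access_key_id: YOUR_KEY
secret_access_key: YOUR_SECRET
bucket_name: your-bucket
EOF

# then, after committing:
dandelion deploy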

Upvotes: 11
