Dennis

Reputation: 20571

Using Git to distribute nightly builds to a studio

Short version

Need to distribute nightly builds to 70+ people each morning, would like to use git to load balance the transfer, and would like to know if there are tips, pitfalls, or flaws with the idea before I start designing the system.

Long version

Each morning we need to distribute our nightly build to the studio of 70+ people (artists, testers, programmers, production etc.). Up until now we have copied the build to a server and have written a sync program that fetches it (using Robocopy underneath); even with setting up mirrors, the transfer speed is unacceptably slow, taking up to an hour or longer to sync at peak times (off-peak times are roughly 15 minutes), which points to a hardware I/O bottleneck.

A brilliant (though definitely not original) idea I had was to distribute the load throughout the studio. After investigating writing a client using the infamous BitTorrent protocol, it occurred to me that I could just use Git: by design it would give us both distribution of the build and revision management, with the added benefit of being serverless.

Questions

  1. How do you get started using Git? I have experience with centrally located source-control systems like Perforce and SVN. Reading the documentation, it appears that all you need to do is run git init path\to\folder and then on another machine run git clone url?

  2. Where do I get the URL for the above git clone command? Can I define it myself? I find the concept of having a URL strange as Git does not have a central server - or does it? e.g. similar to a BitTorrent tracker?

  3. What would be the better option to identify builds: changelist numbers or labels?

  4. Is it possible to limit the number of revisions stored? This would be useful as, in addition to the nightly builds, we also have several CI builds throughout the day that we want to distribute, and it does not make sense to have an infinite number of revisions lingering around. In Perforce you can limit the revisions by setting a property.

Upvotes: 9

Views: 1364

Answers (3)

henriksen

Reputation: 1147

  1. Yes, that's the essence of it. Create a repository somewhere and then you can clone it from somewhere else.

  2. The repository you init in 1) has to be accessible from the machine you're cloning to. Git is server-less but every repository has to get its content from somewhere, so all of your 70+ machines will have to know where they should get the new build. And if you want to distribute the load you'll have to figure out a strategy for who gets their update from whom.

    The URL could be a file path, a network path, an SSH host with a path, etc. (see the sketch after this list).

  3. Tags would work well.

  4. You could perhaps rebase the git repo to remove old revisions. See Completely remove (old) git commits from history
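For reference, a minimal sketch of what the publishing side of 1)-3) could look like; /path/to/share/nightly.git, /path/to/build-output and the tag name are only placeholders for whatever location and naming scheme you choose (commands assume a POSIX shell such as Git Bash):

    # one-time setup: create a bare repository on a location every machine can reach,
    # then clone it into the folder where the nightly build output is produced
    git init --bare /path/to/share/nightly.git
    git clone /path/to/share/nightly.git /path/to/build-output

    # each morning, from /path/to/build-output after the new build has been copied in
    git add .
    git commit -m "nightly build"
    git tag nightly-YYYY-MM-DD
    git push --tags origin master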

However, I don't think it will solve your original problem of distributing the load. Other avenues should be investigated, such as multicast copying; perhaps MQcast and MQcatch can help you?

Upvotes: 1

Dmitry Maksimov

Reputation: 2861

I don't think that Git will really be helpful in your situation. Yes, it is distributed, but not in the sense of "distributing something to as many people as possible". It does not help you reduce the bandwidth load, and there will be additional load if you use Git over SSH. Maybe you should take a step back and give the BitTorrent protocol another chance.

Upvotes: 4

VonC

Reputation: 1325966

  1. you can use the file protocol ("local protocol") if all your clients have shared access to your server.
  2. if you can do a dir or ls from your client... you've got the URL you need.
  3. tags: once you clone a repo, you can check it out at a given tag (see the sketch after this list).
  4. not really; you will fetch the new commits every morning in order to get a full history.
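
A sketch of the workstation side of points 1-3, with the share path and tag name again standing in as placeholders:

    # first time on each workstation: clone over the local/file protocol
    git clone /path/to/share/nightly.git nightly

    # every morning: fetch the new commits and tags, then check out the wanted build
    cd nightly
    git fetch --tags
    git checkout nightly-YYYY-MM-DD   # detached HEAD, which is fine for a read-only build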

Note: putting binaries in a distributed repo isn't a solution that will scale well over time, as the repo gets bigger and bigger (you have alternative Git setups here).
The advantage is that the delta is computed by the central Git repo (which will be much faster than a robocopy) and that said delta is sent as the answer to a git fetch done by a downstream repo.

Upvotes: 1
