Reputation: 20571
Short version
We need to distribute nightly builds to 70+ people each morning. I would like to use git to load-balance the transfer, and I would like to know about any tips, pitfalls, or flaws with the idea before I start designing the system.
Long version
Each morning we need to distribute our nightly build to the studio of 70+ people (artists, testers, programmers, production etc). Until now we have copied the build to a server and used a sync program we wrote that fetches it (using Robocopy underneath). Even with mirrors set up, the transfer speed is unacceptably slow, taking up to an hour or longer to sync at peak times (off-peak is roughly 15 minutes), which points to a hardware I/O bottleneck.
A brilliant (though definitely not original) idea I had was to distribute the load throughout the studio. After investigating writing a client using the infamous BitTorrent protocol, it occurred to me that I could just use git: by design it would give us both distribution of the build and revision management, with the added benefit of being serverless.
Questions
1. How do you get started using git? I have experience with centrally located source-control systems like Perforce and SVN. Reading the documentation, it appears that all you need to do is run git init path\to\folder on one machine and then run git clone url on another? (A sketch of what I imagine follows this list.)
2. Where do I get the url for the above git clone command? Can I define it myself? I find the concept of having a URL strange, as git does not have a central server - or does it? (e.g. similar to a BitTorrent tracker?)
3. What would be the better option for identifying builds: changelist numbers or labels?
4. Is it possible to limit the number of revisions stored? This would be useful because, in addition to the nightly builds, we also have several CI builds throughout the day that we want to distribute, and it does not make sense to have an infinite number of revisions lingering around. In Perforce you can limit the revisions by setting a property.
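Concretely, I imagine the setup looking something like this (the paths and commit messages are made up for illustration, and I am assuming a Git Bash style shell on the build machine):

    # One-time setup on the build machine: turn the build output into a repo
    cd /d/builds/nightly
    git init
    git add -A
    git commit -m "nightly build"

    # Each subsequent night: overwrite the files with the new build, record it
    git add -A
    git commit -m "nightly build (next day)"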
Upvotes: 9
Views: 1364
Reputation: 1147
Yes, that's the essence of it. Create a repository somewhere and then you can clone it from somewhere else.
The repository you init in 1) has to be accessible from the machine you're cloning to. Git is server-less, but every repository has to get its content from somewhere, so all of your 70+ machines will have to know where to get the new build. And if you want to distribute the load, you'll have to figure out a strategy for who gets their update from whom.
The URL could be a file path, a network path, an SSH host with a path, etc.
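For example (the host names and paths below are made up, purely to show the forms):

    git clone D:/repos/nightly                    # local file path
    git clone //buildserver/share/nightly         # network (UNC) path
    git clone user@buildserver:/srv/git/nightly   # SSH host with path

    # To spread the load, a machine that already has today's build can act
    # as a peer: point another machine's remote at it instead of the server.
    git remote set-url origin //peer-machine/share/nightly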
Tags would work well.
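For instance (the tag name is illustrative):

    # On the build repo, after committing the night's build:
    git tag build-0423

    # On a client, fetch tags and check out a specific build by name:
    git fetch --tags origin
    git checkout build-0423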
You could perhaps rebase the git repo to remove old revisions. See Completely remove (old) git commits from history
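As an alternative to rewriting history, shallow clones keep old revisions from piling up on the clients in the first place (a sketch, assuming the default master branch; note that older git versions place restrictions on what shallow repositories can do):

    # Initial setup on each client: keep only the newest revision
    git clone --depth 1 //buildserver/share/nightly

    # Each morning: fetch just the newest commit and move the work tree to it
    git fetch --depth 1 origin
    git reset --hard origin/master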
However, I don't think it will solve your original problem of distributing the load. Other avenues should be investigated, multicast copying for instance; perhaps MQcast and MQcatch can help you?
Upvotes: 1
Reputation: 2861
I don't think that git will be really helpful in your situation. Yes, it is distributed, but not in the sense of "distributing something to as many people as possible". It does not help you reduce the bandwidth load, and there will be additional load if you use git over SSH. Maybe you should take a step back and give the BitTorrent protocol another chance.
Upvotes: 4
Reputation: 1325966
Note: putting binaries in a distributed repo isn't a solution that will scale well over time, as the repo gets bigger and bigger (you have alternative git setups here).
The advantage is that the delta is computed by the central Git repo (which will be much faster than a robocopy) and sent as the answer to a git fetch done by a downstream repo.
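So the daily update on each downstream machine would reduce to something like this (branch name assumed):

    git fetch origin                 # only the delta travels over the wire
    git reset --hard origin/master   # update the working copy to the new build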
Upvotes: 1