Reputation: 251
There is what seems to be a plethora of version control systems. Therefore, to draw a bad conclusion, it must be easy to write one.
What are some issues that must be considered in order to write a simple file versioning system? (What are the minimum necessary functions?)
Is it a feasible task for one person?
Upvotes: 25
Views: 25719
Reputation: 59811
What could give you a good overview in a less technical manner is The Git Parable. It is a nice abstraction on the principles of git, but it gives a very good understanding what a VCS should be able to perform. All things beyond this are rather "low-level" decisions.
Upvotes: 4
Reputation: 169573
How simple?
You could arguably write a version control system with a single-line shell script, upversion.sh:
cp $WORKING_COPY $REPO/$(date +"%s")
For large binary assets, that is basically all you need! It could be improved quite easily, say by making the version folders read-only, perhaps recording metadata with each version (you could have a text file at $REPO/$(date...).meta
for example)
That sounds like a huge simplification, but it's not far of the asset-management-systems many film post-production facilities use (for example)
You really need to know what you wish to version, and why..
With large-binary assets (video, say), you need to focus on tools to visually compare versions. You also probably need to deal with dependancies ("I need image123.jpg and video321.avi to generate this image")
With code, you need to focus on things like making diff's between any two versions really easy. Also since edits to source-code are usually small (a few characters from a project with many thousands of lines), it would be horribly inefficient to copy the entire project for each version - so you only store the differences between each version (delta encoding).
To version a database, you probably want to store information on the schema, tracking new tables, or columns, or adjustments to existing ones (rather than calculating deltas of the database files, or making copies like the previous two systems)
There's no perfect way to version everything, you have to focus on doing one thing well.. Git is great for text, but not for binary files. Adobe Version Cue is great with binary files (images), but useless for text..
I suppose the things to consider can be summarised as..
Upvotes: 14
Reputation: 8990
If you're Linus Torvalds, you can write something like Git in a month.
But "a version control system" is such a vague and stretchable concept, that your question is really unanswerable.
I'd consider asking yourself what you want to achieve (learn about VCS, learn a language, ...) and then define some clear goal. It's good to have a project, but it's also good to have a reachable goal in a small amount of time. Small successes are good for your morale.
Upvotes: 9
Reputation: 1324505
Have a look in the question "core concepts" about (D)VCS.
In short, writing a VCS would involve making a decisions about each of these core concepts (Central vs. Distributed, linear vs. DAG, file centric vs. repository centric, ...)
Not a "quick" project, I believe ;)
Upvotes: 9
Reputation:
The very minimal necessary prerequisite is an exhaustive and accurate test suite. Nobody (including you) will want to use your new system unless you can demonstrate that it works, reliably and completely error free.
Upvotes: 0
Reputation: 12803
To write a proof of concept, you probably could pull it off, implementing or borrowing the tools Alan mentions.
IMHO, the most important aspect of a VCS is ease-of-use. This sounds like an odd statement, but when you think about it, hard drive space is one of the easiest IT commodities to scale horizontally, so bad compression or even real sloppy deltas are going to be tolerated. The main reason people demand improvement in versioning systems is to do common tasks more intuitively or to support more features that droves of people eventually demand but that weren't obvious before release. And since versioning tools tend to be monolithic and thoroughly integrated at a company, the cost to switch is high, and it may not be possible to support a new feature without breaking an existing repo.
Upvotes: 0
Reputation: 84169
That IS really a bad conclusion. My personal opinion here is that the problem domain is so wide and generally hard that nobody has gotten it "right" yet, thus people try to solve it over and over again, from different angles and under different assumptions.
That of course doesn't mean you shouldn't try. Just be warned that many smart people were there before you, so you should do your homework.
Upvotes: 3
Reputation: 405765
A good place to learn about version control is Eric Sink's Weblog. His most recent article is Time and Space Tradeoffs in Version Control Storage, for one example.
Another good example is his series of articles Source Control HOWTO. Yes, it's all about how to use source control, but it has a lot of information about the decisions and tradeoffs developers have to make when designing the system. The best example of this is probably his article on Repositories, where he explains different methods of storing versions. I really learned a lot from this series.
Upvotes: 31
Reputation: 20053
A simple one is doable by one person for a learning opportunity. One issue you might consider is how to efficiently store plain text deltas. A very popular delta format is the one from RCS (used by many version control programs). You might want to study it to get ideas.
Upvotes: 1
Reputation: 74242
A good delta algorithm, good compression and network efficiency.
Upvotes: 2