opticyclic

Reputation: 8126

Cache Credentials During SVN Merge

A merge from a feature branch to trunk took over 45 minutes to complete. The merge included a whole lot of jars (~250MB). However, when I did it on the server with the file:// protocol, the process took less than 30 seconds.

SVN is being served up by Apache over https.

The version of SVN on the server is

svn, version 1.6.12 (r955767)
   compiled Sep  3 2013, 17:49:49

My local version is

svn, version 1.7.7 (r1393599)
   compiled Oct  8 2012, 20:42:17

The Apache logs show that the merge made over 10k requests, and apparently each of these requests went through an authentication layer.

Is there a way to configure the server so that it caches the credentials for a period and doesn't make so many authentication requests?

I guess the tricky part is making sure the credentials are only cached for the life of a single svn 'request'. If svn merge makes lots of unique individual https requests, how would you determine how long to store the credentials for without adding potential security holes?

Upvotes: 0

Views: 177

Answers (2)

Ben Reser

Reputation: 5765

First of all, I'd strongly suggest you upgrade the server to version 1.7 or 1.8, since 1.7 and newer servers support an updated version of the protocol that requires fewer requests for many actions.

Second, if you're using path-based authorization you probably want SVNPathAuthz short_circuit in your configuration. Without it, authorization checks for secondary paths (i.e. paths not in the request URI), which come up in many recursive requests (especially log), run back through the entire Apache httpd authentication infrastructure. With the setting, instead of running the entire httpd authentication/authorization infrastructure, we simply ask mod_authz_svn to authorize the action against the path. Running through the entire httpd infrastructure can be especially painful if you're using LDAP and it needs to go back to the LDAP server to check credentials. The only reason not to use the short_circuit setting is if you have some other authentication module that depends on the path; I've yet to see an actual setup like this in the wild, though.
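
As a rough sketch of where that directive goes, assuming a typical mod_dav_svn setup with basic authentication (the /svn location and every file path below are hypothetical placeholders, not details from the question):

```
<Location /svn>
    DAV svn
    # Hypothetical parent directory holding the repositories
    SVNParentPath /var/svn/repos

    # Path-based authorization rules (hypothetical file)
    AuthzSVNAccessFile /etc/apache2/svn-authz.conf

    # Ask mod_authz_svn directly about secondary paths instead of
    # re-running the whole httpd authentication/authorization chain
    SVNPathAuthz short_circuit

    AuthType Basic
    AuthName "Subversion"
    # Hypothetical password file; an LDAP setup would use its own directives here
    AuthUserFile /etc/apache2/svn-passwd
    Require valid-user
</Location>
```

Only the SVNPathAuthz line is the change being suggested; the rest is there to show context and should match whatever your existing block already contains.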

Finally, if you are using LDAP then I suggest you configure the caching of credentials, since this can greatly speed up authentication. Apache httpd provides the mod_ldap module for this, and I suggest you read its documentation.
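
A minimal sketch of the mod_ldap cache settings, assuming LDAP authentication is already working through mod_authnz_ldap. The numbers are illustrative only, and these directives belong at the server level rather than inside a Location block:

```
# Size in bytes of the shared-memory cache
LDAPSharedCacheSize 500000

# Search/bind cache: number of entries and lifetime in seconds
LDAPCacheEntries 1024
LDAPCacheTTL 600

# Operation (compare) cache: number of entries and lifetime in seconds
LDAPOpCacheEntries 1024
LDAPOpCacheTTL 600
```

The TTL values are what bound how long a cached credential is trusted, so they are the knobs to tune if you're worried about the security side: a shorter TTL means more round trips to the LDAP server, a longer one means a changed or revoked password takes longer to be noticed.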

If you provide more details of the server side setup I might be able to give more tailored suggestions.

The comments suggesting that you not put jars in the repository are valuable, but with some configuration improvements you can help resolve some of your slowness anyway.

Upvotes: 2

David W.

Reputation: 107040

The merge included a whole lot of jars (~250MB)

That's your problem! If you go through your network via http://, you have to send those jars via http://, and that can be painfully slow. You can increase the cache size of Apache httpd, or you can set up a parallel svn:// server, but you're still sending 1/4 gigabyte of jars through the network. That's why file:// was so much faster.

You should not be storing jars in your Subversion repository. Here's why:

Version control gives you a lot of power:

  • It helps you merge differences between branches.
  • It helps you follow the changes taking place.
  • It helps identify a particular change and why a particular change took place.

Storing binary files like jars gives you none of that. You can't merge binary files, and you can't track their changes.

Not only that, but version control systems usually use diffs to track changes. This saves a lot of space. Imagine a 1 kilobyte text file. In 5 revisions, six lines are changed. Instead of taking up 6K of space, only 1K plus those six changes are stored.

When you store a jar, and then a new version of that jar, you can't easily do a diff, and since the jar format is zip, you can't really compress them either. Store five versions of a jar in Subversion, and you store pretty close to five times the size of that jar. If a jar file is 10K, you're using 50K of space for that jar.

So, not only do jar files take up a lot of space while giving you none of the power of version control, they can quickly take over your repository. I've seen sites where over 90% of an 8 gigabyte repository is nothing but compiled code and third party jars. And the useful life of these binary files is really quite limited too. So, in these places, 80% of their Subversion repository is wasted space.

Even worse, you tend to lose track of where you got that jar and what is in it. When users put in a jar called commons-beans.jar, I don't know what version that jar is, whether that jar was built by someone, and whether it was somehow munged by that person. I've seen users merge two separate jars into a single jar for ease of use. If someone calls that jar commons-beanutils-1.5.jar because it was version 1.5, it's very likely that someone will update it to version 1.7 but not change the name. (It would affect the build, you'd have to add and delete; there is always some reason.)

So, there's a massive amount of wasted space with little benefit and almost no information. Storing jars is just plain bad news.

But your build needs jars! What should you do?

Get a jar repository like Nexus or Artifactory. Both of these repository managers are free and open source.

Once you store your jars in there, you can fetch the revision of the jar you want through Maven, Gradle, or, if you use Ant and want to keep your Ant build system, Ivy. If you don't feel like being that fancy, you can also fetch the jars via an Ant <get/> task. If you use Jenkins, Jenkins can easily deploy the built jars to your Maven repository for other projects to use.

So, get rid of the jars. Merging will then be a simple diff between text files. Merging branches will be much quicker, and less information has to be sent over the network. If you don't want to switch to Maven, then use Ivy, or simply update your builds with the <get> task to fetch the jars and the versions you need.

Upvotes: -1
