Reputation: 89799
I am a big fan of using apache-digester to load XML files into my object model.
I am dealing with large files that contain many duplicates (event logs), and would therefore like to String.intern() the strings for specific attributes (the ones that frequently repeat).
Since Apache-Digester reads the whole file before relinquishing control, it initially generates a lot of duplicates that eat up a lot of memory; I can then go and iterate over all my objects and intern, but I still pay the cost of using up lots of memory.
Another alternative is to have the corresponding setProperty bean method in my object model always intern its parameter, but I call the same method from my own code with already-interned strings, so that would be wasteful; besides, I don't want to introduce Digester-specific code into my model.
Is there a way to get Digester to intern or execute custom code before/after setting properties?
Upvotes: 1
Views: 373
Reputation: 5569
You can create your own Digester Rule to accomplish this:
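// Digester 1.x/2.x package; for Digester 3 use org.apache.commons.digester3
import org.apache.commons.digester.BeanPropertySetterRule;

public class InternRule extends BeanPropertySetterRule
{
    public InternRule( String propertyName )
    {
        super( propertyName );
    }

    @Override
    public void body( String namespace, String name, String text )
        throws Exception
    {
        // Trim before interning: the superclass trims the body text itself, and
        // String.trim() only returns the same instance when there is nothing to trim.
        super.body( namespace, name, text.trim().intern() );
    }
}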
Instead of doing:
digester.addBeanPropertySetter( "book/author", "author" );
You would do this:
digester.addRule( "book/author", new InternRule( "author" ) );
Depending on which Digester method you're using, there are other rule classes you can subclass in the same way (SetPropertyRule, CallMethodRule, etc.).
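For values that arrive as XML attributes rather than element bodies, the same idea could be applied by interning the attribute values before the rule reads them. Below is a minimal sketch that uses only the JDK's SAX classes. `InternedAttributes` and `internValues` are hypothetical names, not part of Digester, and it is an assumption that a subclass of the attribute-handling rule would pass the result on to its superclass's begin method.

```java
import java.util.Collections;
import java.util.Set;

import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical helper (not part of Digester): copies a SAX Attributes object,
// interning the values of the attributes whose qualified names are listed.
public final class InternedAttributes {

    private InternedAttributes() {
    }

    public static Attributes internValues(Attributes attrs, Set<String> qNamesToIntern) {
        AttributesImpl copy = new AttributesImpl();
        for (int i = 0; i < attrs.getLength(); i++) {
            String value = attrs.getValue(i);
            if (qNamesToIntern.contains(attrs.getQName(i))) {
                value = value.intern(); // share one instance per distinct value
            }
            copy.addAttribute(attrs.getURI(i), attrs.getLocalName(i),
                    attrs.getQName(i), attrs.getType(i), value);
        }
        return copy;
    }

    public static void main(String[] args) {
        AttributesImpl raw = new AttributesImpl();
        // new String(...) stands in for a freshly parsed, non-interned value
        raw.addAttribute("", "level", "level", "CDATA", new String("INFO"));

        Attributes interned = internValues(raw, Collections.singleton("level"));

        System.out.println(raw.getValue("level") == "INFO");      // false
        System.out.println(interned.getValue("level") == "INFO"); // true
    }
}
```

Only the listed attributes are interned, so values that rarely repeat do not end up in the string pool.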
Upvotes: 3