ojblass
ojblass

Reputation: 21620

Faster alternatives to replace method in a Java String?

The fact that the replace method returns a string object rather than replacing the contents of a given string is a little obtuse (but understandable when you know that strings are immutable in Java). I am taking a major performance hit by using a deeply nested replace in some code. Is there something I can replace it with that would make it faster?

Upvotes: 15

Views: 32108

Answers (10)

Mykhaylo Adamovych
Mykhaylo Adamovych

Reputation: 20966

Apache commons stuff, based on StringBuilder

StringUtils.replace(String text, String searchString, String replacement)

Upvotes: 0

Mykhaylo Adamovych
Mykhaylo Adamovych

Reputation: 20966

Because String.replace(CharSequence target, CharSequence replacement) has Pattern.compile, matcher, replaceAll inside, one can slightly optimize it by for using precompiled target pattern constant, like this:

private static final Pattern COMMA_REGEX = Pattern.compile(",");
...
COMMA_REGEX.matcher(value).replaceAll(replacement);

Upvotes: 1

Horcrux7
Horcrux7

Reputation: 24447

The follow code is approx. 30 times faster if there is no match and 5 times faster if there is a match.

static String fastReplace( String str, String target, String replacement ) {
    int targetLength = target.length();
    if( targetLength == 0 ) {
        return str;
    }
    int idx2 = str.indexOf( target );
    if( idx2 < 0 ) {
        return str;
    }
    StringBuilder buffer = new StringBuilder( targetLength > replacement.length() ? str.length() : str.length() * 2 );
    int idx1 = 0;
    do {
        buffer.append( str, idx1, idx2 );
        buffer.append( replacement );
        idx1 = idx2 + targetLength;
        idx2 = str.indexOf( target, idx1 );
    } while( idx2 > 0 );
    buffer.append( str, idx1, str.length() );
    return buffer.toString();
}

Upvotes: 8

When you're replacing single characters, consider iterating over your character array but replace characters by using a (pre-created) HashMap<Character, Character>().

I use this strategy to convert an integer exponent string by unicode superscript characters.

It's about twice as fast compared to String.replace(char, char). Note that the time associated to creating the hash map isn't included in this comparison.

Upvotes: 0

Leo
Leo

Reputation: 1240

Adding to the @paxdiablo answer, here's a sample implementation of a replaceAll using StringBuffers that is a ~3.7 times faster than String.replaceAll():

Code:

public static String replaceAll(final String str, final String searchChars, String replaceChars)
{
  if ("".equals(str) || "".equals(searchChars) || searchChars.equals(replaceChars))
  {
    return str;
  }
  if (replaceChars == null)
  {
    replaceChars = "";
  }
  final int strLength = str.length();
  final int searchCharsLength = searchChars.length();
  StringBuilder buf = new StringBuilder(str);
  boolean modified = false;
  for (int i = 0; i < strLength; i++)
  {
    int start = buf.indexOf(searchChars, i);

    if (start == -1)
    {
      if (i == 0)
      {
        return str;
      }
      return buf.toString();
    }
    buf = buf.replace(start, start + searchCharsLength, replaceChars);
    modified = true;

  }
  if (!modified)
  {
    return str;
  }
  else
  {
    return buf.toString();
  }
}

Test Case -- the output is the following (Delta1 = 1917009502; Delta2 =7241000026):

@Test
public void testReplaceAll() 
{
  String origStr = "1234567890-1234567890-";

  String replacement1 =  StringReplacer.replaceAll(origStr, "0", "a");
  String expectedRep1 = "123456789a-123456789a-";

  String replacement2 =  StringReplacer.replaceAll(origStr, "0", "ab");
  String expectedRep2 = "123456789ab-123456789ab-";

  String replacement3 =  StringReplacer.replaceAll(origStr, "0", "");
  String expectedRep3 = "123456789-123456789-";


  String replacement4 =  StringReplacer.replaceAll(origStr, "012", "a");
  String expectedRep4 = "1234567890-1234567890-";

  String replacement5 =  StringReplacer.replaceAll(origStr, "123", "ab");
  String expectedRep5 = "ab4567890-ab4567890-";

  String replacement6 =  StringReplacer.replaceAll(origStr, "123", "abc");
  String expectedRep6 = "abc4567890-abc4567890-";

  String replacement7 =  StringReplacer.replaceAll(origStr, "123", "abcdd");
  String expectedRep7 = "abcdd4567890-abcdd4567890-";

  String replacement8 =  StringReplacer.replaceAll(origStr, "123", "");
  String expectedRep8 = "4567890-4567890-";

  String replacement9 =  StringReplacer.replaceAll(origStr, "123", "");
  String expectedRep9 = "4567890-4567890-";

  assertEquals(replacement1, expectedRep1);
  assertEquals(replacement2, expectedRep2);
  assertEquals(replacement3, expectedRep3);
  assertEquals(replacement4, expectedRep4);
  assertEquals(replacement5, expectedRep5);
  assertEquals(replacement6, expectedRep6);
  assertEquals(replacement7, expectedRep7);
  assertEquals(replacement8, expectedRep8);
  assertEquals(replacement9, expectedRep9);

  long start1 = System.nanoTime();
  for (long i = 0; i < 10000000L; i++)
  {
    String rep =  StringReplacer.replaceAll(origStr, "123", "abcdd");
  }
  long delta1 = System.nanoTime() -start1;

  long start2= System.nanoTime();

  for (long i = 0; i < 10000000L; i++)
  {
    String rep =  origStr.replaceAll( "123", "abcdd");
  }

  long delta2 = System.nanoTime() -start1;

  assertTrue(delta1 < delta2);

  System.out.printf("Delta1 = %d; Delta2 =%d", delta1, delta2);


}

Upvotes: 2

Tim
Tim

Reputation: 11

Just get the char[] of the String and iterate through it. Use a temporary StringBuilder.

Look for the pattern you want to replace while iterating if you don't find the pattern, write the stuff you scanned to the StringBuilder, else write the replacement text to the StringBuilder.

Upvotes: 1

sb4
sb4

Reputation: 179

If you have a number of strings to replace (such as XML escape sequences), especially where the replacements are different length from the pattern, FSM lexer type algorithm seems like it might be most efficient, similar to the suggestion of processing in a stream fashion, where the output is incrementally built.

Perhaps a Matcher object could be used to do that efficiently.

Upvotes: 1

paxdiablo
paxdiablo

Reputation: 881313

This is what StringBuilder is meant for. If you're going to be doing a lot of manipulation, do it on a StringBuilder, then turn that into a String whenever you need to.

StringBuilder is described thus:

"A mutable sequence of characters. This class provides an API compatible with StringBuffer, but with no guarantee of synchronization".

It has replace (and append, insert, delete, et al) and you can use toString to morph it into a real String.

Upvotes: 22

David Nouls
David Nouls

Reputation: 1895

The previous posts are right, StringBuilder/StringBuffer are a solution.

But, you also have to question if it is a good idea to do the replace on big Strings in memory.

I often have String manipulations that are implemented as a stream, so instead of replacing it in the string and then sending it to an OutputStream, I do the replace at the moment that I send the String to the outputstream. That works much faster than any replace.

This works much faster if you want this replace to implement a template mechanism. Streaming is always faster since you consume less memory and if the clients is slow, you only need to generate at a slow pace - so it scales much better.

Upvotes: 9

Artem Barger
Artem Barger

Reputation: 41222

All string manipulation in general are very slow. Consider to use StringBuffer, it's not exactly like the String class, but have a lot in common and it's mutable as well.

Upvotes: 0

Related Questions