Danger

Reputation: 2113

Does Java have an efficient way to perform multiple regex replaceAll operations on a StringBuilder?

I'd like to use something like StringBuilder to hold a string and then perform a large number of regex replaceAll operations on it efficiently. I'd like to leverage StringBuilder's variable-sized array and avoid temporary string allocations. That is, I'd like the regex replaceAll operation to mutate the array held by the StringBuilder as needed, without allocating temporary strings. How can I do this?

Unfortunately, StringBuilder does not have a built-in method for this. It only has a replace() method that takes no regex, and I can't see a way to do this with Matcher without effectively replacing the entire StringBuilder buffer with a newly allocated String, which I'd like to avoid.

Upvotes: 0

Views: 176

Answers (1)

Edwin Buck

Reputation: 70909

Regex matching itself doesn't create extra Strings. It only verifies that the input matches (or doesn't match) a pattern; the new String comes from building the replaced result.

Capture groups are returned as Strings, and Strings in Java are immutable, so they can't be backed by a mutable storage area, or even by part of one.

Also, a regex operation is not a single step (even if it appears to be one in the code), but a run of a state machine with the string as input. Java is multi-threaded, and the state machine would not work correctly if the data were modified while the machine runs over it. Fixing this would require locking the buffer, which would incur additional overhead.
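For single-threaded code, one way to avoid modifying the buffer while the matcher is still scanning it is to split the work into two passes: record the match positions first, then edit the buffer. The following is only a minimal sketch of that idea, not a tuned implementation. The class name and the replaceAllLiteral helper are hypothetical, it handles literal replacements only (no $1 group references), it still allocates a small list of match positions, and each StringBuilder.replace call may shift the tail of the buffer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class InPlaceLiteralReplace {

    // Hypothetical helper: replaces every match of `pattern` in `sb`
    // with the literal text `replacement`, editing `sb` itself.
    static void replaceAllLiteral(StringBuilder sb, Pattern pattern, String replacement) {
        // Pass 1: scan the unmodified buffer and record each match's bounds.
        // StringBuilder implements CharSequence, so no String copy is needed here.
        List<int[]> spans = new ArrayList<>();
        Matcher m = pattern.matcher(sb);
        while (m.find()) {
            spans.add(new int[] { m.start(), m.end() });
        }

        // Pass 2: apply the edits from the last match to the first, so the
        // recorded offsets of earlier matches stay valid as the length changes.
        for (int i = spans.size() - 1; i >= 0; i--) {
            int[] span = spans.get(i);
            sb.replace(span[0], span[1], replacement);
        }
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("a1b22c333");
        replaceAllLiteral(sb, Pattern.compile("\\d+"), "#");
        System.out.println(sb); // prints a#b#c#
    }
}
```

Whether this beats a plain replaceAll depends on the input; with many matches the repeated shifting can cost more than one fresh String, so it is worth measuring before adopting it.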

Weighing the overhead of a lock against the overhead of an extra String object, the lock costs more than it saves. In short, you'd expend far more CPU cycles obtaining the lock than you'd save by not having a dozen (or likely even a hundred) additional strings.

Finally, the JVM contains string-specific optimizations. With a mutable string, those optimizations wouldn't work, and the result would be bizarre behavior in one of the most commonly used data types in the JVM.
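If the goal is only to reduce the number of full-size intermediate Strings across many replacements, a middle ground is to keep String immutable and alternate between two reusable StringBuilder buffers. The sketch below assumes Java 9 or later, where Matcher.appendReplacement and Matcher.appendTail accept a StringBuilder. The class name and the applyAll helper are hypothetical, and the matcher may still allocate small temporaries internally, so treat this as an illustration rather than a guaranteed saving.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DoubleBufferReplace {

    // Hypothetical helper: applies each pattern/replacement pair in order,
    // reading from one buffer and writing into the other, then swapping roles.
    // Returns the buffer that holds the final text (it may or may not be `text`).
    static StringBuilder applyAll(StringBuilder text, Pattern[] patterns, String[] replacements) {
        StringBuilder src = text;
        StringBuilder dst = new StringBuilder(text.length());

        for (int i = 0; i < patterns.length; i++) {
            dst.setLength(0); // reuse the existing backing array
            Matcher m = patterns[i].matcher(src); // StringBuilder is a CharSequence
            while (m.find()) {
                // Copies the text before the match, then the expanded replacement.
                m.appendReplacement(dst, replacements[i]);
            }
            m.appendTail(dst); // copies whatever follows the last match

            // Swap roles: this round's output is the next round's input.
            StringBuilder tmp = src;
            src = dst;
            dst = tmp;
        }
        return src;
    }

    public static void main(String[] args) {
        Pattern[] patterns = { Pattern.compile("\\d+"), Pattern.compile("b") };
        String[] replacements = { "#", "B" };
        StringBuilder result = applyAll(new StringBuilder("a1b22c"), patterns, replacements);
        System.out.println(result); // prints a#B#c
    }
}
```

Because the source buffer is never modified while its matcher is running, this stays within the constraint described above and needs no locking as long as both buffers are confined to one thread.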

Upvotes: 1
