user3282276

Reputation: 3804

Most efficient way to split sentence

I am writing an application that relies heavily on splitting large strings into individual words. Because I have to process so many strings, I am concerned about efficiency. I currently use String.split for this, but I do not know whether there is a more efficient way to do it.

private static String[] printWords(String input) {
    String[] splitWords = input.split(" ");
    return splitWords;
}

Upvotes: 1

Views: 978

Answers (2)

user949300

Reputation: 15729

When I timed it a few years ago (on Java 6), String.split() was significantly slower than searching for individual space characters with indexOf(), because the former carries a lot of regex baggage.

If your sentences always split on a space (somewhat dubious?) and performance is truly an issue (do some real tests), custom code would be faster.
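A minimal sketch of what such custom code might look like, assuming the separator is always exactly one space character. The class and method names (SpaceSplitter, splitOnSpace) are made up for illustration, and whether this actually beats String.split on your JVM is something to measure, not assume.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any library.
class SpaceSplitter {
    // Splits on single space characters using indexOf, with no regex involved.
    // Note: unlike String.split(" "), trailing empty strings are kept.
    static List<String> splitOnSpace(String input) {
        List<String> words = new ArrayList<>();
        int start = 0;
        int end;
        // Walk from one space to the next, copying out the text in between.
        while ((end = input.indexOf(' ', start)) >= 0) {
            words.add(input.substring(start, end));
            start = end + 1;
        }
        // Whatever follows the last space (possibly the whole input).
        words.add(input.substring(start));
        return words;
    }
}
```

Two consecutive spaces produce an empty string between them, just as they do with String.split(" ").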

Following the link provided in David Ehrmann's comment, it looks like Java 7 made some speedups. My tests were with Java 6.

Upvotes: 1

maaartinus

Reputation: 46372

While the Sun/Oracle guys did a decent job in general, there's still room for improvement, especially because you can specialize for your concrete problem. Sometimes you hit a case where a huge speedup is achievable if you don't rely on the JIT compiler to do the whole job perfectly out of the box. Such cases are rare, but they exist.

For example, String.split calls Pattern.compile in the general case, so a precomputed Pattern is a sure win.
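A short sketch of the precomputed-Pattern idea, assuming the delimiter is a regex such as a run of whitespace. The class name PrecompiledSplit and the choice of "\\s+" are illustrative assumptions, not something taken from the question.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class PrecompiledSplit {
    // Compiled once and reused for every call; Pattern instances are
    // immutable and safe to share between threads.
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    static String[] words(String input) {
        // Same result as input.split("\\s+"), but without recompiling the regex.
        return WHITESPACE.split(input);
    }
}
```

As with String.split, input that starts with whitespace yields an empty first element, so trim first if that matters.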

There's an optimization for single-char patterns that avoids the regex overhead, so the possible gain is limited. Still, I'd try Guava's Splitter and a hand-crafted solution if performance is really important.

You'll probably find out that splitting on a single space is not what you want, and then the gain will be bigger, since the single-char optimization no longer applies.
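One possible hand-crafted splitter for that case, sketched under the assumption that "words" means maximal runs of non-whitespace characters as defined by Character.isWhitespace. The names WhitespaceSplitter and words are hypothetical, and any speed advantage should be confirmed with a real benchmark.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any library.
class WhitespaceSplitter {
    // Returns the runs of non-whitespace characters in input.
    // Leading, trailing, and repeated whitespace never produce empty strings.
    static List<String> words(String input) {
        List<String> words = new ArrayList<>();
        int start = -1; // index where the current word began, or -1 between words
        for (int i = 0; i < input.length(); i++) {
            if (Character.isWhitespace(input.charAt(i))) {
                if (start >= 0) {
                    // Whitespace ends the current word.
                    words.add(input.substring(start, i));
                    start = -1;
                }
            } else if (start < 0) {
                // First non-whitespace character after a gap starts a new word.
                start = i;
            }
        }
        if (start >= 0) {
            // The input ended in the middle of a word.
            words.add(input.substring(start));
        }
        return words;
    }
}
```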

Upvotes: 1
