Jason
Jason

Reputation: 11363

What is the Big-O of String.contains() in Java?

I'm working on a project, and need to optimize the running time. Is String.contains() runtime the same as TreeSet.contains(), which is O(logN)?

The reason I'm asking is I'm building a TreeMap<String, TreeSet<Song>>, where Songs contain a String of lyrics. Depending on the efficiency, I am considering including a Set of the lyric words inside the Song and running searches on that rather than the String.

Upvotes: 41

Views: 58538

Answers (3)

Nishant Thapliyal
Nishant Thapliyal

Reputation: 1670

.contains() definitely uses naive approach and equivalent to O(nm) time complexity.

  • Boyer-moore takes O(nm) time in the worst case.
  • KMP takes O(n) time in the worst case.

In one of the problems related to pattern matching, I used .contains() and it took 70 ms whereas replacing that line with patternSearch() //KMP search reduced the time to 14 ms.

enter image description here

Java source code | KMP search vs .contains()

Upvotes: 8

Mark Byers
Mark Byers

Reputation: 838926

One of the best known algorithms is the Boyer-Moore string searching algorithm which is O(n) although it can give sublinear performance in the best case.

Which algorithm is used in Java depends on which implemetation you download. It seems that for example OpenJDK uses a naive algorithm that runs in O(nm) and linear performance in the best case. See lines 1770-1806 here.

Upvotes: 39

dty
dty

Reputation: 18998

You could also try using a Trie instead of a TreeMap (although Java has no built-in Trie implementation)

Upvotes: 0

Related Questions