user3173811

Reputation: 53

Fastest way to find substring in Java

Let's say I have a list of names.

ArrayList<String> nameslist = new ArrayList<String>();
nameslist.add("jon");
nameslist.add("david");
nameslist.add("davis");
nameslist.add("jonson");

This list contains a few thousand names. What is the fastest way to find out how many names in the list start with a given name?

String name = "jon";

The result should be 2.

I have tried comparing every element of the list using the substring function. It works, but it is very slow, especially when the list is huge.

Thanks in advance.

Upvotes: 3

Views: 2892

Answers (10)

Prabhakaran Ramaswamy

Reputation: 26094

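You need to iterate over the names and check whether each one contains the given name.

String name = "jon";
int count = 0;
for (String n : nameslist) {
    // contains() matches the name anywhere in n, not only at the start
    if (n.contains(name)) {
        count++;
    }
}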

Upvotes: 0

Helios

Reputation: 851

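You could consider the Boyer–Moore string search algorithm. Its worst-case complexity is O(n+m) when the Galil rule is applied, where n is the length of the text and m the length of the pattern.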

Upvotes: 0

assylias

Reputation: 328608

You could use a TreeSet for O(log n) access and write something like:

TreeSet<String> set = new TreeSet<String>();
set.add("jon");
set.add("david");
set.add("davis");
set.add("jonson");
set.add("henry");

Set<String> subset = set.tailSet("jon");
int count = 0;
for (String s : subset) {
    if (s.startsWith("jon")) count++;
    else break;
}
System.out.println("count = " + count);

which prints 2 as you expect.

Alternatively, you could use Set<String> subset = set.subSet("jon", "joo"); to return the full set of all names that start with "jon", but you need to give the first invalid entry that follows the jons (in this case: "joo").
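One way to avoid working out the "first invalid entry" by hand is sketched below. It is only a sketch: the class name PrefixRange and the method countWithPrefix are made up for illustration, and it assumes no name contains the character '\uffff', so that prefix + '\uffff' sorts after every name starting with the prefix. Because a TreeSet drops duplicates, this counts distinct names.

```java
import java.util.Arrays;
import java.util.TreeSet;

class PrefixRange {
    // Counts the distinct names that start with prefix by taking the
    // half-open range [prefix, prefix + '\uffff') of the sorted set.
    static int countWithPrefix(TreeSet<String> names, String prefix) {
        // Assumption: no name contains '\uffff', the largest char value,
        // so this upper bound sorts after every name starting with prefix.
        String upperBound = prefix + Character.MAX_VALUE;
        return names.subSet(prefix, upperBound).size();
    }

    public static void main(String[] args) {
        TreeSet<String> names = new TreeSet<>(
                Arrays.asList("jon", "david", "davis", "jonson", "henry"));
        System.out.println(countWithPrefix(names, "jon")); // prints 2
    }
}
```

Finding the bounds of the range is logarithmic, but calling size() on the view may still walk the matching elements, so the cost grows with the number of matches rather than with the size of the whole set.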

Upvotes: 7

Dineshkumar

Reputation: 4245

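I suggest a TreeSet.

As in the other answers, you visit the elements and increment a count, but because the elements are sorted you can stop early, which improves performance algorithmically.

// list is assumed to be the TreeSet<String> (or any sorted collection of names)
Iterator<String> iter = list.iterator();
int count = 0;
while (iter.hasNext()) {
    String name = iter.next();
    if (name.startsWith("jon")) {
        count++;
    }
    // In sorted order, no name after one starting with "k" can start with "jon"
    if (name.startsWith("k")) break;
}

This break skips the remaining string comparisons.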

Upvotes: 0

cutze

Reputation: 93

I'd suggest creating a Runnable for processing the list elements. Then create an ExecutorService with a fixed pool size, which processes the elements concurrently.

Rough example:

ExecutorService executor = Executors.newFixedThreadPool(5);
for (String str : coll){
    Runnable r = new StringProcessor(str);
    executor.execute(r);
}
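A slightly fuller sketch of the same idea, for the prefix count in the question. Instead of one Runnable per string it submits one Callable per slice of the list, so each task returns a partial count. The class name ParallelPrefixCount, the method count, and the slicing scheme are all assumptions made for illustration, and for a few thousand names the threading overhead may well outweigh any gain.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelPrefixCount {
    // Splits names into roughly equal slices, counts matches per slice on a
    // fixed thread pool, and sums the partial counts.
    static int count(List<String> names, String prefix, int threads) {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            int chunk = Math.max(1, (names.size() + threads - 1) / threads);
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (int start = 0; start < names.size(); start += chunk) {
                List<String> slice =
                        names.subList(start, Math.min(start + chunk, names.size()));
                tasks.add(() -> {
                    int c = 0;
                    for (String n : slice) {
                        if (n.startsWith(prefix)) c++;
                    }
                    return c;
                });
            }
            int total = 0;
            // invokeAll blocks until every task has completed
            for (Future<Integer> f : executor.invokeAll(tasks)) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while counting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            executor.shutdown(); // let the pool threads exit
        }
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("jon", "david", "davis", "jonson");
        System.out.println(count(names, "jon", 5)); // prints 2
    }
}
```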

Upvotes: 0

Daniil

Reputation: 5780

If the strings in your list are not too long, you can use this trick: store every prefix in a HashSet, and each lookup becomes ~O(1):

// Preprocessing
List<String> list = Arrays.asList("hello", "world"); // Your list
Set<String> set = new HashSet<>();
for (String s : list) {
    for (int i = 1; i <= s.length(); i++) {
        set.add(s.substring(0, i));
    }
}

// Now you want to test
assert set.contains("wor");

If the strings are too long for that, you can use a full-text search engine such as Apache Lucene.
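The set above only answers whether a prefix occurs. To get the count the question asks for, the same preprocessing can store a counter per prefix instead. This is a minimal sketch under that assumption; the class name PrefixCounts and the method build are hypothetical, and memory use grows with the total length of all names.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PrefixCounts {
    // Maps every prefix of every name to the number of names starting with it.
    static Map<String, Integer> build(List<String> names) {
        Map<String, Integer> counts = new HashMap<>();
        for (String s : names) {
            for (int i = 1; i <= s.length(); i++) {
                // Add 1 to the counter for this prefix, creating it if absent
                counts.merge(s.substring(0, i), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts =
                build(Arrays.asList("jon", "david", "davis", "jonson"));
        System.out.println(counts.getOrDefault("jon", 0)); // prints 2
    }
}
```

Unlike the TreeSet approach, duplicates in the list are counted each time they occur.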

Upvotes: 0

David Rabinowitz

Reputation: 30448

Have a look at a trie. It's a data structure designed for fast searches by word prefix. You may need to adapt it a bit in order to get the number of leaves in the subtree, but in any case you do not traverse the entire list.

Example tree
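A minimal sketch of such a trie is shown here. Rather than counting leaves at query time, this variant keeps a counter on every node recording how many inserted words pass through it, which is one possible adaptation and not the only one. The names PrefixTrie, insert, and countWithPrefix are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class PrefixTrie {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        int count; // number of inserted words whose path goes through this node
    }

    private final Node root = new Node();

    // Adds a word, incrementing the counter of every node along its path.
    void insert(String word) {
        Node node = root;
        node.count++; // the root counts every word (the empty prefix)
        for (char c : word.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
            node.count++;
        }
    }

    // Returns how many inserted words start with prefix, in O(prefix length).
    int countWithPrefix(String prefix) {
        Node node = root;
        for (char c : prefix.toCharArray()) {
            node = node.children.get(c);
            if (node == null) {
                return 0; // no inserted word has this prefix
            }
        }
        return node.count;
    }

    public static void main(String[] args) {
        PrefixTrie trie = new PrefixTrie();
        for (String name : new String[] {"jon", "david", "davis", "jonson"}) {
            trie.insert(name);
        }
        System.out.println(trie.countWithPrefix("jon")); // prints 2
    }
}
```

Keeping the counter on each node means a name such as "jon" is counted even though it ends on an internal node of the path to "jonson".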

Upvotes: 2

sanbhat

Reputation: 17622

Searching an ArrayList (or any linear array) is O(n), where n is the number of elements in the array.

For the best performance, take a look at a trie.

Upvotes: 1

Jesper

Reputation: 206816

What exactly does "very slow" mean?

Really the only way to do this is to loop through the list and check every element:

int count = 0;
for (String name : nameslist) {
    if (name.startsWith("jon")) {
        count++;
    }
}

System.out.println("Found: " + count);

Upvotes: 0

Maroun

Reputation: 95968

Iterate over the ArrayList and, for each element, check whether it begins with "jon". The time complexity is O(n).

Upvotes: 0
