user9191494
user9191494

Reputation:

I look for a faster java algorithm

I look for a faster algorithm. I try to verify that a word exist in a dictionary.

Here is my code in java.

public class Searcher {

public static void main(String[] args){

    File file = new File("pathToFile");

    Scanner scanner = null;

    try{
        scanner = new Scanner(file);
    }catch(FileNotFoundException e){
        System.err.println("Le fichier n'a pas ete trouve");
    }

    //Word to look for.
    String word = "mot";
    //indicator of word existence.
    boolean nonExistence = true;

    while(scanner.hasNext()){
        if(Pattern.matches(word, scanner.next())){
            System.out.println("\"" + word + "\"" + " est un mot francais.");
            nonExistence = false;
            break;
        }
    }

    if(nonExistence){
        System.out.println("\'" + word + "\'" + " n'est pas un mot francais.");
    }

}

}

I would like to do not have to explore the whole file. Thanks.

Upvotes: 0

Views: 157

Answers (3)

Said A. Sryheni
Said A. Sryheni

Reputation: 757

I think it depends on the size of your file. If you are performing many search operations, and you can load the file into RAM and perform search operations there, here are a few ideas that come to my mind.

The first idea is a little complex but is really a powerful way to do the search. You can build a Trie Tree. This way your search complexity will be reduced to the length of the word you are searching for, rather than the size of the file. This solution is helpful when you need to search for existing words, and even add new ones to your dictionary, because both operations have a complexity O(|WORD|), where |WORD| is the length of the word you are adding/searching for.

Another solution would be to store the words in an array in lexicographical order, and use binary search to find the word you are searching for. Of course this solution is only helpful when your searching operations are a lot more frequent than the operation of adding a new word. The complexity of searching for a word is equal to O(|LEN| * Log(N)), where |LEN| is the approximate length of a single word in your dictionary, and N is the number of words in your dictionary. However adding a new word is quite expensive, since you will need to insert it in its correct location, and perform a shifting operation on the words that follow it.

If your file is pretty large and loading it to RAM is not an option, and based on a quick seach (check this question for example), I believe all programming languages (including java) don't contain a way to read specific lines from a file, and sequential scanning is the only way to do so, which means you can only scan your file sequentially searching for your word in the same way you are doing now.

Upvotes: 4

Sojimanatsu
Sojimanatsu

Reputation: 601

Well, It actually looks simple to me. I did not try the code but here is the idea:

You dont want to look for whole file right? But the word you specify is clear.No matter what is "Look" "Take" "Get" I dont know something;

Add more constraint to your code that getting the first letter of your word and search inside the dictionary only the words that also starts with that letter. (Java has libraries and easy iterations for that)

For example if your word is "Take" you can say something like search index find the words starting by "t" (ignore the case) depends on your dictionary.

With that you dont have to look for whole file and it becomes just faster.

Upvotes: 1

Oleg Cherednik
Oleg Cherednik

Reputation: 18245

Go to Coursera: Algorithms on Strings - Suffix Trees. This is what exactly what you're looking for. There you can find couple videos and slides (it is free). These materials help you to realise the problem and you will be able easily to implement it then.

In a breaf way: the most effective way is to build a Suffix Tree of a text and then match your patterns with this Suffix Tree.

Upvotes: 1

Related Questions