Reputation: 737
I am writing a Java program to search for a word in a text file containing a list of dictionary words. As you may know, this file contains about 300,000 words. I was able to come up with a program that iterates through the words, comparing each one with the input word (the word I am searching for). The problem is that this process takes a long time to find a word, especially if the word starts with one of the last letters of the alphabet, such as x, y, or z. I want something more efficient that can find a word almost instantly. Here is my code:
import java.io.IOException;
import java.io.InputStreamReader;

public class ReadFile
{
    public static void main(String[] args) throws IOException
    {
        ReadFile rf = new ReadFile();
        rf.searchWord(args[0]);
    }

    private void searchWord(String token) throws IOException
    {
        InputStreamReader reader = new InputStreamReader(
                getClass().getResourceAsStream("sowpods.txt"));
        String line = null;
        // Read a single line from the file. null represents the EOF.
        while((line = readLine(reader)) != null && !line.equals(token))
        {
            System.out.println(line);
        }
        if(line != null && line.equals(token))
        {
            System.out.println(token + " WAS FOUND.");
        }
        else if(line != null && !line.equals(token))
        {
            System.out.println(token + " WAS NOT FOUND.");
        }
        else
        {
            System.out.println(token + " WAS NOT FOUND.");
        }
        reader.close();
    }

    private String readLine(InputStreamReader reader) throws IOException
    {
        // Test whether the end of file has been reached. If so, return null.
        int readChar = reader.read();
        if(readChar == -1)
        {
            return null;
        }
        StringBuffer string = new StringBuffer("");
        // Read until end of file or new line
        while(readChar != -1 && readChar != '\n')
        {
            // Append the read character to the string. Some operating systems
            // such as Microsoft Windows prepend newline character ('\n') with
            // carriage return ('\r'). This is part of the newline character
            // and therefore an exception that should not be appended to the
            // string.
            if(readChar != '\r')
            {
                string.append((char) readChar);
            }
            // Read the next character
            readChar = reader.read();
        }
        return string.toString();
    }
}
Please also note that I would like to use this program in a Java ME environment. Any help would be highly appreciated. Thanks, Jevison7x.
Upvotes: 1
Views: 11917
Reputation: 56809
You can use fgrep (fgrep is activated by passing -F to grep; see the Linux man page of fgrep):
grep -F -f dictionary.txt inputfile.txt
The dictionary file should contain the words one on each line.
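To make the semantics of that command concrete, here is a naive Java sketch of the same idea: keep every input line that contains at least one dictionary entry as a literal substring. The class and method names are made up for illustration, and the sketch works on in-memory lists rather than files. It only mirrors the observable behaviour; it is not how grep is implemented, and it checks every pattern against every line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical name; a sketch of fixed-string matching, not grep's real implementation.
final class FixedStringGrepSketch {

    // Returns the input lines that contain at least one pattern as a literal substring.
    static List<String> matchingLines(List<String> patterns, List<String> lines) {
        List<String> matches = new ArrayList<>();
        for (String line : lines) {
            for (String pattern : patterns) {
                // indexOf treats the pattern literally (no regular expressions).
                if (line.indexOf(pattern) >= 0) {
                    matches.add(line);
                    break; // one matching pattern is enough to keep the line
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for dictionary.txt and inputfile.txt.
        List<String> dictionary = Arrays.asList("zebra", "xylophone");
        List<String> input = Arrays.asList("a zebra crossing", "nothing here", "xylophone");
        System.out.println(matchingLines(dictionary, input));
    }
}
```

Note that this matches substrings, so a dictionary entry that appears inside a longer word still counts as a match.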
Not sure if it is still accurate, but the Wikipedia article on grep mentions the use of the Aho-Corasick algorithm in fgrep, which is an algorithm that builds an automaton based on a fixed dictionary for quick string matching.
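For a rough idea of what such an automaton looks like, here is a minimal Java sketch of Aho-Corasick: a trie of the dictionary words plus failure links, so the text is scanned in a single pass. The class name and the "pattern@index" output format are made up for illustration. It uses Java SE collections (HashMap, ArrayList, ArrayDeque), so it would need adapting, for example to Vector and Hashtable, before it could run on Java ME.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical name; a teaching sketch, not the implementation used by grep.
final class AhoCorasickSketch {

    // One automaton state: outgoing edges, failure link, and patterns ending here.
    private static final class Node {
        final Map<Character, Node> next = new HashMap<>();
        Node fail;
        final List<String> output = new ArrayList<>();
    }

    private final Node root = new Node();

    AhoCorasickSketch(List<String> patterns) {
        // Phase 1: build a trie containing every pattern.
        for (String p : patterns) {
            Node node = root;
            for (int i = 0; i < p.length(); i++) {
                node = node.next.computeIfAbsent(p.charAt(i), c -> new Node());
            }
            node.output.add(p);
        }
        // Phase 2: breadth-first pass that sets the failure links.
        Queue<Node> queue = new ArrayDeque<>();
        for (Node child : root.next.values()) {
            child.fail = root;
            queue.add(child);
        }
        while (!queue.isEmpty()) {
            Node current = queue.remove();
            for (Map.Entry<Character, Node> e : current.next.entrySet()) {
                Node child = e.getValue();
                // Follow failure links until a state with an edge for this character.
                Node f = current.fail;
                while (f != root && !f.next.containsKey(e.getKey())) {
                    f = f.fail;
                }
                Node target = f.next.get(e.getKey());
                child.fail = (target != null) ? target : root;
                // Inherit patterns that end at the failure state (proper suffixes).
                child.output.addAll(child.fail.output);
                queue.add(child);
            }
        }
    }

    // Scans the text once and reports each occurrence as "pattern@startIndex".
    List<String> search(String text) {
        List<String> hits = new ArrayList<>();
        Node state = root;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            while (state != root && !state.next.containsKey(c)) {
                state = state.fail;
            }
            Node target = state.next.get(c);
            state = (target != null) ? target : root;
            for (String p : state.output) {
                hits.add(p + "@" + (i - p.length() + 1));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        AhoCorasickSketch ac =
                new AhoCorasickSketch(Arrays.asList("he", "she", "his", "hers"));
        System.out.println(ac.search("ushers")); // prints [she@1, he@2, hers@2]
    }
}
```

The automaton is built once from the dictionary; after that, each search costs time proportional to the length of the text plus the number of matches, regardless of how many words the dictionary holds.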
Anyway, you can have a look at the list of string searching algorithms for a finite set of patterns on Wikipedia. These are the more efficient ones to work with when searching for words in a dictionary.
Upvotes: 1