formn

Reputation: 107

Word Count from a file

I'm at the start of writing my program (this is for a class) and I'm having trouble just getting it written down. Here's a list of goals I am hoping to meet.

  1. It is a method that is given a .txt file (using java.io.File).
  2. It needs to read the file and split it into words; duplicates are allowed. (I plan to use String.split and java.util.regex.Pattern to handle whitespace and punctuation.)
  3. I'm aiming to put the words in a 1D array and then just find the length of the array.

The problem I'm running into is parsing the txt file. I was told in class that Scanner can do this, but I'm not finding out how while R(ing)TFM. I guess I'm asking for some pointers into the API that help me understand how to read a file with Scanner. Once I can get it to put each word in the array I should be in the clear.

EDIT: I figured out what I needed to do thanks to everyone's help and input. My final snippet ends up looking like this, should anyone in the future come across this question.

Scanner in = new Scanner(file).useDelimiter(" ");
ArrayList<String> prepwords = new ArrayList<String>();
while (in.hasNext())
    prepwords.add(in.next());
in.close();
return prepwords; //returns an ArrayList split on single spaces; punctuation (and line breaks) are still attached to the words

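I had to declare "throws IOException" on the method, since Java won't compile unless you deal with the possibility that the file doesn't exist: new Scanner(file) throws the checked FileNotFoundException, which is a subclass of IOException. So if the compiler complains about an unreported "FileNotFoundException", import java.io.IOException and add "throws IOException" to the method (or catch the exception). At the very least this worked for me. Thank you everyone for your input!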
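
For anyone who also wants the punctuation gone (goal 2 above), here is a minimal sketch rather than a definitive solution. It assumes Java 7 or later, and the class and method names (WordCount, readWords, countWords) are made up for illustration. Instead of a single space, it hands Scanner a regex delimiter that matches any run of whitespace and ASCII punctuation.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper class; the names are illustrative, not from the assignment.
class WordCount {

    // Collects every token from the scanner, treating any run of whitespace
    // and/or ASCII punctuation as a separator. This also splits on
    // apostrophes, so "don't" becomes "don" and "t".
    static List<String> readWords(Scanner in) {
        in.useDelimiter("[\\s\\p{Punct}]+");
        List<String> words = new ArrayList<>();
        while (in.hasNext()) {
            words.add(in.next());
        }
        return words;
    }

    // new Scanner(File) throws the checked FileNotFoundException (a subclass
    // of IOException), so it has to be declared or caught.
    static int countWords(File file) throws FileNotFoundException {
        try (Scanner in = new Scanner(file)) {
            return readWords(in).size();
        }
    }

    public static void main(String[] args) throws FileNotFoundException {
        if (args.length != 1) {
            System.err.println("usage: java WordCount <file.txt>");
            return;
        }
        System.out.println(countWords(new File(args[0])));
    }
}
```

Since duplicates are allowed, the size of the list is the word count. If apostrophes or non-ASCII punctuation matter for your input, the delimiter pattern would need adjusting.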

Upvotes: 2

Views: 1652

Answers (3)

Something simple:

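//imports you need
import java.io.File;
import java.io.FileNotFoundException;//thrown by new FileReader(file); declare or catch it
import java.io.FileReader;
import java.util.ArrayList;
import java.util.Scanner;

//variables you need
File file = new File("someTextFile.txt");//put your file here
Scanner scanFile = new Scanner(new FileReader(file));//create scanner
ArrayList<String> words = new ArrayList<String>();//just a place to put the words
String theWord;//temporary variable for words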

//loop through file
//this looks at the .txt file for every word (moreover, looks for white spaces)
while (scanFile.hasNext())//check if there is another word
{   
    theWord = scanFile.next();//get next word
    words.add(theWord);//add word to list
    //if you dont want to add the word to the list
    //you can easily do your split logic here
}

//print the list of words
System.out.println("Total amount of words is:  " + words.size());
for(int i = 0; i<words.size(); i++)
{
    System.out.println("Word at " + i + ":  " + words.get(i));
}

Source:

http://www.dreamincode.net/forums/topic/229265-reading-in-words-from-text-file-using-scanner/

Upvotes: 0

DwB

Reputation: 38300

Here is a link to the JSE 6.0 Scanner API

Here is the info you need to complete your project:

1. Use the Scanner(File) constructor.
2. Use a loop that is, essentially, this:
    a. Scanner blam = new Scanner(theInputFile);
    b. Map<String, Integer> wordMap = new HashMap<String, Integer>();
    c. Set<String> wordSet = new HashSet<String>();
    d. while (blam.hasNextLine())
    e. String nextLine = blam.nextLine();
    f. Split nextLine into words (read about the String.split() method).
    g. If you need a count of words: for each word on the line, check if the word is in the map. If it is, increment its count. If not, add it to the map with a count of 1. This uses the wordMap (you don't need wordSet for this solution).
    h. If you just need to track the words, add each word on the line to the set. This uses the wordSet (you don't need wordMap for this solution).
3. That is all.

If you don't need either the map or the set, then use a List<String> and either an ArrayList or a LinkedList. If you don't need random access to the words, LinkedList is the way to go.
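
The map-based variant of the steps above (2a through 2g) could be sketched roughly like this. Treat it as one possible reading of the outline, not the only way to do it: the class name WordFrequency is made up, and splitting on "\\W+" is my assumption about what counts as a word boundary (it only knows ASCII letters, digits and underscore).

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

// Hypothetical class name; the variable names follow the steps in the answer.
class WordFrequency {

    // Reads the input line by line and counts how often each word occurs.
    static Map<String, Integer> countWords(Scanner blam) {
        Map<String, Integer> wordMap = new HashMap<String, Integer>();
        while (blam.hasNextLine()) {
            String nextLine = blam.nextLine();
            // Assumption: a "word" is a run of ASCII letters, digits or '_'.
            for (String word : nextLine.split("\\W+")) {
                if (word.isEmpty()) {
                    continue; // split() yields "" for blank lines or leading separators
                }
                Integer count = wordMap.get(word);
                wordMap.put(word, count == null ? 1 : count + 1);
            }
        }
        return wordMap;
    }

    public static void main(String[] args) throws FileNotFoundException {
        if (args.length != 1) {
            System.err.println("usage: java WordFrequency <file.txt>");
            return;
        }
        Scanner blam = new Scanner(new File(args[0]));
        Map<String, Integer> wordMap = countWords(blam);
        blam.close();

        // The total word count (duplicates included) is the sum of all counts.
        int total = 0;
        for (int count : wordMap.values()) {
            total += count;
        }
        System.out.println("Distinct words: " + wordMap.size() + ", total words: " + total);
    }
}
```

Note that the counting is case-sensitive, so "The" and "the" end up as separate keys unless you lower-case each word first.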

Upvotes: 1

Cruncher

Reputation: 7812

BufferedReader input = new BufferedReader(new FileReader(filename));

input.readLine();

This is what I use to read from files. Each call to readLine() returns the next line of the file, or null once the end is reached. Note that you have to handle the IOException.
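
To turn that into a word count, a loop along these lines should work. It is only a sketch: the class name LineWordCounter is made up, it assumes Java 7 or later for try-with-resources, and it counts whitespace-separated tokens, so punctuation stays attached to the words.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

// Hypothetical class name, used only for this example.
class LineWordCounter {

    // Counts whitespace-separated tokens, reading one line at a time.
    static int countWords(BufferedReader input) throws IOException {
        int total = 0;
        String line;
        // readLine() returns null once the end of the input is reached.
        while ((line = input.readLine()) != null) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                total += trimmed.split("\\s+").length;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java LineWordCounter <file.txt>");
            return;
        }
        // try-with-resources closes the reader even if reading fails.
        try (BufferedReader input = new BufferedReader(new FileReader(args[0]))) {
            System.out.println(countWords(input));
        }
    }
}
```

Here the IOException is simply declared on the methods; catching it in main and printing a message would work just as well.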

Upvotes: 3
