mkki

Reputation: 313

Count Word Pairs Java

I have this programming assignment and it is the first time in our class that we are writing code in Java. I have asked my instructor and could not get any help.

The program needs to count word pairs from a file, and display them like this:

abc:
   hec, 1

That means that "abc" was followed by "hec" exactly once in the text file. I have to use the Java Collections Framework. Here is what I have so far.

import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.FileNotFoundException;
import java.util.Scanner;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.ArrayList;

// By default, this code will get its input data from the Java standard input,
// java.lang.System.in. To allow input to come from a file instead, which can be
// useful when debugging your code, you can provide a file name as the first
// command line argument. When you do this, the input data will come from the
// named file instead. If the input file is in the project directory, you will
// not need to provide any path information.
//
// In BlueJ, specify the command line argument when you call main().
//
// In Eclipse, specify the command line argument in the project's "Run Configuration."

public class Assignment1
{
    // returns an InputStream that gets data from the named file
    private static InputStream getFileInputStream(String fileName)
    {
    InputStream inputStream;

    try {
        inputStream = new FileInputStream(new File(fileName));
    }
    catch (FileNotFoundException e) {       // no file with this name exists
        System.err.println(e.getMessage());
        inputStream = null;
    }
    return inputStream;
    }

    public static void main(String[] args)
    {
    // Create an input stream for reading the data.  The default is
    // System.in (which is the keyboard).  If there is an arg provided
    // on the command line then we'll use the file instead.

    InputStream in = System.in;
    if (args.length >= 1) {
        in = getFileInputStream(args[0]);

    }

    // Now that we know where the data is coming from we'll start processing.  
    // Notice that getFileInputStream could have generated an error and left "in"
    // as null.  We should check that here and avoid trying to process the stream
    // data if there was an error.

    if (in != null) {

        // Using a Scanner object to read one word at a time from the input stream.

        @SuppressWarnings("resource")
        Scanner sc = new Scanner(in);   
        String word;

        System.out.printf("CS261 - Assignment 1 - Matheus Konzen Iser%n%n");

        // Continue getting words until we reach the end of input 
        List<String> inputWords = new ArrayList<String>();
        Map<String, List<String>> result = new HashMap<String, List<String>>();

        while (sc.hasNext()) {  
        word = sc.next();       
        if (!word.equals("---")) {

            // do something with each word in the input
            // replace this line with your code (probably more than one line of code)

            inputWords.add(word);
        }

            for(int i = 0; i < inputWords.size() - 1; i++){

                // Create references to this word and next word:
                String thisWord = inputWords.get(i);
                String nextWord = inputWords.get(i+1);

                // If this word is not in the result Map yet,
                // then add it and create a new empty list for it.
                if(!result.containsKey(thisWord)){
                    result.put(thisWord, new ArrayList<String>());
                }

                // Add nextWord to the list of adjacent words to thisWord:
                result.get(thisWord).add(nextWord);
            }


            //OUTPUT
            for(Entry e : result.entrySet()){
                System.out.println(e.getKey() + ":");

                // Count the number of unique instances in the list:
                Map<String, Integer>count = new HashMap<String, Integer>();
                List<String>words = (List)e.getValue();
                for(String s : words){
                    if(!count.containsKey(s)){
                        count.put(s, 1);
                    }
                    else{
                        count.put(s, count.get(s) + 1);
                    }
                }

                // Print the occurrences of following words:
                for(Entry f : count.entrySet()){
                    System.out.println("   " + f.getKey() + ", " + f.getValue());
                }
            }


        }
        System.out.printf("%nbye...%n");
    }
    }
}

The problem that I'm having now is that the code after the block below seems to run way too many times:

if (!word.equals("---")) {

    // do something with each word in the input
    // replace this line with your code (probably more than one line of code)

    inputWords.add(word);
}

Does anyone have any ideas or tips on this?

Upvotes: 1

Views: 2703

Answers (1)

Darshan Rivka Whittle

Reputation: 34071

I find this part confusing:

while (sc.hasNext()) {  
    word = sc.next();       
    if (!word.equals("---")) {
        // do something with each word in the input
        // replace this line with your code (probably more than one line of code)

        inputWords.add(word);
    }

    for(int i = 0; i < inputWords.size() - 1; i++){

I think you probably mean something more like this:

// Add all words (other than "---") into inputWords
while (sc.hasNext()) {  
    word = sc.next();       
    if (!word.equals("---")) {
        inputWords.add(word);
    }
}

// Now iterate over inputWords and process each adjacent pair
for (int i = 0; i < inputWords.size() - 1; i++) {

It looks like you're trying to read all the words into inputWords first and then process them, but your code iterates through the whole list after every word that you add. Each pass re-adds every pair seen so far to result, so the counts are inflated and the output is printed again for each word.

Note that the - 1 in the for loop condition should stay: the loop body reads inputWords.get(i + 1), so the last word has no following word to pair with, and removing the - 1 would throw an IndexOutOfBoundsException.
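
Putting the pieces together, here is a minimal sketch of the read-first, process-afterwards structure, assuming Java 8 or later. The class name WordPairs and the helper countPairs are made up for illustration and are not part of the assignment. It also swaps in a nested TreeMap instead of the HashMap of lists from the question, so that counts are tallied directly and the output comes out in sorted order; that is a design choice of this sketch, not a requirement.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Scanner;
import java.util.TreeMap;

// Hypothetical class name, used only for this sketch.
public class WordPairs {

    // Counts how often each word is immediately followed by each other word.
    // Outer key: a word; inner map: following word -> number of occurrences.
    static Map<String, Map<String, Integer>> countPairs(List<String> words) {
        Map<String, Map<String, Integer>> pairs = new TreeMap<>();

        // Stop one short of the end: the last word has no successor.
        for (int i = 0; i < words.size() - 1; i++) {
            String thisWord = words.get(i);
            String nextWord = words.get(i + 1);

            // Create the inner map on first sight of thisWord,
            // then add 1 to the count for nextWord.
            pairs.computeIfAbsent(thisWord, k -> new TreeMap<>())
                 .merge(nextWord, 1, Integer::sum);
        }
        return pairs;
    }

    public static void main(String[] args) {
        Scanner sc = new Scanner(System.in);

        // Step 1: read every word (other than "---") before doing anything else.
        List<String> inputWords = new ArrayList<>();
        while (sc.hasNext()) {
            String word = sc.next();
            if (!word.equals("---")) {
                inputWords.add(word);
            }
        }

        // Step 2: count the pairs once, then print them once.
        for (Map.Entry<String, Map<String, Integer>> e : countPairs(inputWords).entrySet()) {
            System.out.println(e.getKey() + ":");
            for (Map.Entry<String, Integer> f : e.getValue().entrySet()) {
                System.out.println("   " + f.getKey() + ", " + f.getValue());
            }
        }
    }
}
```

With the input "abc hec abc hec xyz", this should list hec with a count of 2 under abc, and abc and xyz with a count of 1 each under hec. Because the counting happens after the reading loop has finished, each pair is tallied exactly once.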

Upvotes: 1
