Terezi

Reputation: 33

Java File I/O help

I have a problem with my code. I need to do several operations on a log file with this structure:

190.12.1.100 2011-03-02 12:12 test.html  
190.12.1.100 2011-03-03 13:18 data.html  
128.33.100.1 2011-03-03 15:25 test.html  
128.33.100.1 2011-03-04 18:30 info.html

I need to get the number of visits per month, the number of visits per page, and the number of unique visitors based on the IP. That part is not the question; I managed to get all three operations working. The problem is that only the first operation I choose runs correctly, while the others just return values of 0 afterwards, as if the file were empty, so I am guessing I made a mistake with the I/O somewhere. Here's the code:

import java.io.*;
import java.util.*;

public class WebServerAnalyzer {

private Map<String, Integer> hm1;
private Map<String, Integer> hm2;
private int[] months;
private Scanner input;

public WebServerAnalyzer() throws IOException {
  hm1 = new HashMap<String, Integer>();
  hm2 = new HashMap<String, Integer>();
  months = new int[12];
  for (int i = 0; i < 12; i++) {
      months[i] = 0;
  }
  File file = new File("webserver.log");
  try {
      input = new Scanner(file);
  } catch (FileNotFoundException fne) {
      input = null;
  }
}

public String nextLine() {
  String line = null;
  if (input != null && input.hasNextLine()) {
    line = input.nextLine();
  }
  return line;
}

public int getMonth(String line) {
  StringTokenizer tok = new StringTokenizer(line);
  if (tok.countTokens() == 4) {
    String ip = tok.nextToken();
    String date = tok.nextToken();
    String hour = tok.nextToken();
    String page = tok.nextToken();
    StringTokenizer dtok = new StringTokenizer(date, "-");
    if (dtok.countTokens() == 3) {
      String year = dtok.nextToken();
      String month = dtok.nextToken();
      String day = dtok.nextToken();
      int m = Integer.parseInt(month);
      return m;
    }
  }
  return -1;
}

public String getIP(String line) {
  StringTokenizer tok = new StringTokenizer(line);
  if (tok.countTokens() == 4) {
    String ip = tok.nextToken();
    String date = tok.nextToken();
    String hour = tok.nextToken();
    String page = tok.nextToken();
    StringTokenizer dtok = new StringTokenizer(date, "-");
    return ip;
  }
  return null;
}

public String getPage(String line) {
  StringTokenizer tok = new StringTokenizer(line);
  if (tok.countTokens() == 4) {
    String ip = tok.nextToken();
    String date = tok.nextToken();
    String hour = tok.nextToken();
    String page = tok.nextToken();
    StringTokenizer dtok = new StringTokenizer(date, "-");
    return page;
  }
  return null;
}

public void visitsPerMonth() {
  String line = null;
  do {
    line = nextLine();
    if (line != null) {
      int m = getMonth(line);
      if (m != -1) {
        months[m - 1]++;
      }
    }
  } while (line != null);

  // Print the result
  String[] monthName = {"JAN ", "FEB ", "MAR ",
      "APR ", "MAY ", "JUN ", "JUL ", "AUG ", "SEP ",
      "OCT ", "NOV ", "DEC "};
  for (int i = 0; i < 12; i++) {
    System.out.println(monthName[i] + months[i]);
  }
}

public int count() throws IOException {
  InputStream is = new BufferedInputStream(new FileInputStream("webserver.log"));
  try {
    byte[] c = new byte[1024];
    int count = 0;
    int readChars = 0;
    while ((readChars = is.read(c)) != -1) {
      for (int i = 0; i < readChars; ++i) {
        if (c[i] == '\n')
          ++count;
      }
    }
    return count;
  } finally {
    is.close();
  }
}


public void UniqueIP() throws IOException{
  String line = null;
  for (int x = 0; x <count(); x++){
    line = nextLine();
    if (line != null) {
      if(hm1.containsKey(getIP(line)) == false) {
        hm1.put(getIP(line), 1);
      } else {
        hm1.put(getIP(line), hm1.get(getIP(line)) +1 );
      }
    }
  }

  Set set = hm1.entrySet();
  Iterator i = set.iterator();
  System.out.println("\nNumber of unique visitors: " + hm1.size());
  while(i.hasNext()) {
    Map.Entry me = (Map.Entry)i.next();
    System.out.print(me.getKey() + " - ");
    System.out.println(me.getValue() + " visits");
  }
}

public void pageVisits() throws IOException{
  String line = null;
  for (int x = 0; x <count(); x++){
    line = nextLine();
    if (line != null) {
      if(hm2.containsKey(getPage(line)) == false)
        hm2.put(getPage(line), 1);
      else
        hm2.put(getPage(line), hm2.get(getPage(line)) +1 );
    }
  }
  Set set = hm2.entrySet();
  Iterator i = set.iterator();
  System.out.println("\nNumber of pages visited: " + hm2.size());
  while(i.hasNext()) {
    Map.Entry me = (Map.Entry)i.next();
    System.out.print(me.getKey() + " - ");
    System.out.println(me.getValue() + " visits");
  }
}

Any help figuring out the problem would be much appreciated as I am quite stuck.

Upvotes: 0

Views: 250

Answers (2)

Jörn Horstmann

Reputation: 34014

The reset method of BufferedReader that Thomas recommended would only work if the file size is smaller than the buffer size or if you called mark with a large enough read ahead limit.
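To make the mark/reset constraint concrete, here is a minimal sketch, not a drop-in fix. The class name MarkResetDemo and the method countLinesTwice are made up for illustration, and a StringReader stands in for the log file so the snippet is self-contained. It calls mark() before the first pass with a read-ahead limit larger than the whole input, which is what lets reset() rewind.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical demo class; not part of the original question's code.
final class MarkResetDemo {

    // Counts the lines of the input twice, rewinding with mark()/reset().
    // readAheadLimit should be larger than the number of chars read between
    // mark() and reset(); otherwise the mark may be invalidated.
    static int[] countLinesTwice(Reader source, int readAheadLimit) {
        try (BufferedReader br = new BufferedReader(source)) {
            br.mark(readAheadLimit);      // remember the current position
            int first = countLines(br);   // first pass reads to end of input
            br.reset();                   // throws IOException if the mark is invalid
            int second = countLines(br);  // second pass starts at the mark again
            return new int[] {first, second};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static int countLines(BufferedReader br) throws IOException {
        int n = 0;
        while (br.readLine() != null) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String log = "190.12.1.100 2011-03-02 12:12 test.html\n"
                   + "128.33.100.1 2011-03-03 15:25 test.html\n";
        // Limit chosen strictly larger than the input so the mark survives
        // the read that hits end of input.
        int[] counts = countLinesTwice(new StringReader(log), log.length() + 1);
        System.out.println(counts[0] + " " + counts[1]);
    }
}
```

For a large log this means buffering the whole file in the reader, which is one more reason to prefer a single pass.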

I would recommend reading through the file once and updating your maps and month array for each line. BTW, you don't need a Scanner just to read lines; BufferedReader has a readLine method itself.

BufferedReader br = ...;
String line;
while (null != (line = br.readLine())) {
    String ip = getIP(line);
    String page = getPage(line);
    int month = getMonth(line);
    // update hashmaps and arrays
}
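A fuller version of that loop might look like the sketch below. It is only one way to fill in the "update hashmaps and arrays" step: the class SinglePassLogStats and its field names are hypothetical, the parsing uses String.split instead of the question's StringTokenizer helpers, and Map.merge assumes Java 8 or later. It takes any Reader, so for the real file you would pass something like new FileReader("webserver.log").

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical class: gathers all three statistics in one pass over the log.
final class SinglePassLogStats {
    final int[] months = new int[12];
    final Map<String, Integer> visitsPerIp = new HashMap<>();
    final Map<String, Integer> visitsPerPage = new HashMap<>();

    static SinglePassLogStats read(Reader source) {
        SinglePassLogStats stats = new SinglePassLogStats();
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                // Expected layout: ip date(yyyy-MM-dd) time page
                String[] fields = line.trim().split("\\s+");
                if (fields.length != 4) {
                    continue; // skip blank or malformed lines
                }
                String[] date = fields[1].split("-");
                if (date.length != 3) {
                    continue;
                }
                int month;
                try {
                    month = Integer.parseInt(date[1]);
                } catch (NumberFormatException e) {
                    continue; // month field is not a number
                }
                if (month < 1 || month > 12) {
                    continue;
                }
                stats.months[month - 1]++;
                stats.visitsPerIp.merge(fields[0], 1, Integer::sum);
                stats.visitsPerPage.merge(fields[3], 1, Integer::sum);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return stats;
    }

    public static void main(String[] args) {
        String log = "190.12.1.100 2011-03-02 12:12 test.html\n"
                   + "190.12.1.100 2011-03-03 13:18 data.html\n"
                   + "128.33.100.1 2011-03-03 15:25 test.html\n"
                   + "128.33.100.1 2011-03-04 18:30 info.html\n";
        SinglePassLogStats stats = read(new StringReader(log));
        System.out.println("Visits in March: " + stats.months[2]);
        System.out.println("Unique visitors: " + stats.visitsPerIp.size());
        System.out.println("Visits per page: " + stats.visitsPerPage);
    }
}
```

Because every line is read exactly once, the order in which you later print the three results no longer matters.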

Upvotes: 2

Thomas

Reputation: 88707

I didn't read the code thoroughly yet, but I guess you're not setting the read position back to the beginning of the file when you start a new operation. Thus nextLine() would return null.

You should create a new Scanner for each operation and close it afterwards. AFAIK Scanner doesn't provide a method to go back to the first byte.
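As a rough sketch of that idea (the class FreshScannerPerOperation, the method countByColumn and the helper writeSampleLog are hypothetical names, and try-with-resources assumes Java 7 or later), each operation opens its own Scanner and closes it when done, so every call starts at the first line:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

// Hypothetical class: one Scanner per operation instead of one shared field.
final class FreshScannerPerOperation {
    private final Path logFile;

    FreshScannerPerOperation(Path logFile) {
        this.logFile = logFile;
    }

    // Counts occurrences of the value in the given column (0 = ip, 3 = page).
    // A new Scanner is opened on every call and closed by try-with-resources.
    Map<String, Integer> countByColumn(int column) {
        Map<String, Integer> counts = new HashMap<>();
        try (Scanner in = new Scanner(logFile)) {
            while (in.hasNextLine()) {
                String[] fields = in.nextLine().trim().split("\\s+");
                if (fields.length == 4) {
                    counts.merge(fields[column], 1, Integer::sum);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counts;
    }

    // Writes the sample lines from the question to a temp file (demo only).
    static Path writeSampleLog() {
        try {
            Path file = Files.createTempFile("webserver", ".log");
            Files.write(file, Arrays.asList(
                    "190.12.1.100 2011-03-02 12:12 test.html",
                    "190.12.1.100 2011-03-03 13:18 data.html",
                    "128.33.100.1 2011-03-03 15:25 test.html",
                    "128.33.100.1 2011-03-04 18:30 info.html"));
            file.toFile().deleteOnExit();
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        FreshScannerPerOperation log = new FreshScannerPerOperation(writeSampleLog());
        System.out.println("Visits per IP:   " + log.countByColumn(0));
        System.out.println("Visits per page: " + log.countByColumn(3));
    }
}
```

The file is read once per operation, which is fine for a small log but wasteful for a large one.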

Currently I could also think of 3 alternatives:

  1. Use a BufferedReader, call mark() right after opening it, and call reset() for each new operation. Note that reset() throws an IOException if mark() was never called, so it will not go back to byte 0 on its own.

  2. Read the file contents once and iterate over the lines in memory, i.e. put all lines into a List<String> and then start at each line.

  3. Read the file once, parse each line and construct an appropriate data structure that contains the data you need. For example, you could use a TreeMap<Date, Map<Page, Map<IPAdress, List<Visit>>>>, i.e. you'd store the visits per ip address per page for each date. You could then select the appropriate submaps by date, page and ip address.
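Alternative 2 could be sketched roughly as follows. The class InMemoryLog and its methods are hypothetical names, and Files.readAllLines assumes Java 8 or later. The lines are loaded once into a List<String>, and each operation simply iterates over that list from the start.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class: keeps the log lines in memory so they can be
// iterated any number of times.
final class InMemoryLog {
    private final List<String> lines;

    InMemoryLog(List<String> lines) {
        this.lines = new ArrayList<>(lines);
    }

    // Reads the whole file once; suitable only if the log fits in memory.
    static InMemoryLog load(Path file) {
        try {
            return new InMemoryLog(Files.readAllLines(file));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Counts occurrences of the value in the given field (0 = ip, 3 = page).
    Map<String, Integer> countByField(int index) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            String[] fields = line.trim().split("\\s+");
            if (fields.length == 4) {
                counts.merge(fields[index], 1, Integer::sum);
            }
        }
        return counts;
    }

    // Index 0 is January; lines without a valid yyyy-MM-dd date are skipped.
    int[] visitsPerMonth() {
        int[] months = new int[12];
        for (String line : lines) {
            String[] fields = line.trim().split("\\s+");
            if (fields.length != 4) {
                continue;
            }
            String[] date = fields[1].split("-");
            if (date.length == 3 && date[1].matches("\\d{1,2}")) {
                int month = Integer.parseInt(date[1]);
                if (month >= 1 && month <= 12) {
                    months[month - 1]++;
                }
            }
        }
        return months;
    }

    public static void main(String[] args) {
        InMemoryLog log = new InMemoryLog(Arrays.asList(
                "190.12.1.100 2011-03-02 12:12 test.html",
                "190.12.1.100 2011-03-03 13:18 data.html",
                "128.33.100.1 2011-03-03 15:25 test.html",
                "128.33.100.1 2011-03-04 18:30 info.html"));
        System.out.println("Visits in March: " + log.visitsPerMonth()[2]);
        System.out.println("Visits per page: " + log.countByField(3));
        System.out.println("Visits per IP:   " + log.countByField(0));
    }
}
```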

Upvotes: 4
