DemCodeLines

Reputation: 1920

Parsing a MySQL log file in Java

I have a MySQL log file that records all sorts of information on each line (when a connection was made, when a query was run, when the connection ended, etc.). I have to parse the log file so I can take the data on each line, put it in an array, and then do some calculations based on it.

Here is a sample from the log file:

151011 12:52:51       1 Connect     [email protected] on testdb
              1 Query       SHOW SESSION VARIABLES
              1 Query       SHOW COLLATION
              1 Query       SET character_set_results = NULL
              1 Query       SET autocommit=1
              1 Query       SELECT q1,q2 FROM q_table
              1 Query       SELECT s1,s2 FROM s_table
              1 Query       select count(*) as c from i_table WHERE val = 1
              1 Query       select count(*) as c from k_table WHERE cid = 1
              1 Query       SELECT name,age FROM i_table WHERE ck = 1
151011 12:52:54       1 Query       SELECT name,aid FROM j_table WHERE co = 1
151011 12:52:59       1 Query       SELECT * from values where lastname='smith'

Unfortunately, the fields on each line are not separated by a tab character ("\t"). Worse, some lines have a date and time at the beginning while others don't, which means some lines have more data to parse than others. How would I parse this log file?

So far, I had the following:

Scanner scan = new Scanner(new File("data.log"));
String ln = scan.nextLine();
String[] ar = ln.split("\t");
System.out.println(ar[0]);
System.out.println(ar[1]);

But that prints the following line, for example:

151011 12:52:51                              // First slot in the array
      1 Connect     [email protected] on testdb // Second slot in the array

Is there any way to do this? Or is it just not possible?

Upvotes: 1

Views: 280

Answers (1)

Danny

Reputation: 541

It seems to me you want a regex that matches the whole line, with the following groups separated by whitespace:

  1. a date-and-time pattern (this group is optional)
  2. a number
  3. either "Connect", "Query", or any string that would be in the same place
  4. a group that starts with non-whitespace and continues with anything

    // Needs java.util.regex.Pattern and java.util.regex.Matcher
    String dateTime, number, type, message;
    // Groups: optional timestamp, number, command type, rest of the line
    Pattern pattern = Pattern.compile(
        "(\\d{6} \\d{2}:\\d{2}:\\d{2})?\\s+(\\d+)\\s+(Connect|Query)\\s+(\\S.*)");
    Matcher matcher = pattern.matcher(ln);
    
    if (matcher.matches()) {
        dateTime = matcher.group(1); // null when the line has no timestamp
        number = matcher.group(2);
        type = matcher.group(3);
        message = matcher.group(4);
    }
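
If you also want lines without a timestamp to inherit the most recent one, something along these lines might work. It is only a sketch: the class name `GeneralLogParser`, the `Entry` holder, and the carry-forward behaviour are my own assumptions rather than anything from your code, and lines that do not match the pattern are simply skipped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the class and field names are illustrative only.
final class GeneralLogParser {

    // Holds the four fields of one parsed line.
    static final class Entry {
        final String dateTime; // carried over from an earlier line; null if none seen yet
        final String number;
        final String type;
        final String message;

        Entry(String dateTime, String number, String type, String message) {
            this.dateTime = dateTime;
            this.number = number;
            this.type = type;
            this.message = message;
        }
    }

    // Same groups as above: optional timestamp, number, command type, rest of line.
    private static final Pattern LINE = Pattern.compile(
        "(\\d{6} \\d{2}:\\d{2}:\\d{2})?\\s+(\\d+)\\s+(Connect|Query)\\s+(\\S.*)");

    static List<Entry> parse(List<String> lines) {
        List<Entry> entries = new ArrayList<>();
        String lastDateTime = null;
        for (String line : lines) {
            Matcher m = LINE.matcher(line);
            if (!m.matches()) {
                continue; // not a Connect/Query line in the expected layout
            }
            if (m.group(1) != null) {
                lastDateTime = m.group(1); // remember the newest timestamp
            }
            entries.add(new Entry(lastDateTime, m.group(2), m.group(3), m.group(4)));
        }
        return entries;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "151011 12:52:54       1 Query       SELECT name,aid FROM j_table WHERE co = 1",
            "              1 Query       SET autocommit=1");
        for (Entry e : parse(sample)) {
            System.out.println(e.dateTime + " | " + e.number + " | " + e.type + " | " + e.message);
        }
    }
}
```

To run it against the real file, you could pass in `Files.readAllLines(Paths.get("data.log"))` instead of the inline sample. If your log contains other command types, the `(Connect|Query)` alternation would need to be extended to cover them.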
    

Upvotes: 2
