Reputation: 3433
Context:
I am working on log analysis: a regex-based program to count occurrences of certain requests sent from a client to a server. I have the client log file, which contains these requests along with other log output.
Problem: When a request message is sent to the server, the client should write 2 log statements like:
sending..
message_type
When both statements are found, we can say one request has been sent; the two lines form a combined pattern.
We expect the log file content to look like:
sending..
message_type
...//other text
sending..
message_type
...//other text
sending..
message_type
From the above log we can say the client has sent 3 messages. But in the actual log file the patterns somehow overlap, as below (not for all messages, but for some):
sending..(1)
...//other text
sending..(2)
message_type(2)
...//other text
message_type(1)
sending..(3)
message_type(3)
Still 3 requests (I numbered the messages for clarity), but the patterns overlap, i.e. the second message was logged before the first message was fully logged. The above explanation is only for understanding. Below is part of the original log:
Original log
Send message to server:
Created post notification log dir
Created post notification log dir
Created post notification log dir
Send message to server:
Created post notification log dir
<?xml version="1.0" encoding="UTF-8" standalone="yes"?><message schema_version="3644767c-2632-411a-9416-44f8a7dee08e"><request xaction_guid="new xaction guid" type="createsession"/></message>
INFO [a] - Server Response: <?xml version="1.0" encoding="UTF-8" standalone="yes"?><message schema_version="3644767c-2632-411a-9416-44f8a7dee08e"><response xaction_guid="new xaction guid" type="ok"></params></response></message>
<?xml version="1.0" encoding="UTF-8" standalone="yes"?><message schema_version="3644767c-2632-411a-9416-44f8a7dee08e"><request xaction_guid="new xaction guid" type="createsession"/></message>
INFO [a] - Server Response: <?xml version="1.0" encoding="UTF-8" standalone="yes"?><message schema_version="3644767c-2632-411a-9416-44f8a7dee08e"><response xaction_guid="new xaction guid" type="ok"></response></message>
Here, as per the explanation, a single request is identified by its 2 parts:
Send message to server:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?><message schema_version="3644767c-2632-411a-9416-44f8a7dee08e"><request xaction_guid="new xaction guid" type="createsession"/></message>
What I tried
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogMatcher {
    static final String create_session = "Send message to server(.){10,1000}(<\\?xml(.){10,500}type=\"createsession\"(.){1,100}</message>)";

    public static void main(String[] args) throws IOException {
        BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream(new File("D:/dummy.txt")))); // I put the above log in this file
        StringBuilder b = new StringBuilder();
        String line = "";
        while ((line = reader.readLine()) != null) {
            b.append(line);
        }
        findMatch(b, "Send message to server", "Send message to server");
        findMatch(b, create_session, "create_session");
    }

    private static int findMatch(StringBuilder b, String pattern, String type) {
        int count = 0;
        Pattern regex = Pattern.compile(pattern, Pattern.MULTILINE);
        Matcher regexMatcher = regex.matcher(b.toString());
        while (regexMatcher.find()) {
            count++;
        }
        System.out.printf("%25s%2d\n", type + ": ", count);
        return count;
    }
}
Current Output
The intention is to find the number of createsession messages sent:
Send message to server: 2
create_session: 1
Expected output
From the log it is clear that 2 messages were sent, so the output should be:
Send message to server: 2
create_session: 2
You can see the pattern I have tried in my code. Can anyone suggest a pattern to get the desired result?
Note: One could ask why not simply count Send message to server alone. The log contains many types of messages, such as login and closesession, and all of them have Send message to server as their first part. The message types are also logged on their own for other purposes, so we can't rely on either part alone; only the combination is reliable.
Upvotes: 2
Views: 82
Reputation: 6511
Find occurrence of certain requests sent to a server from a client.
The "other way" can be neglected here; it will have Store in DB : instead of Send message to server and the xml message.
I'd propose a new strategy:
- Match Send message to server: and the type=\"createsession\" xmls independently.
- Match Store in DB: xmls, but ignore them (don't increment the counter).
We can use the following expression to match the number of messages sent to server.
^(?<toserver>Send message to server:)
Check regexMatcher.group("toserver") to increment the counter. And match the target xmls independently as:
^(?<message><\? *xml\b.{10,500} type *= *\"createsession\")
Check regexMatcher.group("message") to increment the counter for these.
So, how do we ignore Store in DB: xmls? We can match them, while not creating a capture.
^Store in DB ?:\r?\n(?:.*\n)*?<\? *xml\b.*
This matches the literal Store in DB :, followed by \r?\n(?:.*\n)*? (as few lines as possible), until <\? *xml\b.* matches the first <?xml line.
Regex
^(?:Store in DB ?:\r?\n(?:.*\n)*?<\? *xml\b.*|(?<toserver>Send message to server:)|(?<message><\? *xml\b.{10,500} type *= *\"createsession\"))
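To see why the non-capturing Store in DB alternative matters, here is a minimal sketch that counts the message group with and without it. The class name StoreInDbDemo, the helper countMessages, and the shortened sample log are assumptions made up for illustration; only the regex pieces come from the expressions above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StoreInDbDemo {
    // The target-xml alternative from above, as a Java string literal.
    static final String MESSAGE =
        "(?<message><\\? *xml\\b.{10,500} type *= *\"createsession\")";

    // Only the message alternative: every createsession xml at line start is counted.
    static final String WITHOUT_IGNORE = "^" + MESSAGE;

    // The "Store in DB" alternative comes first and consumes its xml line,
    // so the message alternative never gets a chance to match that line.
    static final String WITH_IGNORE =
        "^(?:Store in DB ?:\\r?\\n(?:.*\\n)*?<\\? *xml\\b.*|" + MESSAGE + ")";

    // Shortened, hypothetical sample: one request sent to the server, one stored in the DB.
    static final String LOG =
        "Send message to server:\n"
      + "<?xml version=\"1.0\"?><message><request type=\"createsession\"/></message>\n"
      + "Store in DB :\n"
      + "<?xml version=\"1.0\"?><message><request type=\"createsession\"/></message>\n";

    // Counts only the matches in which the named group "message" participated.
    static int countMessages(String regex, String log) {
        Matcher m = Pattern.compile(regex, Pattern.MULTILINE).matcher(log);
        int count = 0;
        while (m.find()) {
            if (m.group("message") != null) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("without ignore: " + countMessages(WITHOUT_IGNORE, LOG)); // 2
        System.out.println("with ignore:    " + countMessages(WITH_IGNORE, LOG));    // 1
    }
}
```

The order of the alternatives does not matter much here because each one can only start at a different kind of line; what matters is that the Store in DB match swallows the following xml line before the scan reaches it.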
Code
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogMatcher {
    static final String create_session = "^(?:Store in DB ?:\\r?\\n(?:.*\\n)*?<\\? *xml\\b.*|(?<toserver>Send message to server:)|(?<message><\\? *xml\\b.{10,500} type *= *\\\"createsession\\\"))";

    public static void main(String[] args) throws java.lang.Exception {
        // for testing purposes
        final String text = "Send message to server:\nCreated post notification log dir\nCreated post notification log dir\nCreated post notification log dir\nSend message to server:\nCreated post notification log dir\n<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?><message schema_version=\"3644767c-2632-411a-9416-44f8a7dee08e\"><request xaction_guid=\"new xaction guid\" type=\"createsession\"/></message>\nStore in DB :\n<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?><message schema_version=\"3644767c-2632-411a-9416-44f8a7dee08e\"><request xaction_guid=\"new xaction guid\" type=\"createsession\"/></message>\nINFO [a] - Server Response: <?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?><message schema_version=\"3644767c-2632-411a-9416-44f8a7dee08e\"><response xaction_guid=\"new xaction guid\" type=\"ok\"></params></response></message>\n<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?><message schema_version=\"3644767c-2632-411a-9416-44f8a7dee08e\"><request xaction_guid=\"new xaction guid\" type=\"createsession\"/></message>\nINFO [a] - Server Response: <?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?><message schema_version=\"3644767c-2632-411a-9416-44f8a7dee08e\"><response xaction_guid=\"new xaction guid\" type=\"ok\"></response></message>";
        System.out.println("INPUT:\n" + text + "\n\nCOUNT:");
        StringBuilder b = new StringBuilder();
        b.append(text);
        findMatch(b, create_session, "create_session");
    }

    private static int findMatch(StringBuilder b, String pattern, String type) {
        int count = 0;     // counter for "Send message to server:"
        int countType = 0; // counter for "type=\"createsession\""
        // MULTILINE makes ^ match at the start of every line, not only at the start of the input
        Pattern regex = Pattern.compile(pattern, Pattern.MULTILINE);
        Matcher regexMatcher = regex.matcher(b.toString());
        while (regexMatcher.find()) {
            if (regexMatcher.group("toserver") != null) {
                count++;
            } else if (regexMatcher.group("message") != null) {
                countType++;
            } else {
                // Ignoring "Store in DB :\n<?xml...."
            }
        }
        System.out.printf("%25s%2d\n%25s%2d\n", "to server: ", count, type + ": ", countType);
        return countType;
    }
}
Output
INPUT:
Send message to server:
Created post notification log dir
Created post notification log dir
Created post notification log dir
Send message to server:
Created post notification log dir
<?xml version="1.0" encoding="UTF-8" standalone="yes"?><message schema_version="3644767c-2632-411a-9416-44f8a7dee08e"><request xaction_guid="new xaction guid" type="createsession"/></message>
Store in DB :
<?xml version="1.0" encoding="UTF-8" standalone="yes"?><message schema_version="3644767c-2632-411a-9416-44f8a7dee08e"><request xaction_guid="new xaction guid" type="createsession"/></message>
INFO [a] - Server Response: <?xml version="1.0" encoding="UTF-8" standalone="yes"?><message schema_version="3644767c-2632-411a-9416-44f8a7dee08e"><response xaction_guid="new xaction guid" type="ok"></params></response></message>
<?xml version="1.0" encoding="UTF-8" standalone="yes"?><message schema_version="3644767c-2632-411a-9416-44f8a7dee08e"><request xaction_guid="new xaction guid" type="createsession"/></message>
INFO [a] - Server Response: <?xml version="1.0" encoding="UTF-8" standalone="yes"?><message schema_version="3644767c-2632-411a-9416-44f8a7dee08e"><response xaction_guid="new xaction guid" type="ok"></response></message>
COUNT:
to server: 2
create_session: 2
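The test above uses a hard-coded string. To run the same regex against the real log file, the line breaks must be kept, because ^ with Pattern.MULTILINE and the \n in the Store in DB alternative depend on them (appending the result of readLine() without a separator joins everything into one line). A minimal sketch, assuming the file is UTF-8 encoded and small enough to load into memory; the class name LogFileCounter and the helpers readLog and count are made up for illustration, and D:/dummy.txt is the path from the question:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogFileCounter {
    // Same combined expression as in the answer's code.
    static final String CREATE_SESSION =
        "^(?:Store in DB ?:\\r?\\n(?:.*\\n)*?<\\? *xml\\b.*"
      + "|(?<toserver>Send message to server:)"
      + "|(?<message><\\? *xml\\b.{10,500} type *= *\"createsession\"))";

    // Reads the whole file, keeping line breaks.
    // Assumption: UTF-8 encoding. Windows line endings are normalized to \n
    // so that the (?:.*\n)*? part of the expression can step over lines.
    static String readLog(Path path) throws IOException {
        String content = new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        return content.replace("\r\n", "\n");
    }

    // Returns {number of "Send message to server:" lines, number of createsession xmls}.
    static int[] count(String log) {
        Matcher m = Pattern.compile(CREATE_SESSION, Pattern.MULTILINE).matcher(log);
        int toServer = 0;
        int createSession = 0;
        while (m.find()) {
            if (m.group("toserver") != null) {
                toServer++;
            } else if (m.group("message") != null) {
                createSession++;
            }
            // Otherwise the match was a "Store in DB" block, which is ignored.
        }
        return new int[] { toServer, createSession };
    }

    public static void main(String[] args) throws IOException {
        // Path taken from the question; pass another path as the first argument.
        Path path = Paths.get(args.length > 0 ? args[0] : "D:/dummy.txt");
        int[] counts = count(readLog(path));
        System.out.printf("%25s%2d%n%25s%2d%n",
            "to server: ", counts[0], "create_session: ", counts[1]);
    }
}
```

For very large logs, loading the whole file may not be practical; in that case a line-by-line scan with separate per-line patterns would be an alternative, at the cost of handling the Store in DB lookahead by hand.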
Upvotes: 1