Reputation: 163
I have many large log files that look like this:
DATETIME ["2015-03-03 21:52"]
SERVER [{json_with_$_SERVER-Output}]
GET ["GET_JSON","AAA"]
POST ["POST_JSON","BBB","TEST1"]
DATETIME ["2015-03-03 21:53"]
SERVER [{json_with_$_SERVER-Output}]
GET ["GET_JSON","CCC"]
POST ["POST_JSON","DDD","TEST2"]
DATETIME ["2015-03-03 21:54"]
SERVER [{json_with_$_SERVER-Output}]
GET ["GET_JSON","AAA"]
POST ["POST_JSON","BBB","TEST3"]
DATETIME ["2015-03-03 21:55"]
SERVER [{json_with_$_SERVER-Output}]
GET ["GET_JSON","AAA"]
POST ["POST_JSON","EEE","TEST4"]
I want to search for two keywords that are separated by line breaks: one specific word in the GET line and one specific word in the POST line.
I need something like:
grep "GET(.*)AAA(.*)POST(.*)BBB"
What I'm searching for: AAA (in the GET line) && BBB (in the POST line).
The expected result:
POST ["POST_JSON","BBB","TEST1"]
POST ["POST_JSON","BBB","TEST3"]
What simple methods can do this?
Upvotes: 0
Views: 53
Reputation: 163
I solved this with grep -P, which gives me regular expressions as I know them from PHP, and in particular with -A to get the next n lines after each match. Then I piped the result through "|" into grep -P again to filter it.
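The exact command was not posted, so the following is only a minimal sketch of that two-step approach, assuming GNU grep (for -P) and that the POST line directly follows the GET line. The temporary file and the variable names are made up for illustration; the sample data is the log from the question.

```shell
# Hypothetical sample file holding the log excerpt from the question.
logfile=$(mktemp)
cat > "$logfile" <<'EOF'
DATETIME ["2015-03-03 21:52"]
SERVER [{json_with_$_SERVER-Output}]
GET ["GET_JSON","AAA"]
POST ["POST_JSON","BBB","TEST1"]
DATETIME ["2015-03-03 21:53"]
SERVER [{json_with_$_SERVER-Output}]
GET ["GET_JSON","CCC"]
POST ["POST_JSON","DDD","TEST2"]
DATETIME ["2015-03-03 21:54"]
SERVER [{json_with_$_SERVER-Output}]
GET ["GET_JSON","AAA"]
POST ["POST_JSON","BBB","TEST3"]
DATETIME ["2015-03-03 21:55"]
SERVER [{json_with_$_SERVER-Output}]
GET ["GET_JSON","AAA"]
POST ["POST_JSON","EEE","TEST4"]
EOF

# Step 1: select GET lines containing AAA plus the one line after each (-A1).
# Step 2: from that output, keep only POST lines containing BBB.
result=$(grep -P -A1 '^GET.*AAA' "$logfile" | grep -P '^POST.*BBB')
printf '%s\n' "$result"

rm -f "$logfile"
```

The second grep also discards the "--" separators that grep prints between context groups, so only the POST lines remain.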
Upvotes: 0
Reputation: 203664
Using GNU awk for the 3rd arg to match():
$ find . -type f |
xargs gawk -v RS= 'match($0,/\nGET.*AAA.*\n(POST.*BBB.*)/,a){print a[1]}'
POST ["POST_JSON","BBB","TEST1"]
POST ["POST_JSON","BBB","TEST3"]
Add -v ORS='\n\n' if you really want a blank line between output lines.
Upvotes: 1
Reputation: 158030
grep is the command you are looking for:
grep -rHn "GET.*KEYWORD_A" -A1 /path/to/files | grep "POST.*KEYWORD_B"
I would first grep for lines containing KEYWORD_A and append one line after each match (-A1), since the POST comes after the GET in your log files. Then search that output for KEYWORD_B.
-r greps recursively in a directory
-H prints the file name
-n prints the line number
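To see what the prefixes from -H and -n look like after the second grep, here is a small sketch run against the sample log from the question. It assumes GNU grep; the temporary directory and the file name app.log are invented for the example.

```shell
# Hypothetical directory with one log file in the question's format.
dir=$(mktemp -d)
cat > "$dir/app.log" <<'EOF'
DATETIME ["2015-03-03 21:52"]
SERVER [{json_with_$_SERVER-Output}]
GET ["GET_JSON","AAA"]
POST ["POST_JSON","BBB","TEST1"]
DATETIME ["2015-03-03 21:53"]
SERVER [{json_with_$_SERVER-Output}]
GET ["GET_JSON","CCC"]
POST ["POST_JSON","DDD","TEST2"]
DATETIME ["2015-03-03 21:54"]
SERVER [{json_with_$_SERVER-Output}]
GET ["GET_JSON","AAA"]
POST ["POST_JSON","BBB","TEST3"]
DATETIME ["2015-03-03 21:55"]
SERVER [{json_with_$_SERVER-Output}]
GET ["GET_JSON","AAA"]
POST ["POST_JSON","EEE","TEST4"]
EOF

# Matching GET lines are printed as file:line:text, while the appended
# context (POST) lines are printed as file-line-text, so the second grep
# still sees the file name and line number of each POST line.
matches=$(grep -rHn -A1 'GET.*AAA' "$dir" | grep 'POST.*BBB')
printf '%s\n' "$matches"

rm -rf "$dir"
```

Each surviving line carries the path and line number of the POST line, which helps when the search spans many large files.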
Upvotes: 0