Reputation: 47
I have a log file which contains a date-time stamp and then a report for each error. Each error report starts with the date-time pattern. My shell script gets an id as a parameter, and I want to write the error report with the corresponding id to a new file. I am new to bash and tried grep and cut, but grep works one line at a time and won't capture the whole multi-line report. Reading line by line and searching for the key isn't feasible either, because the id appears 2-3 lines after the error report for that id starts. Help me! Thanks.
Below is example of log.
2015-09-25 03:34:40 ................<event>
<id>xxx</id>
<msg>.......: ErrorName1 ===
............
..........
.....
</event>
2015-09-25 03:34:42 .................<event>
<id>yyy</id>
<msg>.......: ErrorName2 ===
............
..........
.....
</event>
EDIT: The errors do not all have the same number of lines, and some of the events have the same error id. So if I request a particular error id, each of the events with that id should be put in a different file.
Upvotes: 0
Views: 300
Reputation: 1074
This catches id xxx by reading inputfile and dumps the matching result to outputfile:
grep -Poz '(?s)^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}.*?<event>.*?<id>xxx</id>.*?</event>' inputfile > outputfile
From the grep man page:
-o, --only-matching
Print only the matched (non-empty) parts of a matching line,
with each such part on a separate output line.
-z, --null-data
Treat the input as a set of lines, each terminated by a zero
byte (the ASCII NUL character) instead of a newline. Like the
-Z or --null option, this option can be used with commands like
sort -z to process arbitrary file names.
-P, --perl-regexp
Interpret PATTERN as a Perl regular expression (PCRE, see below). This is highly experimental and
grep -P may warn of unimplemented features.
(?s) enables PCRE "dotall" mode: . also matches newlines, so the pattern can span multiple lines.
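As a self-contained way to try the one-liner on the sample log from the question (the file names log.txt and event_xxx.log are made up for this sketch):

```shell
#!/bin/sh
# Recreate the question's sample log (names and bodies are illustrative).
cat > log.txt <<'EOF'
2015-09-25 03:34:40 something<event>
<id>xxx</id>
<msg>.......: ErrorName1 ===
details
</event>
2015-09-25 03:34:42 something<event>
<id>yyy</id>
<msg>.......: ErrorName2 ===
details
</event>
EOF

# -z treats the whole file as one record, so the lazy .*? can span lines;
# tr strips the trailing NUL byte that -z appends after the match.
grep -Poz '(?s)^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}.*?<event>.*?<id>xxx</id>.*?</event>' log.txt \
  | tr -d '\0' > event_xxx.log

cat event_xxx.log
```

One caveat: with -z the whole file is a single search buffer and ^ anchors at its start, so for an id that is not in the first event the lazy .*? can swallow earlier events too. The script below avoids this by buffering one event at a time.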
I made a bash script for your problem; here it is. Pass the input file as the first argument and the event id as the second argument. It saves each matching event to a different file. I hope you benefit from this. I could not find a solution other than reading line by line.
#!/bin/bash
inputfile="$1"
ID="$2"
found=0
counter=1
cumul=""

searchevent() {
    # grep -z appends a NUL byte to each match; tr strips it so the
    # command substitution captures clean text
    output=$(printf '%s' "$cumul" \
        | grep -Poz "(?s)^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}.*?<event>.*?<id>$ID</id>.*?</event>" 2>/dev/null \
        | tr -d '\0')
    if [ -n "$output" ]; then
        echo "$output" >> "outputfile_${ID}_${counter}.log"
        counter=$((counter + 1))
    fi
}

while IFS= read -r line; do
    # a timestamp marks the start of a new event
    if echo "$line" | grep -qP '^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}'; then
        found=$((found + 1))
    fi
    if [ "$found" -eq 1 ]; then
        # still inside the current event: accumulate its lines
        if [ -z "$cumul" ]; then
            cumul="$line"
        else
            cumul="$cumul"$'\n'"$line"
        fi
    elif [ "$found" -eq 2 ]; then
        # a new event started: process the buffered one, then restart the buffer
        searchevent
        found=1
        cumul="$line"
    fi
done < "$inputfile"

# process the last buffered event
if [ "$found" -eq 1 ]; then
    searchevent
fi
Upvotes: 1
Reputation: 1317
awk can help.
awk '{if ($0~/<event>/) k=1; if (k==1) print $0; if ($0~/<\/event>/) k=0}' inputfile > outputfile
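If you also need to filter by id and send each matching event to its own numbered file (as the question's edit asks), a sketch in the same spirit (input and output file names are illustrative):

```shell
#!/bin/sh
# Sample input mirroring the question's format (three events, two with id xxx).
cat > sample.log <<'EOF'
2015-09-25 03:34:40 x<event>
<id>xxx</id>
<msg>ErrorName1
</event>
2015-09-25 03:34:42 y<event>
<id>yyy</id>
<msg>ErrorName2
</event>
2015-09-25 03:34:44 z<event>
<id>xxx</id>
<msg>ErrorName3
</event>
EOF

awk -v id=xxx '
  /<event>/   { buf = $0; next }        # timestamp line opens an event
  buf != ""   { buf = buf "\n" $0 }     # accumulate the event body
  /<\/event>/ {                         # event finished: keep it only if the id matches
      if (buf ~ ("<id>" id "</id>")) {
          n++
          print buf > ("outputfile_" id "_" n ".log")
      }
      buf = ""
  }
' sample.log
```

Each event with the requested id lands in its own file (outputfile_xxx_1.log, outputfile_xxx_2.log, ...); other ids are skipped.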
Upvotes: 0
Reputation: 2724
Not sure if you're really 'splitting' the file. According to your description, you're trying to extract a part of it given some id. If each of your events has the same number of lines (as in your example data), you'll be fine with:
<your_file grep -B 1 -A 5 '<id>your_id</id>'
where -A n means n lines after the match and -B n means n lines before the match.
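For example, with the sample log from the question (file name sample.log assumed here), the id sits one line after the timestamp and five lines before the end of a 7-line event:

```shell
#!/bin/sh
# Recreate the question's fixed-length sample (7 lines per event).
cat > sample.log <<'EOF'
2015-09-25 03:34:40 ................<event>
<id>xxx</id>
<msg>.......: ErrorName1 ===
............
..........
.....
</event>
2015-09-25 03:34:42 .................<event>
<id>yyy</id>
<msg>.......: ErrorName2 ===
............
..........
.....
</event>
EOF

# 1 line before the <id> match (the timestamp) + 5 lines after (through </event>)
grep -B 1 -A 5 '<id>xxx</id>' sample.log
```

This only works while every event really is 7 lines long; once event lengths vary, the -B/-A window will cut events short or bleed into neighbours.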
Upvotes: 0