Reputation: 1
I need to split a large syslog file that runs from October 2015 to February 2016 into separate files by month. Due to background log retention, the format of these logs is similar to:
Oct 21 08:00:00 - Log info
Nov 16 08:00:00 - Log Info
Dec 25 08:00:00 - Log Info
Jan 11 08:00:00 - Log Info
Feb 16 08:00:00 - Log Info
This large file is the result of an initial zgrep search across a large number of log files split by day, for example user activity on a network across multiple services such as Windows, firewall, and physical access logs.
For a previous request, I used the following:
gawk 'BEGIN{
m=split("Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec",mth,"|")
}
{
for(i=1;i<=m;i++){ if ( mth[i]==$1){ month = i } }
tt="2015 "month" "$2" 00 00 00"
date= strftime("%Y%m",mktime(tt))
print $0 > FILENAME"."date".txt"
}
' logfile
Output file examples (note that I sometimes add "%d" to get the day, but not this time):
Test.201503.txt
Test.201504.txt
Test.201505.txt
Test.201506.txt
This script, however, hard-codes 2015 in the output file name. What I attempted, and failed, to write was a script that maps each month to a number from 1 to 12 and keeps 2015 in one variable (a) and 2016 in another (b). Reading the months in the order 10, 11, 12, 1, 2, the script would compare each month with the previous one, and once it sees 1 < 12 (the previous month) it would know to use 2016 instead of 2015. An odd request, I know, but any ideas would at least help me get into the right mindset.
Upvotes: 0
Views: 449
Reputation: 10149
Here is a gawk solution based on your script and your observation in the question. The idea is to detect a new year when the month number suddenly gets smaller, e.g. from 12 to 1. This assumes the lines are in chronological order. (Of course, it will not work if the log has Jan 2015 directly followed by Jan 2016.)
script.awk
BEGIN {
    START_YEAR = 2015
    # configure months and a mapping month -> nr, e.g. "Feb" |-> "02"
    split("Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec", monthNames, "|")
    for (nr in monthNames) { month2Nr[ monthNames[nr] ] = sprintf("%02d", nr) }
    yearCounter = 0
}
{
    currMonth = month2Nr[$1]
    # detect a jump to the next year by a reset in the month number
    if (prevMonth > currMonth) { yearCounter++ }
    newFilename = sprintf("%s.%d%s.txt", FILENAME, (START_YEAR + yearCounter), currMonth)
    prevMonth = currMonth
    print $0 > newFilename
}
Use it like this: awk -f script.awk logfile
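One case the script above does not guard against is a line whose first field is not a month name (a blank line, say): the lookup then yields an empty string, which compares as smaller than the previous month and bumps the year. A minimal sketch of a hardened variant, wrapped in a shell function so it is easy to try out, might look like this. The function name split_by_month and the optional start-year argument are my own assumptions, not part of the original script.

```shell
#!/bin/sh
# Hypothetical helper: split_by_month LOGFILE [START_YEAR]
# Writes LOGFILE.YYYYMM.txt files; START_YEAR defaults to 2015.
split_by_month() {
    awk -v start_year="${2:-2015}" '
    BEGIN {
        n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
        for (i = 1; i <= n; i++) month_nr[names[i]] = i
    }
    # Skip lines that do not start with a month name, so they are
    # never mistaken for a year rollover.
    !($1 in month_nr) { next }
    {
        curr = month_nr[$1]
        if (prev > curr) year_offset++    # e.g. Dec (12) -> Jan (1)
        prev = curr
        out = sprintf("%s.%d%02d.txt", FILENAME, start_year + year_offset, curr)
        print > out
    }
    ' "$1"
}
```

For the file in the question, split_by_month logfile 2015 should produce logfile.201510.txt through logfile.201602.txt, provided the lines are in chronological order and every month has at least one line.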
Upvotes: 1
Reputation: 2691
You could use date to parse the date and time, e.g.:
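#!/bin/bash
# Requires GNU date (--date). Months before October are assumed to be 2016.
while IFS=- read -r time info; do
    # strip the leading zero so that 08 and 09 are not read as octal
    mon=$(date --date "$time" +%m | sed 's/^0//')
    if (( mon < 10 )); then
        year=2016
    else
        year=2015
    fi
    # append (>>) so earlier lines of the same month are not overwritten
    echo "$time-$info" >> "Test.$year$(printf "%02d" "$mon").txt"
done < logfile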
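Spawning date and sed once per line can get slow on a large file. Since the question states that the log only covers October 2015 to February 2016, a rough alternative is to map the month name straight to a year-month string with a case statement and skip date entirely. This is only a sketch under that assumption. The function name split_fixed_years is made up for illustration, and the output is named after the input file rather than Test.

```shell
#!/bin/sh
# Hypothetical helper: split_fixed_years LOGFILE
# Assumes the log only covers Oct 2015 through Feb 2016.
split_fixed_years() {
    logfile=$1
    while IFS= read -r line; do
        # ${line%% *} is the first space-separated word, i.e. the month name
        case ${line%% *} in
            Oct) ym=201510 ;;
            Nov) ym=201511 ;;
            Dec) ym=201512 ;;
            Jan) ym=201601 ;;
            Feb) ym=201602 ;;
            *) continue ;;    # skip anything outside the assumed range
        esac
        # append, so earlier lines of the same month are kept
        printf '%s\n' "$line" >> "$logfile.$ym.txt"
    done < "$logfile"
}
```

Called as split_fixed_years logfile, it appends each line to logfile.YYYYMM.txt, so any output files left over from an earlier run should be removed first.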
Upvotes: 1