Reputation: 265
I have a log file with a standard format, e.g.:
31 Mar - Lorem Ipsom1
31 Mar - Lorem Ipsom2
31 Mar - Lorem Ipsom3
The replacement I want is to turn 31*31 into 31, so that I end up with a log that contains only its last line. In this example it would look like:
31 Mar - Lorem Ipsom3
I need to do this on a customized Linux machine that has no Perl. I tried to use sed like this:
sed -i -- 's/31*31/31/g' /var/log/prog/logFile
But it did nothing. Alternatives involving ninja bash commands are also welcome.
Upvotes: 1
Views: 148
Reputation: 914
I think you might be looking for "tail" to get the last line of the file, e.g.
tail -n 1 /path/file
or, if you want the last entry from each day, then "sort" might be your solution:
sort -ur -k 1,2 /path/file | sort
The -u flag specifies that only a single match for the key fields will be returned.
-k 1,2 specifies that the key fields are the first two fields, in this case the day and the month. Fields are separated by white space by default.
The -r flag reverses the lines such that the last match for each date will be returned. Sort a second time to restore the original order.
Note that POSIX only says that -u suppresses all but one line in each set of lines with equal keys, so which line survives can depend on the sort implementation; check the result on a sample first.
If your log file has more than a single month of data, and you wish to preserve order (e.g. if you have Mar 31 and Apr 1 in the same file) you can try:
cat -n tmp2 | sort -nr | sort -u -k 2,3 | sort -n | cut -f 2-
cat -n adds the line number to each line of the log file before sorting.
sort as before, but use fields 2 and 3, because field 1 is now the original line number.
sort by the original line number to restore the original order.
cut to remove the line numbers and restore the original line content.
e.g.
$ cat tmp2
30 Mar - Lorem Ipsom2
30 Mar - Lorem Ipsom1
31 Mar - Lorem Ipsom1
31 Mar - Lorem Ipsom2
31 Mar - Lorem Ipsom3
1 Apr - Lorem Ipsom1
1 Apr - Lorem Ipsom2
$ cat -n tmp2 | sort -nr | sort -u -k 2,3 | sort -n | cut -f 2-
30 Mar - Lorem Ipsom1
31 Mar - Lorem Ipsom3
1 Apr - Lorem Ipsom2
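If awk is available on the machine (an assumption; the question only rules out perl), the same "last entry per day" result can be sketched in a single pass without relying on how sort treats equal keys. The function name last_per_day is hypothetical, and the sketch assumes that all entries for one day are adjacent in the log:

```shell
# last_per_day [FILE...]: print the last line of each run of lines
# that share the same first two fields (day and month).
last_per_day() {
    awk '
        { key = $1 " " $2 }                    # day and month of this line
        NR > 1 && key != prev { print last }   # the day changed: emit the previous line
        { prev = key; last = $0 }              # remember this line for later
        END { if (NR > 0) print last }         # emit the final line of the log
    ' "$@"
}
```

Called as last_per_day tmp2, this should give the same three lines as the pipeline above.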
Upvotes: 0
Reputation: 477170
* is not a wildcard as it is in the shell; it is a quantifier. You need to quantify over . (any character). The regex is thus:
sed ':a;N;$!ba;s/31.*31/31/g'
(I removed the -i flag so you can first test your file safely.)
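To see why the pattern from the question did nothing, it can help to try both patterns on a few sample strings. A small sketch, where matches is a hypothetical helper built on grep:

```shell
# matches REGEX STRING: succeed if STRING contains a match for the
# basic regular expression REGEX.
matches() {
    printf '%s\n' "$2" | grep -q -- "$1"
}

# 31*31 means: a "3", then zero or more "1"s, then "31".
matches '31*31' '3131'       && echo "31*31 matches '3131'"
matches '31*31' '31 Mar 31'  || echo "31*31 does not match '31 Mar 31'"

# 31.*31 means: "31", then any run of characters, then "31".
matches '31.*31' '31 Mar 31' && echo "31.*31 matches '31 Mar 31'"
```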
The :a;N;$!ba; part reads the whole file into the pattern space, which makes it possible to match across newlines.
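The effect of that loop is easier to see on a tiny input. The sketch below (join_lines is a hypothetical name, and the one-line form with semicolons assumes GNU sed) uses the same prefix and then replaces every newline with a comma, which only works because all lines are in the pattern space at once:

```shell
join_lines() {
    # :a     define the label a
    # N      append the next input line to the pattern space
    # $!ba   if this is not the last line, branch back to a
    sed ':a;N;$!ba;s/\n/,/g' "$@"
}

printf '%s\n' one two three | join_lines
# prints: one,two,three
```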
Note however:
The regex will match any 31, so:
31 Mar - Lorem Ipsom1
31 Mar - Lorem 31 Ipsom2
will result in:
31 Ipsom2
The match is greedy. If the log reads:
31 Mar - Lorem Ipsom1
30 Mar - Lorem Ipsom2
31 Mar - Lorem Ipsom3
it will also remove the second line.
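Both caveats can be reproduced on the sample inputs. A minimal sketch, assuming GNU sed; collapse_31 is a hypothetical wrapper around the command shown earlier:

```shell
collapse_31() {
    # Read all of standard input into the pattern space, then
    # replace everything from the first 31 to the last 31 with 31.
    sed ':a;N;$!ba;s/31.*31/31/g'
}

# A 31 in the middle of a line is matched as well.
printf '%s\n' '31 Mar - Lorem Ipsom1' '31 Mar - Lorem 31 Ipsom2' | collapse_31
# prints: 31 Ipsom2

# The greedy match swallows the 30 Mar line.
printf '%s\n' '31 Mar - Lorem Ipsom1' '30 Mar - Lorem Ipsom2' '31 Mar - Lorem Ipsom3' | collapse_31
# prints: 31 Mar - Lorem Ipsom3
```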
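You can solve the first problem by writing:
sed -E ':a;N;$!ba;s/(^|\n)31.*\n31/\131/g'
which requires both the first and the last 31 to be located at the beginning of a line. The grouping and alternation need extended regular expressions, hence -E (older GNU sed versions spell it -r), and \1 puts back the newline, if any, that preceded the first 31.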
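A sketch of the anchored variant in extended-regex syntax, assuming GNU sed with -E; collapse_31_anchored is a hypothetical name. It leaves a 31 in the middle of a line alone, but the greedy caveat still applies:

```shell
collapse_31_anchored() {
    # (^|\n)31  a 31 at the start of the input or right after a newline
    # .*\n31    greedily up to the last 31 that starts a line
    # \1        restores the newline (if any) in front of the first 31
    sed -E ':a;N;$!ba;s/(^|\n)31.*\n31/\131/g'
}

printf '%s\n' '31 Mar - Lorem Ipsom1' '31 Mar - Lorem 31 Ipsom2' | collapse_31_anchored
# prints: 31 Mar - Lorem 31 Ipsom2
```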
Upvotes: 2
Reputation: 44063
A way to keep only the last of consecutive lines that match a pattern is
sed -n '/^31/ { :a $!{ h; n; //ba; x; G } }; p' filename
This works as follows:
/^31/ {      # if a line begins with 31
  :a         # jump label for looping
  $!{        # if the end of input has not been reached (otherwise the
             # current line is the last line of the block by virtue of
             # being the last line)
    h        # hold the current line
    n        # fetch the next line (note that this doesn't print the
             # line because of -n)
    //ba     # if that line also begins with 31, go to :a. // attempts
             # the most recently attempted regex again, which was ^31
    x        # swap hold buffer, pattern space
    G        # append hold buffer to pattern space. The PS now contains
             # the last line of the block followed by the first line
             # that comes after it
  }
}
p            # in the end, print the result
This avoids some problems of multi-line regular expressions, such as matches that begin or end in the middle of a line. It also keeps the lines between two blocks of matching lines, and it keeps the last line of each block.
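To try this on a sample, the script can be wrapped in a small shell function (keep_last_31 is a hypothetical name). The sketch writes each sed command on its own line, which is meant to make it less dependent on how a particular sed parses one-liners:

```shell
# keep_last_31 [FILE...]: from every block of consecutive lines that
# begin with 31, print only the last one; print all other lines as-is.
keep_last_31() {
    sed -n '
/^31/ {
  :a
  $!{
    h
    n
    //ba
    x
    G
  }
}
p' "$@"
}

printf '%s\n' '31 Mar - A' '31 Mar - B' '30 Mar - C' '31 Mar - D' '31 Mar - E' | keep_last_31
# prints:
# 31 Mar - B
# 30 Mar - C
# 31 Mar - E
```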
Upvotes: 4