Reputation: 1332
I have a requests log like this:
[11/Jun/2020:15:35:20 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX time=72161.647 memory=2 cpu=0.01%
[11/Jun/2020:15:22:13 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX time=70564.992 memory=2 cpu=0.00%
[11/Jun/2020:15:35:26 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX time=70252.369 memory=2 cpu=0.00%
[11/Jun/2020:15:01:02 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX time=60159.409 memory=2 cpu=0.03%
[11/Jun/2020:14:59:03 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX time=106956.770 memory=2 cpu=0.01%
[11/Jun/2020:15:37:56 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX time=60014.014 memory=2 cpu=0.00%
[11/Jun/2020:16:45:38 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX time=61264.044 memory=2 cpu=0.02%
[11/Jun/2020:15:01:48 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX time=58733.325 memory=2 cpu=0.02%
[11/Jun/2020:15:31:35 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX time=68882.501 memory=2 cpu=0.03%
[11/Jun/2020:14:59:46 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX time=57021.375 memory=2 cpu=0.00%
[11/Jun/2020:14:59:46 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX time=137172.179 memory=2 cpu=0.01%
[11/Jun/2020:15:35:39 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX time=107954.112 memory=2 cpu=0.00%
[11/Jun/2020:16:12:22 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX time=55877.479 memory=2 cpu=0.02%
[11/Jun/2020:15:26:19 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX time=55912.678 memory=2 cpu=0.00%
[11/Jun/2020:15:36:33 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX time=54738.373 memory=2 cpu=0.02%
I have a script that sorts by time, memory and cpu, but it only works if I remove the static string time= before sorting.
cat /var/log/requests.log | sed -e "s/time=//" | sort -k 7 -n -r | head -50
I get
[11/Jun/2020:14:59:46 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX 137172.179 memory=2 cpu=0.01%
[11/Jun/2020:15:35:39 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX 107954.112 memory=2 cpu=0.00%
[11/Jun/2020:14:59:03 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX 106956.770 memory=2 cpu=0.01%
[11/Jun/2020:15:35:20 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX 72161.647 memory=2 cpu=0.01%
[11/Jun/2020:15:22:13 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX 70564.992 memory=2 cpu=0.00%
[11/Jun/2020:15:35:26 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX 70252.369 memory=2 cpu=0.00%
[11/Jun/2020:15:31:35 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX 68882.501 memory=2 cpu=0.03%
[11/Jun/2020:16:45:38 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX 61264.044 memory=2 cpu=0.02%
[11/Jun/2020:15:01:02 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX 60159.409 memory=2 cpu=0.03%
[11/Jun/2020:15:37:56 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX 60014.014 memory=2 cpu=0.00%
[11/Jun/2020:15:01:48 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX 58733.325 memory=2 cpu=0.02%
[11/Jun/2020:14:59:46 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX 57021.375 memory=2 cpu=0.00%
[11/Jun/2020:15:26:19 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX 55912.678 memory=2 cpu=0.00%
[11/Jun/2020:16:12:22 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX 55877.479 memory=2 cpu=0.02%
[11/Jun/2020:15:47:01 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX 55443.752 memory=2 cpu=0.02%
I want to sort the list without removing the sort key.
[11/Jun/2020:14:59:46 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX time=137172.179 memory=2 cpu=0.01%
[11/Jun/2020:15:35:39 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX time=107954.112 memory=2 cpu=0.00%
[11/Jun/2020:14:59:03 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX time=106956.770 memory=2 cpu=0.01%
[11/Jun/2020:15:35:20 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX time=72161.647 memory=2 cpu=0.01%
[11/Jun/2020:15:22:13 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX time=70564.992 memory=2 cpu=0.00%
[11/Jun/2020:15:35:26 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX time=70252.369 memory=2 cpu=0.00%
[11/Jun/2020:15:31:35 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX time=68882.501 memory=2 cpu=0.03%
[11/Jun/2020:16:45:38 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX time=61264.044 memory=2 cpu=0.02%
[11/Jun/2020:15:01:02 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX time=60159.409 memory=2 cpu=0.03%
[11/Jun/2020:15:37:56 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX time=60014.014 memory=2 cpu=0.00%
[11/Jun/2020:15:01:48 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX time=58733.325 memory=2 cpu=0.02%
[11/Jun/2020:14:59:46 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX time=57021.375 memory=2 cpu=0.00%
[11/Jun/2020:15:26:19 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX time=55912.678 memory=2 cpu=0.00%
[11/Jun/2020:16:12:22 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX time=55877.479 memory=2 cpu=0.02%
[11/Jun/2020:15:47:01 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX time=55443.752 memory=2 cpu=0.02%
I have tried this, without success:
cat /var/log/requests.log | sort -k 7.6 -n -r | head -50
Update: /endpoint stands for real endpoints, so they can include a query string.
Update 2: I need to be able to sort by any key=value column (as a number).
Upvotes: 1
Views: 174
Reputation: 189377
If your input is properly representative, you can simply use = as the column separator instead.
sort -t = -k3 -k4 -k5 -n -r /var/log/requests.log
Notice also how we avoid the useless cat.
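To see why the keys are 3, 4 and 5, it may help to print how one sample line from the question splits on =. This is only an illustrative sketch; show_fields is a hypothetical helper name, not part of sort or Awk.

```shell
# Hypothetical helper: print each '='-separated field of a line with its index.
show_fields() {
  awk -F= '{ for (i = 1; i <= NF; ++i) printf "%d: %s\n", i, $i }'
}

# Sample line copied from the question.
line='[11/Jun/2020:15:35:20 +0000] 200 GET /endpoint ip=XXX.XXX.XXX.XXX time=72161.647 memory=2 cpu=0.01%'
printf '%s\n' "$line" | show_fields
```

Field 3 starts with the time value, field 4 with the memory value and field 5 with the cpu value. An endpoint whose query string contains = would shift this numbering, which is why the approach depends on the input being representative.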
More generally, you could use a simple Awk script to extract the sort fields and put them first, then sort on those, then discard them (known as the Schwartzian transform).
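awk '{ for (i = 1; i <= NF; ++i)
         if ($i ~ /^(time|memory|cpu)=/) {
           split($i, f, "=")                        # f[1] is the key name
           a[f[1]] = substr($i, length(f[1]) + 2)   # everything after "key="
         }
       # Decorate: put the sort values first, tab-separated
       print a["time"] "\t" a["memory"] "\t" a["cpu"] "\t" $0 }' /var/log/requests.log |
sort -r -n |   # sort numerically, largest first
cut -f4-       # undecorate: drop the three helper columns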
The if statement pulls out any field which starts with one of the prefixes we are interested in and populates the associative array a with their respective values. You can add more keys here if you like, or switch to a more general regular expression if you want to extract everything which contains an equals sign after a sequence of alphabetics, for example. Once we have looped over all the fields, we extract the values from the array in the order we wish to use for sorting.
Demo: https://ideone.com/dU9v95
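For the "sort by any key=value column" requirement in Update 2, the same decorate, sort, undecorate idea can be wrapped in a small function that takes the key name as an argument. This is a minimal sketch rather than a tested production script; sort_by_key is a hypothetical name, and it assumes the key=value pairs are separate whitespace-delimited fields, as in the sample log.

```shell
# Hypothetical helper: sort_by_key KEY [FILE...]
# Prints the input lines in descending numeric order of the value found in
# the first whitespace-separated field that starts with KEY=.
sort_by_key() {
  key=$1
  shift
  awk -v key="$key" '{
    val = ""                          # reset so a missing key does not reuse an old value
    prefix = key "="
    for (i = 1; i <= NF; ++i)
      if (index($i, prefix) == 1) {   # the field starts with KEY=
        val = substr($i, length(prefix) + 1)
        break
      }
    print val "\t" $0                 # decorate: sort value first, then the line
  }' "$@" |
  LC_ALL=C sort -t "$(printf '\t')" -k1,1nr |  # numeric, descending, first column only
  cut -f2-                                     # undecorate: drop the helper column
}
```

For example, sort_by_key memory /var/log/requests.log | head -50 would list the 50 requests with the largest memory value. Because only fields that start with the key are matched, a query string such as /endpoint?time=1 inside the URL does not interfere. LC_ALL=C is set so that the decimal point is always read as a period.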
Upvotes: 3