Reputation: 43
I have a huge log.txt file from which I need to calculate the maximum and minimum values.
0-00:42:35.598 <tc_testcase>:[DEFAULT]:[PRINT]: tc_testcase.c:8963: VERIFY_CASE: range: X_PER_Y_USER_RANGE_0[1](a, 1)=121, (0)=0, (25000)=25000, (ok)
0-00:42:35.598 <tc_testcase>:[DEFAULT]:[PRINT]: tc_testcase.c:8963: VERIFY_CASE: range: X_PER_Y_USER_RANGE_0[1](a, 1)=256879, (0)=0, (25000)=25000, (ok)
0-00:42:35.598 <tc_testcase>:[DEFAULT]:[PRINT]: tc_testcase.c:8963: VERIFY_CASE: range: X_PER_Y_USER_RANGE_0[1](a, 1)=2300, (0)=0, (25000)=25000, (ok)
0-00:42:35.598 <tc_testcase>:[DEFAULT]:[PRINT]: tc_testcase.c:8963: VERIFY_CASE: range: X_PER_Y_USER_RANGE_0[1](a, 1)=56897132, (0)=0, (25000)=25000, (ok)
0-00:42:35.598 <tc_testcase>:[DEFAULT]:[PRINT]: tc_testcase.c:8963: VERIFY_CASE: range: X_PER_Y_USER_RANGE_0[1](a, 1)=12579, (0)=0, (25000)=25000, (ok)
0-00:42:35.598 <tc_testcase>:[DEFAULT]:[PRINT]: tc_testcase.c:8963: VERIFY_CASE: range: X_PER_Y_USER_RANGE_0[1](a, 1)=968746, (0)=0, (25000)=25000, (ok)
So, as a first step, I tried to collect the values into a separate file with the grep command below, intending to sort them afterwards:
grep -Po '.X_PER_Y_USER_RANGE_0.*[1].*(a, 1).\K.*' file.txt > collect.txt
But the output I get is not what I want. It looks something like this:
=121, (25000)=25000, (ok)
=256879, (25000)=25000, (ok)
=2300, (25000)=25000, (ok)
=56897132, (25000)=25000, (ok)
=12579, (25000)=25000, (ok)
=968746, (25000)=25000, (ok)
The expected output is:
121
256879
2300
56897132
12579
968746
Can anyone help me modify the grep command I'm using so that it collects the values as shown in the expected output?
Upvotes: 3
Views: 1010
Reputation: 23864
You just need to escape the special regex characters:
grep -Po 'range: X_PER_Y_USER_RANGE_0\[1\]\(a, \d\)=\K\d+' file.txt
121
256879
2300
56897132
12579
968746
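To see why the escaping matters, here is a minimal sketch on a single hypothetical sample line (shortened from the log in the question; the variable names are only for illustration). An unescaped [1] is a bracket expression that matches the single character "1", not the literal text "[1]". This holds in every grep regex flavor, so plain grep -o is enough to show it:

```shell
# Hypothetical one-line sample, shortened from the log format in the question.
line='range: X_PER_Y_USER_RANGE_0[1](a, 1)=121, (0)=0'

# Unescaped: [1] is a bracket expression matching the character "1",
# so the pattern looks for "RANGE_01" and finds nothing.
unescaped=$(printf '%s\n' "$line" | grep -o 'RANGE_0[1]' || true)

# Escaped: \[ and \] match the literal brackets.
escaped=$(printf '%s\n' "$line" | grep -o 'RANGE_0\[1\]' || true)

printf 'unescaped: "%s"\n' "$unescaped"
printf 'escaped:   "%s"\n' "$escaped"
```

The first result is empty and the second is RANGE_0[1]. The same reasoning applies to ( and ) once -P is in effect, since PCRE treats unescaped parentheses as a group.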
For a case like this, I find Perl itself preferable:
perl -lne '/ \d\)=\K\d+/ && print $&' file.txt
121
256879
2300
56897132
12579
968746
and for min and max something like this:
perl -lne '/ \d\)=\K\d+/g && push(@number,$&); END{ print "@{[sort {$a <=> $b} @number]}[0,-1]"}' file.txt
121 56897132
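Here the END { ... } block runs after all input has been read: it sorts the values in @number numerically and prints the elements at index 0 and index -1 (the first and the last), that is, the minimum and the maximum.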
To get the min and max with sort in bash instead of Perl, you can pipe your output to:
... | { mapfile -t arr; paste <(sort -n <(tr ' ' '\n' <<< ${arr[@]})) <(sort -rn <(tr ' ' '\n' <<< ${arr[@]})); }
121 56897132
2300 968746
12579 256879
256879 12579
968746 2300
56897132 121
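which reads the values into an array and then sorts it twice, once ascending and once descending, with paste putting the two results side by side. Then just pipe it to head -n 1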
e.g.
... | head -n 1
121 56897132
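Since the log is described as huge, a single pass that skips sorting may also be worth considering. This is only a sketch, not part of the answer above: it assumes the extracted values arrive one per line, and the sample values and variable names are hypothetical stand-ins for the real pipeline output.

```shell
# Hypothetical sample values, one per line, standing in for the
# output of the grep/perl extraction step.
values='121
256879
2300
56897132
12579
968746'

# Single pass: remember the smallest and largest value seen so far.
minmax=$(printf '%s\n' "$values" |
  awk 'NR == 1 { min = max = $1 + 0 }
       { v = $1 + 0; if (v < min) min = v; if (v > max) max = v }
       END { if (NR) print min, max }')

echo "$minmax"   # prints: 121 56897132
```

In real use the awk part would simply be appended to the extraction command in place of the printf. The "+ 0" forces a numeric comparison, so 2300 is not treated as larger than 12579 the way a string comparison would.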
Upvotes: 2
Reputation: 133620
You were close. Fixing the OP's attempt, this can be done in GNU grep as follows, using its -P option to enable PCRE regex.
grep -oP '.*range: X_PER_Y_USER_RANGE_0\[1\]\([a-zA-Z]+, \d+\)=\K\d+' Input_file
Explanation: The -oP options tell GNU grep to use PCRE regex and to print only the matched part of each line. The regex .*range: X_PER_Y_USER_RANGE_0\[1\]\([a-zA-Z]+, \d+\)= matches everything from the start of the line up to and including the = after the parentheses; note that [, ], ( and ) are escaped so that they are treated as literal characters. \K then discards everything matched so far, and \d+ matches the digits that follow, which is the value the OP actually wants.
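The effect of \K is easiest to see by running the same pattern with and without it. A minimal sketch, assuming GNU grep built with PCRE support; the sample line is shortened from the question and the variable names are hypothetical:

```shell
# Hypothetical sample line, shortened from the log in the question.
line='VERIFY_CASE: range: X_PER_Y_USER_RANGE_0[1](a, 1)=121, (0)=0, (25000)=25000, (ok)'

# Without \K, -o prints the whole match, prefix included.
with_prefix=$(printf '%s\n' "$line" |
  grep -oP 'X_PER_Y_USER_RANGE_0\[1\]\([a-zA-Z]+, \d+\)=\d+')

# With \K, everything matched before it is dropped from the reported match.
digits_only=$(printf '%s\n' "$line" |
  grep -oP 'X_PER_Y_USER_RANGE_0\[1\]\([a-zA-Z]+, \d+\)=\K\d+')

printf '%s\n' "$with_prefix"   # prints: X_PER_Y_USER_RANGE_0[1](a, 1)=121
printf '%s\n' "$digits_only"   # prints: 121
```

The prefix still has to match for the line to be selected; \K only changes which part of the match is printed.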
Upvotes: 6