Madhur Ahuja

Reputation: 22681

Parsing log file

I am trying to parse a text like this from a log file:

[2016-01-29 11:31:33,809: WARNING/Worker-1283] 1030140:::DEAL_OF_DAY:::29:::1:::11
[2016-01-29 11:31:34,103: WARNING/Worker-1197] 1025311:::DEAL_OF_DAY:::29:::1:::11
[2016-01-29 11:31:34,291: WARNING/Worker-1197] 1025158:::DEAL_OF_DAY:::29:::1:::11

I want to extract the numbers 1030140, 1025311, 1025158, and so on.

I have tried the following:

cat deals29.txt | egrep -o '[0-9]+'

But this returns all the other digits as well.

I also tried:

cat deals29.txt | egrep -o ' [0-9]+:::'

but now the output includes the colons as well, and as far as I can tell the command-line version of grep has no way to print only a capture group.

Any suggestions? A grep solution would be preferred, but I can go with sed/awk as well if grep cannot do the job.

Upvotes: 1

Views: 60

Answers (3)

Jan

Reputation: 43169

You could use a solution like:

(\d{3,})::
# matches three or more digits followed by two colons
# and captures the digits in group 1

See a demo for this approach here.
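Since grep cannot print only a capture group, the same idea (three or more digits followed by colons, captured in group 1) can be sketched with sed, which can substitute the whole line with the group. This is only a sketch: it assumes one record per line, and `extract_ids` and `sample` are hypothetical names introduced for illustration.

```shell
# Hypothetical helper: print the ID that follows the bracketed prefix.
# -n suppresses default output; the trailing p prints only lines where
# the substitution succeeded. \1 is the captured run of 3+ digits.
extract_ids() {
  sed -n 's/^\[[^]]*] \([0-9]\{3,\}\):::.*/\1/p' "$@"
}

# Sample records copied from the question, one per line.
sample='[2016-01-29 11:31:33,809: WARNING/Worker-1283] 1030140:::DEAL_OF_DAY:::29:::1:::11
[2016-01-29 11:31:34,103: WARNING/Worker-1197] 1025311:::DEAL_OF_DAY:::29:::1:::11'

printf '%s\n' "$sample" | extract_ids
# 1030140
# 1025311
```

With a real file this would be called as `extract_ids deals29.txt`.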

Upvotes: 0

anubhava

Reputation: 785146

Using grep -oP and match reset \K:

grep -oP '^\[.*?\] \K\d+' file.log
1030140
1025311
1025158

If your grep doesn't support -P (PCRE) then use awk:

awk -F '\\] |:::' '{print $2}' file.log
1030140
1025311
1025158
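To see why `$2` is the wanted number, it may help to print every field that awk produces with that separator. The sketch below uses the same `-F` value on one record from the question; `show_fields` and `line` are hypothetical names used only for this illustration.

```shell
# Hypothetical helper: list each awk field with its index.
# The separator is either "] " or ":::", so the bracketed prefix
# becomes $1 and the ID becomes $2.
show_fields() {
  awk -F '\\] |:::' '{ for (i = 1; i <= NF; i++) printf "$%d = %s\n", i, $i }' "$@"
}

line='[2016-01-29 11:31:33,809: WARNING/Worker-1283] 1030140:::DEAL_OF_DAY:::29:::1:::11'

printf '%s\n' "$line" | show_fields
# $1 = [2016-01-29 11:31:33,809: WARNING/Worker-1283
# $2 = 1030140
# $3 = DEAL_OF_DAY
# $4 = 29
# $5 = 1
# $6 = 11
```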

Upvotes: 2

user3698890

Reputation: 19

You can practice regular expressions here: https://regex101.com/

I came up with

] [0-9]*

and then you have to delete the first two characters (the bracket and the space) from each match.
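One possible way to combine the two steps, sketched under the assumption that your grep supports `-o`: let grep print each match of that pattern, then use `cut` to drop the first two characters. `ids_with_cut` and `sample` are hypothetical names for this example.

```shell
# Hypothetical helper: grep -o prints each "] <digits>" match on its
# own line, and cut -c3- keeps everything from the third character on,
# which removes the leading "] ".
ids_with_cut() {
  grep -o '] [0-9]*' "$@" | cut -c3-
}

# Sample records copied from the question.
sample='[2016-01-29 11:31:33,809: WARNING/Worker-1283] 1030140:::DEAL_OF_DAY:::29:::1:::11
[2016-01-29 11:31:34,291: WARNING/Worker-1197] 1025158:::DEAL_OF_DAY:::29:::1:::11'

printf '%s\n' "$sample" | ids_with_cut
# 1030140
# 1025158
```

Because `grep -o` emits every match separately, this should also work if several records end up on the same line.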

Upvotes: 0
