Deano

Reputation: 12190

Extract information from a file to create a table

Jan 29 12:28:17 torsmtp2 postfix/cleanup[16193]: 5513512078E: warning: header Subject: Well Systems - Project Updated (Published Number 561-639-2188) from unknown[10.40.6.11]; from=<[email protected]> to=<[email protected]> proto=ESMTP helo=<CORE1UI1>

I'm trying to extract the information and create a table that contains the following:

DATE                EMAIL                Published Number
Jan 29 12:28:17     [email protected]      561-639-2188

Is it possible to use awk or sed to accomplish this?

Here is what I have so far:

head -n 1 file | awk -F ',' 'BEGIN { print "-----------------------\nDate \tEmail\tPhone\n-----------------------"} { print $1;} END { print "-------------"; }'

Output:

-----------------------
Date    Email   Phone
-----------------------
Jan 29 12:28:17 torsmtp2 postfix/cleanup[16193]: 5513512078E: warning: header Subject:       American Ramp Systems - Study Updated (Published Number 888-649-2186) from     unknown[10.40.6.11]; from=<[email protected]> to=<[email protected]> proto=ESMTP helo=    <CORE1UI1>
-------------

I'm still not sure how to extract the date, the published number, and the email.

Thank you.

Upvotes: 0

Views: 311

Answers (3)

Mirage

Reputation: 31548

Another way, using sed:

sed -re 's/(.*[0-9]:[0-9]+)(.*)Published Number ([0-9-]+)(.*)to=<(\w+@\w+\.\w+)(.*)>/\1\t\5\t\3/' temp.txt
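A variant of the same idea, as a rough sketch: it uses -n with the p flag so lines without a published number are skipped, and [^>]* for the address so recipients containing dots or hyphens still match. The function name extract_with_sed and the example.com addresses are placeholders of mine, not from the question, and I'm assuming a sed that accepts -E.

```shell
# Placeholder log line modeled on the question; the addresses are made up.
sample='Jan 29 12:28:17 torsmtp2 postfix/cleanup[16193]: 5513512078E: warning: header Subject: Well Systems - Project Updated (Published Number 561-639-2188) from unknown[10.40.6.11]; from=<sender@example.com> to=<rcpt@example.com> proto=ESMTP helo=<CORE1UI1>'

# A literal tab, so the replacement does not rely on sed understanding \t.
tab=$(printf '\t')

# Hypothetical helper: reads log lines on stdin, prints date, recipient, number.
extract_with_sed() {
  # \1 = "Mon DD HH:MM:SS", \2 = published number, \3 = address inside to=<...>
  sed -nE "s/^([A-Za-z]+ +[0-9]+ [0-9:]+) .*Published Number ([0-9-]+).* to=<([^>]*)>.*/\1${tab}\3${tab}\2/p"
}

printf '%s\n' "$sample" | extract_with_sed
```

To run it against a file: extract_with_sed < temp.txt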

Upvotes: 2

Ed Morton

Reputation: 203254

awk -v OFS='\t' '{date=$1" "$2" "$3; email=phone=$0; gsub(/.*to=<|>.*$/,"",email);
 gsub(/.*Published Number |\).*/,"",phone); print date, email, phone}' file
Jan 29 12:28:17 [email protected] 561-639-2188

Add a BEGIN section to print the header, and use printf instead of print if you want something other than tab-separated values in the output.
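As a minimal sketch of that suggestion: the header goes in BEGIN and printf pads the columns. The column widths, the make_table name, the /Published Number/ filter, and the example.com addresses are my own assumptions for illustration, so adjust them to your data.

```shell
# Placeholder log line modeled on the question; the addresses are made up.
line='Jan 29 12:28:17 torsmtp2 postfix/cleanup[16193]: 5513512078E: warning: header Subject: Well Systems - Project Updated (Published Number 561-639-2188) from unknown[10.40.6.11]; from=<sender@example.com> to=<rcpt@example.com> proto=ESMTP helo=<CORE1UI1>'

# Hypothetical helper: reads log lines on stdin, prints a padded table.
make_table() {
  awk '
    # Header row, printed once before any input is read.
    BEGIN { printf "%-17s %-22s %s\n", "DATE", "EMAIL", "Published Number" }
    # Only handle lines that actually carry a published number.
    /Published Number/ {
      date = $1 " " $2 " " $3
      email = phone = $0
      gsub(/.*to=<|>.*$/, "", email)                # keep what is inside to=<...>
      gsub(/.*Published Number |\).*/, "", phone)   # keep the number before ")"
      printf "%-17s %-22s %s\n", date, email, phone
    }'
}

printf '%s\n' "$line" | make_table
```

With a real log you would run make_table < file instead of piping the sample line.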

Upvotes: 1

jitendra

Reputation: 1458

Since this looks like a log file, I am assuming the format won't change between records.

You can extract the date using the following code:
date=$(cat extract.txt | cut -d ' ' -f -3)

You can extract the "to" email using the following snippet (I know it is a bit complicated):
email=$(cat extract.txt | sed 's/.*\( to[^ ]*\).*/\1/g' | cut -d '<' -f2 | cut -d '>' -f1)

And, the published number can be extracted as follows:
number=$(cat extract.txt | sed 's/.*Published Number \([^)]*\).*/\1/g')

I hope this helps.

Update:
The email can be extracted much more easily using the following snippet:
email=$(cat extract.txt | sed 's/.* to=<\([^>]*\).*/\1/g')
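The snippets above read the whole file at once, so they only give clean values when the file has a single record. Here is a rough sketch of how they could be applied line by line to print the table. The names extract_row and print_table and the example.com addresses are placeholders I made up for illustration.

```shell
# Placeholder log line modeled on the question; the addresses are made up.
sample='Jan 29 12:28:17 torsmtp2 postfix/cleanup[16193]: 5513512078E: warning: header Subject: Well Systems - Project Updated (Published Number 561-639-2188) from unknown[10.40.6.11]; from=<sender@example.com> to=<rcpt@example.com> proto=ESMTP helo=<CORE1UI1>'

# Hypothetical helper: $1 is one log line; prints one tab-separated row.
extract_row() {
  # First three space-separated fields. Note that cut would mis-split
  # a date written with two spaces before a single-digit day.
  date=$(printf '%s\n' "$1" | cut -d ' ' -f -3)
  email=$(printf '%s\n' "$1" | sed 's/.* to=<\([^>]*\).*/\1/')
  number=$(printf '%s\n' "$1" | sed 's/.*Published Number \([^)]*\).*/\1/')
  printf '%s\t%s\t%s\n' "$date" "$email" "$number"
}

# Hypothetical helper: header first, then one row per input line.
print_table() {
  printf 'DATE\tEMAIL\tPublished Number\n'
  while IFS= read -r line; do
    extract_row "$line"
  done
}

printf '%s\n' "$sample" | print_table
```

For the real file: print_table < extract.txt. This spawns several processes per line, so for large logs a single awk or sed pass will be faster.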

Upvotes: 1
