Jaelebi

Reputation: 6119

parse an email message for sender name in bash

I have multiple files in a folder, and each of them contains one email message. Each message has a header in the following format:

Subject: formatting fonts
To: [email protected]
From: sender name

message body

I want to get all the unique sender names from all the messages (there is only one message per file). How can I do that?

Upvotes: 2

Views: 4499

Answers (3)

jabbie

Reputation: 2716

To tighten up some of the other answers (I don't have enough reputation to comment yet), the following should be sufficient:

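grep -h -m 1 '^From: ' * | sed 's/^From: *//' | sort -u

This gives you a list of the unique From values for all the messages in the directory. The -m 1 option makes grep stop at the first match in each file, and -h keeps it from prefixing every line with the file name, which would otherwise stop the sed pattern from matching. If you want to clean up the address portion, you can add more to the sed command, as in che's answer. There is no need to 'cat * | grep'.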

Upvotes: 0

John

Reputation: 15296

Assuming there can't be random headers in the middle of the messages, then this should do the trick:

cat * | grep '^From: ' | sort -u

If there may be other misleading "From:" lines in the middle of the messages, then you just need to make sure you are only getting the first matching line from each message, like so:

for f in * ; do grep '^From: ' "$f" | head -n 1 ; done | sort -u

Obviously you can replace the * in either command with a different glob or list of file names.
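If a stray "From:" line in a body is a concern, another option is to stop reading each file at the blank line that ends the header block. Here is a minimal sketch of that idea, assuming headers are separated from the body by an empty line as in the question; the function name first_from_headers is made up for illustration:

```shell
# Print the value of the first From: header in each file passed as an argument.
# Only the header block (everything before the first blank line) is examined.
first_from_headers() {
    for f in "$@"; do
        # Skip directories and anything else that is not a regular file.
        [ -f "$f" ] || continue
        # /^$/q      quit at the blank line that ends the headers
        # /^From: /  strip the "From: " label, print the rest, then quit
        sed -n -e '/^$/q' -e '/^From: /{s/^From: *//;p;q;}' "$f"
    done
}
```

It could then be called as first_from_headers * | sort -u. Note that the sort -u has to sit outside the loop so that it sees the output for all files at once.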

Upvotes: 2

che

Reputation: 12273

Do you want to extract sender names or e-mail addresses? Usually you have both in "From" lines, such as

From: Lessie <[email protected]>

Then you can use sed to remove the e-mail address part:

sed 's/^From: //;s/ *<[^>]*> *//'

ending up with something like this:

for filename in *
do
    # Quote the variable so file names with spaces are handled correctly.
    grep '^From: ' "$filename" | head -n1 | sed 's/^From: //;s/ *<[^>]*> *//;s/^"//;s/"$//'
done | sort -u
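To make the sed part easier to check on its own, it can be wrapped in a small filter and fed a few sample lines. This is only a sketch: the name sender_name is hypothetical, the addresses in the usage lines below are placeholders, and it anchors the address pattern at the end of the line, which is slightly stricter than the expression above.

```shell
# Reduce From: lines read from standard input to the sender's display name.
sender_name() {
    # 1. drop the "From: " label
    # 2. drop a trailing <address> part, if there is one
    # 3. drop double quotes surrounding the name
    sed -e 's/^From: *//' \
        -e 's/ *<[^>]*> *$//' \
        -e 's/^"//' -e 's/"$//'
}
```

For example, printf 'From: Lessie <lessie@example.org>\n' | sender_name prints Lessie, and a quoted name such as "Doe, Jane" comes out without the quotes. A line that contains only an address in angle brackets is reduced to an empty line.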

Upvotes: 1
