Reputation: 1
I have approximately 96K text emails from which I want to extract the sender's address. I believe I can use domdoc for this, but I need someone to get me started. Can anyone advise whether there is a better way of doing this?
Thanks, Jim
Upvotes: 0
Views: 746
Reputation: 1118
I see no reason to do this in PHP. Provided the files are in some form of flat text, copy them to a directory (for example, emails/), then run this from inside it:
cat * | grep "From: " | egrep -oi '\b[A-Za-z0-9._%-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}' | sort | uniq > mail.list
Of course if you have to do this in PHP then
Upvotes: 2
Reputation: 1356
A regular expression in some form is probably the best way to do it. If you can save your text emails to files, you can use an editor such as TextPad to search them for email addresses with that regular expression.
You should be able to find regular expressions for email addresses online.
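If a script is acceptable, here is a minimal sketch of an alternative to a hand-written regex. It is in Python rather than PHP, and it assumes each email is plain text with RFC 822-style headers at the top. The function name sender_address is made up for illustration. It reads only the header block, so a "From: " that happens to appear in the message body is ignored.

```python
from email.parser import HeaderParser
from email.utils import parseaddr


def sender_address(raw_message: str) -> str:
    """Return the bare address from the From: header, or '' if there is none.

    Hypothetical helper: assumes raw_message is the full text of one email.
    """
    # HeaderParser stops at the first blank line, so the body is not scanned.
    headers = HeaderParser().parsestr(raw_message)
    # parseaddr splits 'Jim <jim@example.com>' into ('Jim', 'jim@example.com').
    _name, address = parseaddr(headers.get("From", ""))
    return address.lower()
```

To get a de-duplicated list, call this on the contents of each file and collect the non-empty results in a set.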
Upvotes: 0