Reputation: 19
I want to remove duplicate email addresses based on the domain name. For example:
[email protected]
[email protected]
[email protected]
Should become:
[email protected]
[email protected]
Can anyone help with this? I tried using sort / uniq and awk, but haven't got it working yet.
Upvotes: 0
Views: 284
Reputation: 69681
In PHP, keeping the first address seen for each domain:
<?php
$domains = [];   // domains we have already included, stored as array keys
$cleanList = []; // "clean" email list

// load the raw list, dropping line endings and blank lines
$list = file('/path/to/email-list.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

// loop over the raw list
foreach ($list as $email) {
    $email = trim($email); // strip stray whitespace such as a trailing "\r"
    if ($email === '') {
        continue;
    }
    // extract the domain (everything after the last @), lowercased so that
    // Example.com and example.com count as the same domain
    $domain = strtolower(preg_replace('/^.*@/', '', $email));
    // if the domain has not been taken yet
    if (!isset($domains[$domain])) {
        // mark the domain as taken
        $domains[$domain] = true;
        // add the email to the clean list
        $cleanList[] = $email;
    }
}

// write the clean list out to a file, one address per line
file_put_contents('/tmp/clean-emails.txt', implode("\n", $cleanList) . "\n");
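Since the question mentions awk, a single-pass awk filter may also do the job. This is only a sketch, assuming one address per line and that the domain is everything after the last @. The function name dedupe_by_domain and the file names in the usage comment are placeholders, not anything from the question.

```shell
# Hypothetical helper: keep the first address seen for each domain.
# Reads one address per line on stdin, writes the kept addresses to stdout.
dedupe_by_domain() {
    awk -F'@' '
        { sub(/\r$/, "") }                 # drop a trailing carriage return (CRLF files)
        NF > 1 && !seen[tolower($NF)]++    # print only the first line for each domain
    '
}

# Usage (file names are placeholders):
# dedupe_by_domain < email-list.txt > clean-emails.txt
```

Unlike sort / uniq, this keeps the original order of the list and compares domains case-insensitively. Lines without an @ are skipped.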
Upvotes: 1