user2023947

Reputation: 19

Remove duplicate email addresses based on domain

I want to remove duplicate email addresses based on the domain name, keeping only one address per domain. For example:

[email protected]
[email protected]
[email protected]

Should become:

[email protected]
[email protected]

Can anyone help with this? I tried using sort / uniq and awk, but haven't got it working yet.

Upvotes: 0

Views: 284

Answers (1)

quickshiftin

Reputation: 69681

In PHP, keeping the first address seen for each domain:

<?php
$domains   = []; // list of domains we have already included
$cleanList = []; // "clean" email list
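// load the raw list, dropping line endings and blank lines
$list = file('/path/to/email-list.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

// loop over the raw list
foreach($list as $email) {
   // strip stray whitespace (e.g. "\r" from Windows line endings)
   $email = trim($email);

   // extract the domain from the email; lowercase it so Example.com and example.com match
   $domain = strtolower(preg_replace('/^.*@/', '', $email));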

   // if the domain has not been taken yet
   if(!in_array($domain, $domains)) {
      // add it to the list of taken domains
      array_push($domains, $domain);

      // add the email to the clean list
      array_push($cleanList, $email);
   }
}

// write the clean list out to a file
file_put_contents('/tmp/clean-emails.txt', implode("\n", $cleanList));
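
Since the question mentions awk, the same first-address-per-domain idea can probably be sketched as a one-liner too. This is only a rough sketch under some assumptions: the list has one address per line, the domain is whatever follows the last "@", and domains are compared case-insensitively. The function name dedupe_by_domain is made up for illustration.

```shell
# dedupe_by_domain is a hypothetical helper name, not a standard tool.
# It reads addresses (one per line) from the file given as $1 and prints
# the first address seen for each domain.
dedupe_by_domain() {
  # -F'@' splits each line on "@", so $NF is the text after the last "@".
  # NF > 1 skips lines that contain no "@" at all.
  # seen[...]++ is 0 (false) the first time a domain appears, so the
  # negated test is true only for the first address of each domain.
  awk -F'@' 'NF > 1 && !seen[tolower($NF)]++' "$1"
}
```

It could be used as dedupe_by_domain /path/to/email-list.txt > /tmp/clean-emails.txt. Unlike sort piped into uniq, this keeps the original order of the list and does not need the input sorted first.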

Upvotes: 1
