Reputation: 569
I have a big text file which is a list of e-mails (each followed by a \n).
I would like to run a Perl command to make files with different lists based on whether the e-mail contains a certain string.
So far I have:
perl -wne'
while (/[\w\.\-]+@[\w\.\-]+\w+/g) {
print if "$&\n /gmail/;
}
' all_emails_extracted.csv | sort -u > output.txt
This should write the e-mail if it contains 'gmail', but I get syntax errors no matter how I structure the code around the print if.
Upvotes: 1
Views: 310
Reputation: 5619
The error in your code has already been pointed out, so here's another suggestion: use Email::Address:
$ cat addresses
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
bob @ yahoo.com
bob @ springfield-amusement-park.com
[email protected]
$ perl -MEmail::Address -lne 'for (Email::Address->parse($_)) { $bobs{$_->format}++ if $_->user =~ /bob/i } END { print for sort keys %bobs }' addresses
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
You said you wanted to "make files with different lists"? Email::Address can help with that, too:
while (<DATA>) {
for (Email::Address->parse($_)) {
push @{$categories{by_host}{$_->host}}, $_;
push @{$categories{bobs}}, $_ if $_->user =~ /bob/i
}
}
And then this would create a list of user names in files named after each address's hostname:
for my $host (keys %{ $categories{by_host} }) {
open my $hf, '>', "hosts.$host" or die $!;
for (@{$categories{by_host}{$host}}) {
print {$hf} $_->user, "\n"
}
close $hf
}
So, run on that last list:
$ cat hosts.springfield-amusement-park.com
bobette
bobbyMcBobberson
bob
$ cat hosts.yahoo.com
bob
bahb
bob
Upvotes: 0
Reputation: 385819
It's normally
print "$&\n";
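So if you add a statement modifier, it becomes
print "$&\n" if index($&, "gmail") >= 0;
(A bare /gmail/ there would test the whole line in $_ and, on success, overwrite $& with "gmail".)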
You are missing a quote ("), and your if is misplaced.
A bit simpler:
perl -nE'say for grep /gmail/, /[\w\.\-]+@[\w\.\-]+\w+/g'
You can even do the deduping in Perl itself, at least among the matches found on each line.
perl -MList::MoreUtils=uniq -nE'say for uniq grep /gmail/, /[\w\.\-]+@[\w\.\-]+\w+/g'
Upvotes: 4
Reputation: 46187
You have significantly overcomplicated this...
perl -wne'print if /@.*gmail/' all_emails_extracted.csv
Or, even easier (but without Perl):
grep '@.*gmail' all_emails_extracted.csv
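Both commands above print the whole matching line. If a line can hold several addresses and only the gmail ones are wanted, grep -o may help. This is a sketch that assumes a grep supporting -o (GNU and BSD grep do, though POSIX does not require it) and uses a deliberately loose character class. The file names are made up.

```shell
# Hypothetical sample input; the first line holds two addresses
printf '%s\n' 'bob@gmail.com,ann@yahoo.com' 'eve@gmail.com' 'bob@gmail.com' > sample_emails.csv

# -o prints only the matched text, -E enables extended regex syntax
grep -oE '[[:alnum:]_.-]+@[[:alnum:]_.-]*gmail[[:alnum:]_.-]*' sample_emails.csv | sort -u > gmail_only.txt

cat gmail_only.txt
```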
Upvotes: 2