user2870

Reputation: 487

Extracting words from file but each word once

I want to write a Perl program that reads a file and extracts the dates in it. However, if a date occurs more than once, I want to print it only once. For example:

On 01/10/2011 I went home. On 02/02/2012, I
went to my school. On 02/02/2012, I went
to London.

The output should be:

01/10/2011
02/02/2012

I can do it by adding the dates to an array and checking that array every time I read a new date, but I am looking for a more efficient way. Is there a logical way to do it, or a suitable data structure in Perl?

Upvotes: 3

Views: 128

Answers (2)

G. Cito

Reputation: 6378

If you are open to installing a module to do this (I know it seems like overkill), List::MoreUtils has a uniq function. Everyone avert your eyes ... it's Friday afternoon, very hot and possibly time to slurp (-0777) beer:

perl -'MList::MoreUtils qw(uniq)' -0777nE '@dates = m|(\d\d/\d\d/\d{4})|g; say for uniq(@dates)' file.txt

Sorry ;-)
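If installing a module is not an option, the same idea can be sketched with core Perl only. This is a minimal sketch, not the module's actual implementation: the helper name unique_in_order is hypothetical, and the sample text from the question is inlined instead of reading file.txt.

```perl
use strict;
use warnings;

# Hypothetical helper: return the items of a list with duplicates removed,
# keeping the first occurrence of each item in its original position.
sub unique_in_order {
    my %seen;
    # $seen{$_}++ is 0 (false) the first time an item appears, so grep keeps it;
    # on later appearances it is already positive and the item is dropped.
    return grep { !$seen{$_}++ } @_;
}

# Sample text from the question, used here instead of reading file.txt.
my $text = "On 01/10/2011 I went home. On 02/02/2012, I\n"
         . "went to my school. On 02/02/2012, I went\n"
         . "to London.\n";

# In list context, m//g returns every captured date in the text.
my @dates = $text =~ m|(\d\d/\d\d/\d{4})|g;

print "$_\n" for unique_in_order(@dates);
```

The hash lookup makes each duplicate check constant time on average, which is the efficiency gain over rescanning an array for every new date.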

Upvotes: 0

mpapec

Reputation: 50637

This scans the file line by line, looking for dates in \d\d/\d\d/\d{4} format, and saves them as keys in a hash.

When the file has been read, it prints these unique keys.

perl -nE '$s{$_}++ for m| (\d\d/\d\d/\d{4}) |xg;}{say for sort keys %s' file

It can be translated into a more readable form (plus some checks):

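use strict;
use warnings;

# Open the input file, or stop with the system error message.
open my $fh, "<", "file" or die $!;

# Hash whose keys are the dates seen so far; the values count occurrences.
my %s;
while (my $line = <$fh>) {

  # Collect every dd/dd/dddd match on this line.
  my @dates = $line =~ m| (\d\d/\d\d/\d{4}) |xg;

  for my $date (@dates) {
    $s{$date} += 1;
  }
}
close $fh;

# Print each distinct date once, in string-sorted order.
for my $date (sort keys %s) {
  print $date, "\n";
}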
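Note that sort keys %s compares the dates as plain strings, so 02/02/2012 would be printed before 15/01/2012. If chronological order matters, one option is to sort on a rearranged key. The sketch below assumes the dates are day/month/year, which the question does not state; sort_key is a hypothetical helper and the third date is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "dd/mm/yyyy" into "yyyymmdd" so that plain string
# comparison matches chronological order. The day-first layout is an assumption.
sub sort_key {
    my ($date) = @_;
    my ($day, $month, $year) = split m{/}, $date;
    return "$year$month$day";
}

# Stand-in for the %s hash built while reading the file.
my %s = map { $_ => 1 } qw(02/02/2012 01/10/2011 15/01/2012);

# Prints 01/10/2011, then 15/01/2012, then 02/02/2012.
for my $date (sort { sort_key($a) cmp sort_key($b) } keys %s) {
    print $date, "\n";
}
```

For month-first dates, swapping $day and $month in the split assignment would be enough.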

Upvotes: 2
