C. de Haer

Reputation: 143

Convert Dates with Missing Months and Dates to a Specific Format in Perl

I've only been using Perl for a week, so I'm hoping someone here can help.

The script, which I had some help writing, imports a tab-delimited file into a hash, with one column containing a date stored as YYYYMMDD. The date is written to a file as Day Month Year (e.g. 20180712 is printed as 12 July 2018). I found a way to do this conversion on this site, in "How can I change the date formats in Perl?", as follows:

my $date = '20111230';
my @months = ('January','February','March','April','May','June','July','August','September','October','November','December');

if($date =~ m/^(\d{4})(\d{2})(\d{2})$/){
    print $3 . ' ' . $months[$2-1] . ' ' . $1;
}

However, sometimes the date is stored as only a year and month, and on very rare occasions as only the year. In those cases it is stored in the hash with zeroes replacing the day (and the month, if necessary). Hence I need 20180700 to be printed as July 2018, and 20180000 to be printed as 2018.

I can modify the code to check whether the last two characters are 00 and then print only the month and year, and likewise check whether the last four characters are 0000, but is there a more elegant approach?

Upvotes: 2

Views: 129

Answers (2)

zdim

Reputation: 66891

The format with 00 for a missing day or month is well defined, but it encodes special cases that are inconsistent with the yyyymmdd format. I don't see how any approach can avoid explicit tests for these special cases, where the day or month is simply left out.

I'd suggest not picking through datetimes with a regex, as there are good modules for that work. Even though this example is simple as stated, jobs tend to evolve; also, there is nothing wrong with using a good tool even in a simple case.

Using the core module Time::Piece:

use warnings;
use strict;
use feature 'say';

use Time::Piece;

my $d = shift || '20180712';

my $date = fmt_date($d);

say $date;

sub fmt_date {
    my ($date) = @_;
    # Split into year/month/day and drop zero fields, which leaves them undef
    my ($yr, $mm, $dd) = grep { $_ != 0 } unpack "A4A2A2", $date;
    my $d_fmt;

    if ($yr and $mm and $dd) {        # full date
        $d_fmt = Time::Piece
            ->strptime($date, "%Y%m%d")
            ->strftime("%d %B %Y");
    }
    elsif (not $dd and $mm) {         # year and month only; use day 01 to parse
        $d_fmt = Time::Piece
            ->strptime($yr.$mm.'01', "%Y%m%d")
            ->strftime("%B %Y");
    }
    elsif (not $mm) {                 # year only
        $d_fmt = $yr;
    }
    return $d_fmt;
}

I filter the list returned by unpack so as not to have to deal with 00 strings; this way the corresponding variables are undef, which can be tested for more simply.
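To make the filtering step concrete, here is a minimal sketch that prints what the three variables hold for each of the input shapes from the question. The helper name split_date is only for illustration and is not part of the answer's code.

```perl
use strict;
use warnings;

# Hypothetical helper: split a YYYYMMDD string into fields and drop the
# all-zero ones, so a missing month or day is undef after list assignment.
sub split_date {
    my ($date) = @_;
    return grep { $_ != 0 } unpack "A4A2A2", $date;
}

for my $raw (qw(20180712 20180700 20180000)) {
    my ($yr, $mm, $dd) = split_date($raw);
    printf "%s -> yr=%s mm=%s dd=%s\n",
        $raw, map { defined $_ ? $_ : 'undef' } $yr, $mm, $dd;
}
```

Note that filtering shifts positions: an input with a zero month but a non-zero day, such as 20180012, would put 12 into $mm. That shape is not one of the cases described in the question, but it is worth validating against if the data could contain it.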

strptime returns a Time::Piece object, on which the strftime method is called directly, returning a string in the desired format. If there is more work to do with these dates, you can of course store the object in a variable, then form the string from it and return both.
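As a small illustration of keeping the object around, the sketch below parses one date and then reads several things from the same Time::Piece object. The variable name $t is arbitrary, and month names produced by %B and fullmonth depend on the locale.

```perl
use strict;
use warnings;
use Time::Piece;

# Parse once and keep the object
my $t = Time::Piece->strptime('20180712', '%Y%m%d');

print $t->strftime('%d %B %Y'), "\n";   # formatted string for output
print $t->ymd, "\n";                     # ISO-style date from the same object
print $t->year, " ", $t->mon, "\n";      # numeric year and month
print $t->fullmonth, "\n";               # full month name
```

With the object stored, later comparisons or date arithmetic do not need to parse the string again.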

But this raises a design issue: what kind of date should it be when the day or month isn't given? When working with dates, a common solution is to set the missing parts to 01, and the application can then use just the part it wants.

This can be made more compact and perhaps "nicer", but I'd suggest not worrying about elegance when you must pick through a list of tests.

The other, bigger and far more rounded option for date-time processing is the DateTime module.


For example, the function can keep the Time::Piece object and return it along with the formatted string:

sub fmt_date {
    my ($date) = @_;
    my ($yr, $mm, $dd) = grep { $_ != 0 } unpack "A4A2A2", $date;

    my $dt_obj = Time::Piece->strptime(
        $yr . ($mm // '01') . ($dd // '01'), "%Y%m%d"  # legit format
    );

    my $d_fmt = do {
        if ($yr and $mm and $dd) { $dt_obj->strftime("%d %B %Y") }
        elsif (not $dd and $mm)  { $dt_obj->strftime("%B %Y")    }
        elsif (not $mm)          { $dt_obj->strftime("%Y")       }  # or, $yr 
    };  

    return wantarray ? ($d_fmt, $dt_obj) : $d_fmt;
}

Here wantarray reports the calling context, so the function can now be called either as

my ($date, $obj) = fmt_date($d);

or as

my $date = fmt_date($d);

depending on whether the caller wants the object for further work or not.
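The context behavior can be seen in isolation with a minimal sketch. The sub string_and_extra is a hypothetical stand-in for fmt_date, and the string length stands in for the Time::Piece object.

```perl
use strict;
use warnings;

# Hypothetical helper: two values in list context, one in scalar context.
sub string_and_extra {
    my ($str) = @_;
    my $extra = length $str;    # stand-in for the Time::Piece object
    return wantarray ? ($str, $extra) : $str;
}

my ($s, $e) = string_and_extra('July 2018');    # list context: both values
my $only    = string_and_extra('July 2018');    # scalar context: string only
my @args    = (string_and_extra('July 2018'));  # list context again

print "list: $s, $e\n";
print "scalar: $only\n";
print "count in list context: ", scalar(@args), "\n";
```

One thing to watch for: print and say take a list, so say fmt_date($d) calls the function in list context and would print the object as well. Writing say scalar fmt_date($d) forces the scalar form.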

Upvotes: 3

Perl Ancar

Reputation: 620

zdim suggested using unpack(), which is not much better than the regex in this case. So I would say that the original solution is already OK; you just need to complete it by adding a bit of code, something like:

my $date = '20111230';
my @months = ('January','February','March','April','May','June','July','August','September','October','November','December');

if ($date =~ m/^(\d{4})(\d{2})(\d{2})$/){
    # Wrap the whole expression in print(...): "print (...) . ..." would print only the first part
    print( ($3 > 0 ? $3 . ' ' : '') . ($2 > 0 ? $months[$2-1] . ' ' : '') . $1 );
} else {
    die "Invalid date: $date";
}
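The same regex idea can also be wrapped in a sub that collects the parts present in a list and joins them, which keeps the three cases readable and testable. This is a sketch under assumptions: the name format_partial_date is hypothetical, a day without a month is ignored, and a month above 12 is rejected.

```perl
use strict;
use warnings;

my @months = qw(January February March April May June
                July August September October November December);

# Hypothetical helper: returns the formatted string, or dies on malformed input.
sub format_partial_date {
    my ($date) = @_;
    my ($yr, $mm, $dd) = $date =~ /^(\d{4})(\d{2})(\d{2})$/
        or die "Invalid date: $date";
    die "Invalid month in date: $date" if $mm > 12;

    # Keep only the parts that are present (non-zero), then join with spaces.
    # Assumption: a day is only meaningful when the month is also given.
    my @parts;
    push @parts, $dd              if $dd > 0 && $mm > 0;
    push @parts, $months[$mm - 1] if $mm > 0;
    push @parts, $yr;
    return join ' ', @parts;
}

print format_partial_date($_), "\n" for qw(20180712 20180700 20180000);
```

Returning the string rather than printing inside the sub lets the caller decide where the output goes, for example into the output file mentioned in the question.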

Upvotes: 0
