MaMu

Reputation: 1869

Finding & replacing all dates in file

I have the following file:

    20120127.221500.std|MT:63|ST:1.|ON:ABT.N|DRT:U|SEQ:862461707
      80 Bezahlt        : 55.04
      81 Bezahlt_Umsatz : 200
     281 Bezahlt_Zeit   : 22:00:02
     752 Quelle         : CTS OTC
      83 Umsatz_gesamt  : 5639295
     621 VWAP           : 54.984104
      26 Zeit           : 22:00:05

    20120127.232408.std|MT:63|ST:1.|ON:ABT.N|DRT:U|SEQ:862507497
      41 Schluss        : 55.02
     120 Schluss_Datum  : 27.01.2012

    20120128.011558.std|MT:63|ST:1.|ON:ABT.N|DRT:U|SEQ:862559511
      25 Datum          : 28.01.2012
      26 Zeit           : 01:01:30

I want to find all dates (e.g. 27.01.2012, 28.01.2012) and replace the newest one (here 28.01.2012) with today's date. All older dates should be shifted by the same number of days, so that they keep their distance from the newest date. An example makes this clearer. Assume today is 21.11.2012: I want to replace 28.01.2012 with 21.11.2012, and 27.01.2012 with 20.11.2012. If there were a 26.01.2012, I'd like to replace it with 19.11.2012.

Could anyone give me a clue how to do this?

Maybe some hints on what the algorithm should look like? I would love to do it in Perl.

My problem is how to determine the oldest date. I have begun with something like:

    open F ,"<$file";
    my $content = do{local $/;<F> };
    if ($content =~ /BOERSEN : [N|Q]/)
    {
      $content =~ /(\d\d\.\d\d\.\d\d\d\d)/;
      my $d = $1;
      my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
      $year+= 1900;
      $mon +=1;
      $mon = sprintf("%02d", $mon);
      $content =~ s/(\d\d)\.\d\d\.\d\d\d\d/$1\.$mon\.$year/msgi;
      my @d = split (/\./, $d);
      $d = $d[2].$d[1];
      $content =~ s/$d(\d\d)/$year$mon$1/msgi;
    }

But it is not really what I want.

Upvotes: 3

Views: 185

Answers (4)

simbabque

Reputation: 54333

I fooled around a bit and came up with this. It needs to read the complete input first, but then it works.

#!/usr/bin/perl
use strict; use warnings;
use DateTime;
use DateTime::Format::Strptime;

my $text = <<'TEXT';
foo 27.01.2012 27-01-2012
foo 28.01.2012 28-01-2012
foo 26.01.2012 26-01-2012
bar 10.07.2011 10-07-2011
TEXT

# Formatter to make DateTime objects
my $strp = DateTime::Format::Strptime->new(
    pattern   => '%d.%m.%Y',
);
my $today = DateTime->today; # we need that to calculate

# Get all the dates from the input and turn them into DateTime objects
my %dates = map { $_ => $strp->parse_datetime($_) }
    $text =~ m/(\d{2}\.\d{2}\.\d{4})/gm;

# Determine the latest date (the one nearest to today) and clone it
my $max_date = (sort { DateTime->compare( @dates{$a, $b} ) } keys %dates )[-1];
$max_date = $dates{$max_date}->clone;

foreach my $date ( keys %dates ) {
    # The new value needs to have the same "distance" to today as the old one
    # had to the highest date from the input

    # Do that calculation and format it
    my $new_date = $strp->format_datetime(
        $today - ($max_date - $dates{$date}));
    # Needs \Q and \E because there are '.' in the date
    $text =~ s/\Q$date\E/$new_date/g;
}

print $text;

Here's the output:

foo 22.11.2012 27-01-2012
foo 23.11.2012 28-01-2012
foo 21.11.2012 26-01-2012
bar 05.05.2012 10-07-2011

Upvotes: 3

Borodin

Reputation: 126722

The Time::Piece module is satisfactory for this purpose, and it is a core module so shouldn't need installing.

This program grabs the current date and time, and then sets the time fields to zero by formatting it as a %d.%m.%Y string and reading it back in. Then it opens and reads through the log file, looking at all the dates and finding the latest one. The delta between the latest date in the file and the current date is calculated, and the file is rewound to the beginning and read again. This time every date has the calculated delta added to it and the string is replaced in the output.

use strict;
use warnings;

use Time::Piece ();
use Fcntl ':seek';

my $today = Time::Piece->new;
$today = Time::Piece->strptime($today->dmy('.'), '%d.%m.%Y');

open my $fh, '<', 'logfile.txt' or die $!;

my $latest = 0;

while (<$fh>) {
  if ( /:\s*(\d\d\.\d\d\.\d\d\d\d)/ ) {
    my $date = Time::Piece->strptime($1, '%d.%m.%Y');
    $latest = $date if $date > $latest;
  }
}

my $delta = $today - $latest;
seek $fh, 0, SEEK_SET;

while (<$fh>) {

  s{:\s*\K(\d\d\.\d\d\.\d\d\d\d)}{
    my $date = Time::Piece->strptime($1, '%d.%m.%Y');
    $date += $delta;
    $date->dmy('.');
  }eg;

  print;
}

output

20120127.221500.std|MT:63|ST:1.|ON:ABT.N|DRT:U|SEQ:862461707
  80 Bezahlt        : 55.04
  81 Bezahlt_Umsatz : 200
 281 Bezahlt_Zeit   : 22:00:02
 752 Quelle         : CTS OTC
  83 Umsatz_gesamt  : 5639295
 621 VWAP           : 54.984104
  26 Zeit           : 22:00:05

20120127.232408.std|MT:63|ST:1.|ON:ABT.N|DRT:U|SEQ:862507497
  41 Schluss        : 55.02
 120 Schluss_Datum  : 22.11.2012

20120128.011558.std|MT:63|ST:1.|ON:ABT.N|DRT:U|SEQ:862559511
  25 Datum          : 23.11.2012
  26 Zeit           : 01:01:30

Upvotes: 2

ErikR

Reputation: 52039

Here are some pointers for manipulating the file:

open F ,"<$file";
my $content = do{local $/;<F> };
close(F);

my $DATE_RE = qr/((\d\d)\.(\d\d)\.(\d\d\d\d))/;
my %jdate;
# Find all of the dates and convert them to date ordinals
while ($content =~ m/$DATE_RE/g) {
  $jdate{$1} ||= jdate($2, $3, $4);
}

# find the most recent date
my $latest;
for my $d (keys %jdate) {
  if (!$latest || $jdate{$latest} < $jdate{$d}) {
    $latest = $d
  }
}

# for each date $d, determine what to replace it with
my %replacement;
for my $d (keys %jdate) {
  $replacement{$d} = ...your code here...
}

# Replace all of the dates
$content =~ s/$DATE_RE/$replacement{$1}/ge;

# done!

The key is the function jdate(...), which converts a day-month-year triple into an integer. There are a lot of modules on CPAN which can do this, e.g. Time::JulianDay.

To determine the date replacements, you can make use of the inverse_julian_day() function, which converts a Julian day ordinal into a year-month-day triple, i.e. something like:

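# Use $day for the day of month so the loop variable $d (the original date string) is not shadowed
my ($y, $m, $day) = inverse_julian_day( $today_jd - ($jdate{$latest} - $jdate{$d}) );
$replacement{$d} = sprintf("%02d.%02d.%04d", $day, $m, $y);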
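
If installing a CPAN module is not an option, the day-ordinal idea can probably be sketched with the core Time::Local module instead. Note the assumptions: this jdate() returns whole days since 1970-01-01 (UTC) rather than a true Julian day number, which is enough here because only differences between ordinals matter, and jdate_to_string() is a made-up name for the inverse:

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Day ordinal: whole days since 1970-01-01, computed in UTC so DST cannot interfere.
# This is NOT a real Julian day number, only a stand-in with the same spacing.
sub jdate {
    my ($day, $month, $year) = @_;
    return int(timegm(0, 0, 0, $day, $month - 1, $year) / 86400);
}

# Inverse (hypothetical helper): day ordinal back to a 'dd.mm.yyyy' string.
sub jdate_to_string {
    my ($ordinal) = @_;
    my (undef, undef, undef, $day, $month, $year) = gmtime($ordinal * 86400);
    return sprintf('%02d.%02d.%04d', $day, $month + 1, $year + 1900);
}
```

With these, the replacement for a date would be jdate_to_string($today_jd - ($jdate{$latest} - $jdate{$d})), where $today_jd is the ordinal of today's date.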

Upvotes: 1

Jonathan Leffler

Reputation: 753970

There are a lot of Date and Time modules on CPAN.

You're going to need to find one that can add N days to a date easily. It might be sufficient to use mktime and strftime from the POSIX module and strptime from the POSIX::strptime module.

You need to determine N by specifying the 'old date' that you want to become the current date. You calculate the difference between the two dates (the old date and the current date) in days, giving you an integer value N. Then for each date-line, extract the date portion, add N days to it, and rewrite the date portion with the new fake date.
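
As a rough illustration of those steps, here is a minimal sketch that uses only mktime and strftime from the core POSIX module and parses the dd.mm.yyyy strings by hand. The helper names to_epoch, days_between and shift_date are made up for this example, and it assumes every input is a valid dd.mm.yyyy date:

```perl
use strict;
use warnings;
use POSIX qw(mktime strftime);

# Convert 'dd.mm.yyyy' to an epoch value at local noon.
# Noon is used so that a DST change cannot push the result onto another day.
sub to_epoch {
    my ($d, $m, $y) = split /\./, shift;
    return mktime(0, 0, 12, $d, $m - 1, $y - 1900);
}

# N: whole days from $from to $to (both 'dd.mm.yyyy'), rounded to absorb DST hours.
sub days_between {
    my ($from, $to) = @_;
    return 0 + sprintf('%.0f', (to_epoch($to) - to_epoch($from)) / 86400);
}

# Add $n days to a 'dd.mm.yyyy' date and format the result the same way.
sub shift_date {
    my ($date, $n) = @_;
    my @t = localtime(to_epoch($date));
    $t[3] += $n;    # mktime normalises an out-of-range day of month
    $t[8] = -1;     # let mktime work out whether DST applies
    return strftime('%d.%m.%Y', localtime(mktime(@t)));
}
```

For the dates in the question, days_between('28.01.2012', '21.11.2012') gives N = 298, and shift_date('27.01.2012', 298) then yields 20.11.2012.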


You ask about determining the 'oldest' date. The format you show is based on ISO 8601 and that means that strings such as 20120127 can be sorted as strings or as numbers to give date order. You also appear to have a log file; in such files, the first date is usually the oldest and the last date is the newest as they are written sequentially in monotonically increasing time order.
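
A small sketch of that idea, assuming the record header lines always start with a yyyymmdd.hhmmss.std| stamp as in the sample (newest_stamp is a name invented here). Because the stamps sort correctly as strings, maxstr from the core List::Util module is enough:

```perl
use strict;
use warnings;
use List::Util qw(maxstr);

# Return the newest yyyymmdd stamp found in the record header lines,
# or undef if the content has no such lines.
sub newest_stamp {
    my ($content) = @_;
    my @stamps = $content =~ /^\s*(\d{8})\.\d{6}\.std\|/mg;
    return maxstr(@stamps);
}
```

The oldest stamp can be found the same way with minstr. This only looks at the header stamps, not at the dd.mm.yyyy values inside the records, which would still need converting before they can be compared.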

Upvotes: 2
