Praveen kumar

Reputation: 11

File renaming based on file content in UNIX

My file contains two labels, QUARTERDATE and FILENAME, each followed by a value, as in the example below.

I need to rename the file to the form FILENAME_QUARTERDATE.

My file (myfile.txt) looks like this:

        QUARTERDATE:    03/31/14 - 06/29/14
        FILENAME   :    LEAD
field1  field2
34567
20.0    5,678
20.0    5,678
20.0    5,678
20.0    5,678
20.0    5,678

I want the file name to be LEAD_201402.txt. The date range in the file is for quarter 2, so I have given it as 201402.

Thanks in advance for the replies.

Upvotes: 0

Views: 99

Answers (2)

Jonathan Leffler

Reputation: 753990

How is a quarter defined?

As noted in comments to the main question, the problem is as yet ill-defined.

What data would appear in the previous quarter's QUARTERDATE line? Could Q1 ever start with a date in December of the previous year? Could the end date of Q2 ever be in July (or Q1 in April, or Q3 in October, or Q4 in January)? Since the first date of Q2 is in March, these alternatives need to be understood. Could a quarter ever start early and end late simultaneously (a 14 week quarter)?

To which the response was:

QUARTERDATE of Q2 will start as 1st Monday of April and end as last Sunday of June.

Which triggered a counter-response:

2014-03-31 is a Monday, but hardly a Monday in April. What this mainly means is that your definition of a quarter is, as yet, not clear. For example, next year, 2015-03-30 is a Monday, but 'the first Monday in April' is 2015-04-06. The last Sunday in March 2015 is 2015-03-29. So which quarter does the week (Mon) 2015-03-30 to (Sun) 2015-04-05 belong to, and why? If you don't know (both how and why), we can't help you reliably.
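
The weekday claims in that comment can be checked without relying on a non-portable date command. What follows is only a minimal sketch: the function name dow and its interface are assumptions made for illustration, not part of the original discussion. It wraps awk and uses Sakamoto's day-of-week method for Gregorian dates.

```shell
# dow YEAR MONTH DAY: print the abbreviated weekday name of a Gregorian date.
# The function name and interface are illustrative assumptions.
dow() {
    awk -v y="$1" -v m="$2" -v d="$3" 'BEGIN {
        y += 0; m += 0; d += 0                    # force numeric values ("03" becomes 3)
        split("0 3 2 5 0 3 5 1 4 6 2 4", t, " ")  # per-month offsets (Sakamoto)
        split("Sun Mon Tue Wed Thu Fri Sat", name, " ")
        if (m < 3) y -= 1                         # Jan and Feb count with the previous year
        w = (y + int(y/4) - int(y/100) + int(y/400) + t[m] + d) % 7
        print name[w + 1]                         # w is 0 for Sunday
    }'
}

dow 2014 3 31    # Mon
dow 2015 3 30    # Mon
dow 2015 4 6     # Mon
dow 2015 3 29    # Sun
```

The same function can be used to sanity-check any proposed quarter boundary before trusting it.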

Plausible working hypothesis

  • The lessons of Y2K have been forgotten already (why else are two digits used for the year, dammit!).
  • Quarters run for an integral number of weeks.
  • Quarters start on a Monday and end on a Sunday.
  • Quarters remain aligned with the calendar quarters, rather than drifting around the year. (There are 13 weeks in 91 days, and 4 such quarters in a year, but there's a single extra day in an ordinary year and two extra in a leap year, which means that occasionally you will get a 14-week quarter, to ensure things stay aligned.)
  • The date for the first date in a quarter will be near 1st January, 1st April, 1st July or 1st October, but the month might be December, March (as in the question), June or September.
  • The date for the last date in a quarter will be near 31st March, 30th June, 30th September, 31st December, but the month might be April, July, October or January.
  • By adding 1 modulo 12 (values in the range 1..12, not 0..11) to the start month, you should end up with a month firmly in the calendar quarter.
  • By subtracting 1 modulo 12 (values in the range 1..12 again) from the end month, you should end up with a month firmly in the calendar quarter.
  • If the data is valid, the 'start + 1' and 'end - 1' months should be in the same quarter.
  • The start year might be off-by-one if the start date is in December (but that indicates Q1 of the next year).
  • The end year might be off-by-one if the end date is in January (but that indicates Q4 of the prior year).
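
The 'start + 1' and 'end - 1' rules in the list above can be sketched as a small consistency check. This is an illustration under the stated hypothesis only. The function name quarter_from_months is hypothetical, and it takes just the two month numbers rather than full dates.

```shell
# quarter_from_months START_MONTH END_MONTH
# Prints the quarter (1-4) implied by the working hypothesis, or prints
# "inconsistent" and fails if the two adjusted months disagree.
# Name and interface are illustrative assumptions.
quarter_from_months() {
    awk -v s="$1" -v e="$2" 'BEGIN {
        s += 0; e += 0                 # force numeric values
        s1 = (s % 12) + 1              # start month + 1, wrapping 12 -> 1
        e1 = ((e + 10) % 12) + 1       # end month - 1, wrapping 1 -> 12
        qs = int((s1 - 1) / 3) + 1     # calendar quarter of each adjusted month
        qe = int((e1 - 1) / 3) + 1
        if (qs != qe) { print "inconsistent"; exit 1 }
        print qs
    }'
}

quarter_from_months 3 6     # 2 (the dates in the question)
quarter_from_months 12 3    # 1 (quarter starting in late December)
quarter_from_months 9 1     # 4 (quarter ending in early January)
```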

More resilient code

Despite the uncertainty described above, it is possible to write code that detects the quarter regardless of the idiosyncrasies of the quarter start and end dates. This code borrows a little from Barmar's answer, but the algorithm is more resilient to the vagaries of the calendar and the quarter start and end dates.

#!/bin/sh

awk '/QUARTERDATE/ {
         split($2, b, "/")
         split($4, e, "/")
         if      (b[1] == 12) { q = 1; y = e[3] }
         else if (e[1] ==  1) { q = 4; y = b[3] }
         else
         {
             if (b[3] != e[3]) {
                 print "Year mismatch (" $2 " vs " $4 ") in file " FILENAME
                 exit 1
             }
             m = int((b[1] + e[1]) / 2)
             q = int((m - 1) / 3) + 1
             y = e[3]
         }
         quarter = sprintf("%.4d%.2d", y + 2000, q)
     }
     /FILENAME/ {
         print $3 "_" quarter
         # exit
     }' "$@"

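The calculation for m conceptually adds the start month plus one to the end month minus one and then does integer division by two; the +1 and -1 cancel, so the code simply sums the two month numbers. With the extreme cases (a December start or a January end) already taken care of, this always yields a month number that is in the correct quarter.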
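
To see the arithmetic at work, here is a small sketch that applies the same midpoint calculation to the four possible start/end month combinations for Q2 (on-time or early start, on-time or late end). The helper name mid_quarter is an assumption for illustration. It takes only month numbers and does not handle the December-start and January-end cases that the script treats separately.

```shell
# mid_quarter START_MONTH END_MONTH: quarter derived from the midpoint month.
# Hypothetical helper; the December/January special cases are not handled here.
mid_quarter() {
    awk -v b="$1" -v e="$2" 'BEGIN {
        m = int(((b + 1) + (e - 1)) / 2)   # same value as int((b + e) / 2)
        print int((m - 1) / 3) + 1         # calendar quarter containing month m
    }'
}

# Every plausible month pair for Q2 yields quarter=2.
while read -r b e; do
    echo "start=$b end=$e quarter=$(mid_quarter "$b" "$e")"
done <<EOF
3 6
3 7
4 6
4 7
EOF
```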

The exit in the FILENAME block is commented out to make testing easier: with it disabled, one run can process several records, as in the sample data below. When processing each file separately, as in Barmar's example, that exit is an important optimization. Note that the error message gives an empty file name if the input comes from standard input. (Offhand, I'm not sure how to print the error message to standard error rather than standard output, other than by a platform-specific technique such as print "message" > "/dev/stderr" or print "message" > "/dev/fd/2".)
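
One commonly used idiom that avoids the special file names is to pipe the message to a shell command that redirects its own output to standard error. The sketch below is an illustration only: the wrapper name report_mismatch is hypothetical, and unlike the script above it flags every year mismatch, including the legitimate December and January cases.

```shell
# report_mismatch [FILE...]: write a year-mismatch message to standard error.
# Hypothetical wrapper; it only demonstrates the stderr idiom.
report_mismatch() {
    awk '/QUARTERDATE/ {
             split($2, b, "/")
             split($4, e, "/")
             if (b[3] != e[3]) {
                 # The pipe runs "cat" with its stdout redirected to stderr.
                 print "Year mismatch (" $2 " vs " $4 ")" | "cat 1>&2"
                 status = 1
             }
         }
         END { exit status + 0 }' "$@"
}
```

For example, printf 'QUARTERDATE: 03/31/14 - 06/29/15\n' | report_mismatch writes nothing to standard output and returns a non-zero exit status, with the message appearing on standard error.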

Given this sample input data (semi-plausible start and end dates for 6 quarters from 2014Q1 through 2015Q2):

        QUARTERDATE:    12/30/13 - 03/30/14
        FILENAME   :    LEAD
        QUARTERDATE:    03/31/14 - 06/29/14
        FILENAME   :    LEAD
        QUARTERDATE:    06/30/14 - 09/28/14
        FILENAME   :    LEAD
        QUARTERDATE:    09/29/14 - 12/28/14
        FILENAME   :    LEAD
        QUARTERDATE:    12/29/14 - 03/29/15
        FILENAME   :    LEAD
        QUARTERDATE:    03/30/15 - 06/28/15
        FILENAME   :    LEAD

The output from this script is:

LEAD_201401
LEAD_201402
LEAD_201403
LEAD_201404
LEAD_201501
LEAD_201502

You can juggle the start and end dates of the quarters within reason and you should still get the required output. But always be wary of calendrical calculations; they are almost invariably harder than you expect.

Upvotes: 0

Barmar

Reputation: 781088

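# $file is the path of one input file; the quarter is derived from the end date ($4).
newname=$(awk '/QUARTERDATE/ { split($4, d, "/");   # d[1] = month, d[3] = two-digit year
                               quarter=sprintf("%04d%02d", 2000+d[3], int((d[1]-1)/3)+1); }
               /FILENAME/ { fn = $3; print fn "_" quarter; exit; }' "$file")
# Append .txt to match the name requested in the question (LEAD_201402.txt).
mv "$file" "$newname.txt"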
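
The snippet above handles a single file held in $file. A minimal sketch of wrapping it in a loop follows. The function name rename_by_content, the fixed .txt extension, and the decision to skip files lacking either label are all assumptions for illustration, not part of the original answer.

```shell
# rename_by_content FILE...: rename each file to FILENAME_YYYYQQ.txt based on
# its QUARTERDATE and FILENAME lines. Hypothetical wrapper around the awk
# expression above; the quarter is taken from the end date only.
rename_by_content() {
    for file in "$@"; do
        newname=$(awk '/QUARTERDATE/ { split($4, d, "/")
                                       quarter = sprintf("%04d%02d", 2000 + d[3], int((d[1] - 1) / 3) + 1) }
                       /FILENAME/ && quarter != "" { print $3 "_" quarter; exit }' "$file")
        if [ -z "$newname" ]; then
            # Leave the file alone if either label is missing.
            echo "skipping $file: QUARTERDATE or FILENAME not found" >&2
            continue
        fi
        dir=$(dirname "$file")              # keep the file in its own directory
        mv "$file" "$dir/$newname.txt"      # assumed extension, as in the question
    done
}
```

It would be called as rename_by_content myfile.txt, or with several file names at once. Note that two files with the same FILENAME and quarter would map to the same target name, which this sketch does not guard against.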

Upvotes: 1
