Reputation: 11

Splitting of Mainframe datasets based on header

I need to split a mainframe dataset into many datasets based on the hour from the time. The File Format is:

<Timestamp> First record
<data>Second record
<data>third record
<data>
<Timestamp>

Here, I need to take the timestamp into consideration and split the dataset on an hourly basis. Say, all the records with a time of 23:00 or later would form one file. Similarly, every other hour from 00 to 22 needs its own file (24 files for 24 hours, or however many are present in the data). It needs to be dynamic.

How can this be achieved easily through JCL?

Or is it possible only through Rexx or Cobol programming?

Including I/O Format:

Input :Input.data

<2016-03-31> <23:41>
data1
data2
data3
<2016-03-31> <22:41>
data1
data2
data3

Output:

All the records with the same hour (the HH of the timestamp) need to be written to one file, and likewise for every hour of the 24-hour clock.

O/P File : Output.Test.H23

<2016-03-31> <23:41>
data1
data2
data3

Output File : Output.test.H22

<2016-03-31> <22:41>
data1
data2
data3

Upvotes: 0

Answers (1)

Bill Woodger

Reputation: 13076

To get multiple output datasets from SORT, you use OUTFIL. You want 24 datasets, so you'd need 24 OUTFILs.

You can select what goes on each OUTFIL using INCLUDE= or OMIT=.

 OUTFIL FNAMES=xxx01,
        INCLUDE=(start,length,CH,EQ,C'01')

24 of those, 24 DDnames in your JCL.

"start" and "length" are the start of the hour in the timestamp, and the length of it (presumably two). Change FNAMES, change the literal in the INCLUDE=.

Depending on the quality of your data, you could include a 25th OUTFIL with SAVE, which gets any records which have not appeared in at least one other OUTFIL.

To get information from a "header" onto all the data relating to that header you use IFTHEN=(WHEN=GROUP.

You need to define the start of the group, and you need to ensure that the start of the group cannot get an accidental "hit" in your data.

OPTION COPY
INREC IFTHEN=(WHEN=GROUP,
               BEGIN=(condition),
               PUSH=(column-to-push-to:15,2))

condition needs to be replaced with a valid logical expression which reliably identifies your header. 1,1,CH,EQ,C'<' may be enough, or you may need to combine further individual conditions with AND to identify your header. You have the <, >, - and : to work with. If that is not enough, you'd need extra tests to identify a year and/or a time. If that is still not enough, use the length of the record (or the presence of blanks after the header on a fixed-length record). If even that is not enough, then you have data which looks like a header, and you are stuffed.
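As a rough sketch of a stricter header test, the individual conditions can be chained with AND. The column numbers below are assumptions: they suppose fixed-length 80-byte records with the header starting in column 1 and laid out exactly like the sample in the question, `<2016-03-31> <23:41>`. The target column 81 is likewise only an illustration.

```jcl
//SYSIN    DD *
* Assumption: fixed-length 80-byte records, header starting in
* column 1 and laid out like the sample: <2016-03-31> <23:41>
  OPTION COPY
  INREC IFTHEN=(WHEN=GROUP,
                BEGIN=(1,1,CH,EQ,C'<',AND,
                       6,1,CH,EQ,C'-',AND,
                       9,1,CH,EQ,C'-',AND,
                       12,1,CH,EQ,C'>',AND,
                       14,1,CH,EQ,C'<',AND,
                       15,2,FS,EQ,NUM,AND,
                       17,1,CH,EQ,C':',AND,
                       20,1,CH,EQ,C'>'),
                PUSH=(81:15,2))
/*
```

The `15,2,FS,EQ,NUM` test checks that the hour is two numeric digits. Drop any tests your data does not need; each one only narrows what counts as a header.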

What about the column-to-push-to? That depends on whether your records are fixed-length or variable-length.

Fixed-length is easy. You make the column-to-push-to the column-number after the last byte of data on your record. This will extend the record. You have to later adjust for that.
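For the fixed-length case, the whole thing might look something like the following. This is a sketch under assumptions, not a tested job: the records are taken to be 80 bytes long, the header is assumed to start in column 1 with the hour in columns 15-16 as in the sample, and the DD names H22 and H23 are made up for illustration. Only two of the 24 OUTFILs are shown.

```jcl
//SYSIN    DD *
* Assumption: fixed-length 80-byte input records, header in column 1
  OPTION COPY
* Copy the hour (columns 15-16 of each header) to columns 81-82 of
* the header and of every record that follows it
  INREC IFTHEN=(WHEN=GROUP,
                BEGIN=(1,1,CH,EQ,C'<'),
                PUSH=(81:15,2))
* Select on the pushed hour, then cut each record back to 80 bytes
  OUTFIL FNAMES=H22,INCLUDE=(81,2,CH,EQ,C'22'),BUILD=(1,80)
  OUTFIL FNAMES=H23,INCLUDE=(81,2,CH,EQ,C'23'),BUILD=(1,80)
/*
```

The BUILD=(1,80) on each OUTFIL is the "later adjust": it removes the two extra bytes, so the output records match the input.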

Variable-length is more complex, because you need to extend the record at the beginning of the data (else you make all your variable-length records of fixed-length, which is pointless).

INREC IFTHEN=(WHEN=INIT,
                BUILD=(1,4,2X,5))

That creates two bytes of space, the 2X, (two blanks) on each record. Then the data, to the end of the record, is copied (the 5 all on its own) to the next available column in the BUILD, which is seven. For variable-length records, it is required to include the Record Descriptor Word on each BUILD, so the 1,4. Once that is done, any change in length is done automatically by SORT.

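INREC IFTHEN=(WHEN=INIT,
                BUILD=(1,4,2X,5)),
      IFTHEN=(WHEN=GROUP,
               BEGIN=(condition),
               PUSH=(5:21,2))

For a header laid out like the sample, the hour is PUSHed from column 21 here rather than 15: the four-byte RDW and the two inserted blanks move the data six columns to the right. For the same reason, the condition has to test the shifted columns (for instance 7,1,CH,EQ,C'<').

In the 24 INCLUDE=, you test the two bytes which have been "pushed" (columns 5 and 6), for the hour.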

On each OUTFIL, you need to return the records to their original content (without the PUSHed value). BUILD=(1,original-length) for fixed-length records, BUILD=(1,4,7) for the variable-length records, where the 7 is "from column seven to the end of the record".
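Putting the variable-length pieces together, a complete step could look roughly like this. Treat it as a sketch under assumptions rather than a tested job: the dataset names are taken from the question, while the DD names, OUTPUT.TEST.OTHER, and the UNIT and SPACE values are invented for illustration. The header is assumed to start in the first data byte and to look like the sample, so after the two blanks are inserted the `<` is in column 7 and the hour in columns 21-22. Only two of the 24 hourly OUTFILs are shown, plus the SAVE one.

```jcl
//SPLIT    EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DISP=SHR,DSN=INPUT.DATA
//* One DD per hour; H22 and H23 shown, the other 22 follow the pattern
//H22      DD DSN=OUTPUT.TEST.H22,DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(CYL,(5,5),RLSE)
//H23      DD DSN=OUTPUT.TEST.H23,DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(CYL,(5,5),RLSE)
//* Hypothetical catch-all for records that match no hourly OUTFIL
//OTHER    DD DSN=OUTPUT.TEST.OTHER,DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(CYL,(5,5),RLSE)
//SYSIN    DD *
  OPTION COPY
* Make room for the hour after the RDW, then push it from each header
  INREC IFTHEN=(WHEN=INIT,
                BUILD=(1,4,2X,5)),
        IFTHEN=(WHEN=GROUP,
                BEGIN=(7,1,CH,EQ,C'<'),
                PUSH=(5:21,2))
* Select on the pushed hour and drop the two extra bytes again
  OUTFIL FNAMES=H22,INCLUDE=(5,2,CH,EQ,C'22'),BUILD=(1,4,7)
  OUTFIL FNAMES=H23,INCLUDE=(5,2,CH,EQ,C'23'),BUILD=(1,4,7)
  OUTFIL FNAMES=OTHER,SAVE,BUILD=(1,4,7)
/*
```

Records that appear before the first header belong to no group, so their pushed columns stay blank and they end up in the SAVE dataset.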

Upvotes: 2
