Deuian

Reputation: 841

DFSORT selecting duplicates when looking for only the first duplicate

The JCL below should select the first occurrence of each duplicated record and keep the output in input order (because of "OPTION COPY"). It should include only records with 'NETWORK' at byte 4 length 7 or '.' at byte 59 length 1, and exclude records with 'TOTAL' or 'GRAND' at byte 3 length 5.

Instead, it outputs every record with 'NETWORK' at byte 4 length 7, including the duplicates.

//SORT EXEC PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG DD SYSOUT=*
//IN DD DISP=SHR,DSN=INPUT.FILE
//T1       DD DSN=&&T1,DISP=(MOD,PASS),SPACE=(TRK,(5,5))
//OUT DD SYSOUT=*
//OUTFIL DD SYSOUT=*
//TOOLIN   DD *
* DROP EVERYTHING WE DON'T WANT
  SELECT FROM(IN)  TO(OUT) ON(1,134,CH) USING(CTL1) FIRST
/*
//CTL1CNTL DD *
  OPTION COPY
  INCLUDE COND=((4,7,CH,EQ,C'NETWORK',OR,
                 59,1,CH,EQ,C'.'),AND,
                 (3,5,CH,NE,C'TOTAL',AND,
                  3,5,CH,NE,C'GRAND'))
/*

If I change the conditions to include only 'NETWORK' at byte 4 length 7, it shows just 1 record, which is what I expect. The input is the same each time.

//CTL1CNTL DD *
  OPTION COPY
  INCLUDE COND=((4,7,CH,EQ,C'NETWORK'))
/*

I can't figure out why adding the other conditions causes duplicates to appear in the output.

Two of the comments have suggested that the issue is with the include conditions.

I have tried the version below. The first SELECT does what I was doing originally, and the second SELECT has no include conditions because they have already been applied by the first SELECT. There are still duplicate records with 'NETWORK' at byte 4 length 7. The rest of each 'NETWORK' record is exactly the same, so there should only be 1.

//TOOLIN   DD *
* DROP EVERYTHING WE DON'T WANT
  SELECT FROM(IN)  TO(T1) ON(1,133,CH) USING(CTL1) FIRST
  SELECT FROM(T1)  TO(OUT) ON(1,133,CH) USING(CTL2) FIRST
/*
//CTL1CNTL DD *
  OPTION COPY
    INCLUDE COND=((4,7,CH,EQ,C'NETWORK',OR,
                   59,1,CH,EQ,C'.'),AND,
                   (3,5,CH,NE,C'TOTAL',AND,
                    3,5,CH,NE,C'GRAND'))
/*
//CTL2CNTL DD *
  OPTION COPY
/*

Upvotes: 0

Views: 12536

Answers (1)

Deuian

Reputation: 841

SELECT with FIRST expects the input to be sorted on the ON fields. ICETOOL does that sort itself before checking for duplicates, as long as you don't specify "OPTION COPY".

I wanted to remove the duplicates and keep the records in input order.

The job below does that by adding a sequence number to each record, which allows the temp file to be sorted back into input order after the duplicates have been removed.

//TOOLIN   DD *
* SELECT REMOVING THE DUPLICATES AND ONLY INCLUDING THE FIELDS WANTED
* TO TEMP DD T1
  SELECT FROM(IN)  TO(T1) ON(1,133,CH) USING(CTL1) FIRST
* COPY FROM TEMP DD T1 TO DD OUT USING CTL2 STATEMENTS
  COPY FROM(T1) TO(OUT) USING(CTL2)
/*
//CTL1CNTL DD *
  INCLUDE COND=((4,7,CH,EQ,C'NETWORK',OR,
                 59,1,CH,EQ,C'.'),AND,
                 (3,5,CH,NE,C'TOTAL',AND,
                  3,5,CH,NE,C'GRAND'))
* ADD SEQUENCE NUMBER 8 NUMBERS LONG TYPE SIGNED ZONED DECIMAL AT THE
* END OF EACH RECORD
  INREC OVERLAY=(134:SEQNUM,8,ZD)
/*
//CTL2CNTL DD *
* SORT ON THE SEQUENCE NUMBER WHICH PUTS THE RECORDS BACK IN INPUT
* ORDER
  SORT FIELDS=(134,8,CH,A)
/*
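
With the job above, the 8-byte sequence number stays at the end of every output record. If that is not wanted, one possible variation is sketched below. It is untested and makes two assumptions that are not part of the original job: the records are fixed-length with the data in bytes 1-133, and the second step uses the ICETOOL SORT operator instead of COPY, since CTL2 carries a SORT statement anyway. CTL1CNTL stays exactly as shown above.

```
//TOOLIN   DD *
* SAME SELECT AS ABOVE: DEDUPLICATE INTO TEMP DD T1
  SELECT FROM(IN)  TO(T1) ON(1,133,CH) USING(CTL1) FIRST
* SORT OPERATOR INSTEAD OF COPY; CTL2 HOLDS THE SORT STATEMENT
  SORT FROM(T1) TO(OUT) USING(CTL2)
/*
//CTL2CNTL DD *
* PUT THE RECORDS BACK IN INPUT ORDER, THEN KEEP ONLY BYTES 1-133
* (ASSUMPTION: FIXED-LENGTH RECORDS, SEQUENCE NUMBER IN 134-141)
  SORT FIELDS=(134,8,CH,A)
  OUTREC BUILD=(1,133)
/*
```

OUTREC is applied after the sort, so the sequence number is still available as the sort key and is only dropped when the records are written to OUT.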

Upvotes: 3
