Need the last entry while using XSUM in SYNCSORT JCL

Vikas picture Vikas · Aug 23, 2013 · Viewed 8.5k times · Source

I have some data like below:

This is not the actual data, but actual data is similar to this. And, the data comes in a file with 2 spaces between each field. No database is involved in input or output. I am using table format just to make it understandable.

 Name    Number   code
+---------------------+
 Albert  122234   xcc
 Robert  565435   rtd
 Robert  776567   iuy
 Robert  452890   yyt
 Stuart  776565   ter

In a file the data would look like..

Albert  122234  xcc
Robert  565435  rtd
Robert  776567  iuy
Robert  452890  yyt
Stuart  776565  ter

Now, I need to eliminate the duplicates using SYNCSORT. I can do this using XSUM, but I would get the following data:

 Name    Number   code
+---------------------+
 Albert  122234   xcc
 Robert  565435   rtd
 Stuart  776565   ter

But I need:

 Name     Number   code
+----------------------+
 Albert   122234   xcc
 Robert   452890   yyt
 Stuart   776565   ter

The last set of data has the last occurance of Robert in the output, while the former set has the first occurance.

So, is there any way to achieve this using XSUM...?

Answer

cschneid picture cschneid · Aug 23, 2013

Two steps, the first adds a sequence number and sorts by the name ascending and by the sequence number descending. This is to get the last record for each name to come first.

The second step sorts just by name with the EQUALS parameter, which says to keep the data in the same sequence as the input file in the case of duplicate values in the SORT fields. Then we use SUM FIELDS=NONE to eliminate duplicates.

//SORT1    EXEC PGM=SORT
//SORTIN   DD  *
ALBERT  122234  XCC
ROBERT  565435  RTD
ROBERT  776567  IUY
ROBERT  452890  YYT
STUART  776565  TER
//SYSIN    DD  *
 INREC FIELDS=(001:001,020,         * REGULAR INPUT DATA
               021:SEQNUM,4,BI)     * ADD A SEQUENCE NUMBER
 SORT  FIELDS=(001,008,CH,A,
               021,004,BI,D)
 OUTREC FIELDS=(001:001,020,        * ORIGINAL INPUT DATA
                080:C' ')           * PADDING
//SORTOUT  DD  DISP=(NEW,PASS),
//             AVGREC=U,
//             LRECL=80,
//             RECFM=FB,
//             SPACE=(80,(1000,100))
//SYSOUT   DD  SYSOUT=*
//*
//SORT2    EXEC PGM=SORT
//SORTIN   DD  DISP=(SHR,PASS),DSN=*.SORT1.SORTOUT
//SYSIN    DD  *
 SORT  FIELDS=(001,008,CH,A),EQUALS
 SUM FIELDS=NONE
//SORTOUT  DD  SYSOUT=*
//SYSOUT   DD  SYSOUT=*
//*