Sorting and removing duplicates in a single sort step




Postby drucky » Tue May 10, 2011 5:05 am

Hi,

Please see my requirements below:

1. I need to sort a file based on a key of 5 byte length.
2. In case there are records with duplicate keys I need to retain the one which has the most recent date-time stamp.

For example

11111 20110509 01:14:56
11111 20110509 02:00:00
22222 20110508 01:30:30
22222 20110509 07:15:00
33333 20110509 08:00:00
44444 20110509 09:00:00

Output File should be

11111 20110509 02:00:00
22222 20110509 07:15:00
33333 20110509 08:00:00
44444 20110509 09:00:00

The date and time fields are actually packed decimal fields, but in the example above I have shown them as readable text for illustration purposes.

Input file record length: 1000
Key position - 1; length 5 bytes
Date position - 6; packed decimal 9(09), 5 bytes
Time position - 11; packed decimal 9(09), 5 bytes

Can someone suggest how to achieve this using a single SORT card? I know we can achieve this using ICETOOL, but I would prefer to do it using just DFSORT.

Please let me know if you need any clarification.

Thanks,
Drucky

Re: Sorting and removing duplicates in a single sort step

Postby Frank Yaeger » Tue May 10, 2011 6:45 am

drucky wrote:Can someone suggest how to achieve this using a single SORT card? I know we can achieve this using ICETOOL, but I would prefer to do it using just DFSORT.


Why? To make life difficult? Since you want to SUM on one field, but need to SORT on several fields, you can't do this using a single SORT card.

You can do it using a single ICETOOL SELECT pass as follows:

//S1 EXEC PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG DD SYSOUT=*
//IN DD DSN=... input file
//OUT DD DSN=...  output file
//TOOLIN DD *
SELECT FROM(IN) TO(OUT) ON(1,5,CH) FIRST USING(CTL1)
/*
//CTL1CNTL DD *
  SORT FIELDS=(1,5,CH,A,6,5,PD,D,11,5,PD,D)
/*


Perhaps you weren't aware that ICETOOL's SELECT could do it in a single pass?

If you want to do this with DFSORT instead of DFSORT's ICETOOL for some reason, use two steps - first sort the records on the three fields to create a temp file, then SORT and SUM on the temp file. This will accomplish in two passes what you can do with ICETOOL in one pass. After all, why be efficient?
Frank Yaeger - DFSORT Development Team (IBM) - yaeger@us.ibm.com
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
=> DFSORT/MVS is on the Web at http://www.ibm.com/storage/dfsort
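
The two-step approach Frank describes could be sketched roughly as follows. This is an illustration, not part of his post: the step names, the temporary data set name &&TEMP and the SPACE values are assumptions to adjust for your site. It relies on DFSORT keeping the first record of each set of duplicates when SUM FIELDS=NONE is used with EQUALS in effect.

//* Step 1: order by key, newest date and time first within each key
//STEP1   EXEC PGM=SORT
//SYSOUT  DD SYSOUT=*
//SORTIN  DD DSN=... input file
//SORTOUT DD DSN=&&TEMP,DISP=(,PASS),UNIT=SYSDA,SPACE=(CYL,(5,5))
//SYSIN   DD *
  SORT FIELDS=(1,5,CH,A,6,5,PD,D,11,5,PD,D)
/*
//* Step 2: sort on the key alone and drop duplicates.
//* EQUALS preserves the step 1 order, so the newest record is kept.
//STEP2   EXEC PGM=SORT
//SYSOUT  DD SYSOUT=*
//SORTIN  DD DSN=&&TEMP,DISP=(OLD,DELETE)
//SORTOUT DD DSN=...  output file
//SYSIN   DD *
  SORT FIELDS=(1,5,CH,A),EQUALS
  SUM FIELDS=NONE
/*

Without EQUALS in the second step, which of the duplicate records survives is not predictable, so the most recent one could be lost.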

Re: Sorting and removing duplicates in a single sort step

Postby drucky » Tue May 10, 2011 10:20 am

Thank you Frank,

Unfortunately, ICETOOL is not recommended in our organisation because of maintainability concerns. After all, not everyone is aware of its entire functionality, including me. I suppose I'll have to go with two SORT steps instead.

Thanks again for your suggestion.

Re: Sorting and removing duplicates in a single sort step

Postby archnXSP » Tue May 10, 2011 4:52 pm

Why not use SUM FIELDS=NONE in the second line of your sort card..?

But then again it won't work if you are using more than one field to SORT...




Regards,
Sam
Somewhere this world hides its source code...

Re: Sorting and removing duplicates in a single sort step

Postby skolusu » Tue May 10, 2011 9:07 pm

archnXSP wrote:Why not use SUM FIELDS=NONE in the second line of your sort card..?

But then again it won't work if you are using more than one field to SORT...


archnxsp,

Are you contradicting your first statement with your second statement? Did you try adding SUM FIELDS=NONE and check whether you got the desired results? What exactly did you want to convey in the above post?

drucky,

As Frank mentioned, ICETOOL's SELECT is the ideal choice for such requests. Here is a customized solution for your input (RECFM=FB, LRECL=1000) using SORT.

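//STEP0100 EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=Your FB input 1000 byte file,DISP=SHR
//SORTOUT  DD SYSOUT=*
//SYSIN    DD *
* Key ascending, then date and time descending (newest first)
  SORT FIELDS=(1,5,CH,A,6,5,PD,D,11,5,PD,D),EQUALS
* Number records within each key group in temp columns 1001-1008
  OUTREC IFTHEN=(WHEN=GROUP,KEYBEGIN=(1,5),PUSH=(1001:SEQ=8))
* Keep record 1 of each group; BUILD drops the temp columns
  OUTFIL BUILD=(1,1000),INCLUDE=(1001,8,ZD,EQ,1)
//*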
Kolusu - DFSORT Development Team (IBM)
DFSORT is on the Web at:
www.ibm.com/storage/dfsort

Re: Sorting and removing duplicates in a single sort step

Postby Frank Yaeger » Wed May 11, 2011 12:59 am

drucky wrote:Unfortunately, ICETOOL is not recommended in our organisation because of maintainability concerns. After all, not everyone is aware of its entire functionality, including me. I suppose I'll have to go with two SORT steps instead.


.soapbox on
This is a ridiculous statement. Do you think that anyone in your organization is actually aware of the entire functionality of DFSORT (including you)? If you are, then you must have done a lot of reading. ICETOOL, like DFSORT, is fully documented, so anyone can become aware of the functions of either one equally. ICETOOL has been part of DFSORT since 1991 - it's not exactly something new. It's just as easy to "maintain" ICETOOL by reading its documentation as it is to maintain DFSORT by reading its documentation. I just don't understand organizations that set crazy restrictions like this based on "laziness". Your organization is paying for the functionality in ICETOOL, so why not spend some time to take advantage of what you're paying for? Look at that SELECT operator - does it really seem complicated to you?
.soapbox off

