DFSORT/GROUP/JOINKEYS/ICETOOL

IBM's flagship sort product DFSORT for sorting, merging, copying, data manipulation and reporting. Includes ICETOOL and ICEGENER

DFSORT/GROUP/JOINKEYS/ICETOOL

Postby Grahamm » Tue Oct 19, 2010 10:18 pm

Hi there

Can you help me or suggest ways to tackle another problem I have.

I need to create a single output file from 3 files.
They all are FB 100 bytes records, and I need to retain the sequence on output.

Checking File 1 and File 2, I need to ignore duplicates.
But, if details exist on File 3 in either File 1 or File 2 then I need to ignore it completely.

One problem is that the records on File 1, 2 and 3 are grouped into a Header and numerous data records.
Also, there isn't any guarantee that the number of data records matches between files, ie 2 could be on File A but only one data record on File 2.

I started looking at various ways to achieve this but faioled when it came to removing duplicates with Grouped records.

For all 3 files:-
The header record starts 'K' in col 1; the key is 18 long and starts in col 8
The data record starts 'D' in col 1, some may or may not have a date following

e.g.

File1
K000001AAAAAAAAAAAAAAAAAA etc, etc
D14-11-2001A111 etc, etc
D18-11-2001A111 etc, etc
K000002BBBBBBBBBBBBBBBBBBBB etc, etc
D04-12-2001A788 etc, etc
D Y666 etc, etc
K000003CCCCCCCCCCCCCCCCCC etc, etc
D11-10-2001A345 etc, etc
D08-11-2001A222 etc, etc
K000005EEEEEEEEEEEEEEEEEEEE etc, etc
D11-10-2001A345 etc, etc
D08-11-2001A222 etc, etc


File2
K000002BBBBBBBBBBBBBBBBBBBB etc, etc
D04-12-2001B788 etc, etc
D Y666 etc, etc
K000003CCCCCCCCCCCCCCCCCC etc, etc
D11-10-2001U345 etc, etc
D08-11-2001U222 etc, etc
K000004DDDDDDDDDDDDDDDDDD etc, etc
D11-10-2001H345 etc, etc
D08-11-2001H222 etc, etc
K000005EEEEEEEEEEEEEEEEEEEE etc, etc
D11-10-2001K345 etc, etc
D08-11-2001U222 etc, etc



File3
K000005EEEEEEEEEEEEEEEEEEEE etc, etc
D11-10-2001K345 etc, etc
D08-11-2001U222 etc, etc


Output:-

K000001AAAAAAAAAAAAAAAAAA etc, etc (from FILE 1)
D14-11-2001A111 etc, etc
D18-11-2001A111 etc, etc
K000002BBBBBBBBBBBBBBBBBBBB etc, etc (from FILE 2)
D04-12-2001B788 etc, etc
D Y666 etc, etc
K000003CCCCCCCCCCCCCCCCCC etc, etc (from FILE 2)
D11-10-2001U345 etc, etc
D08-11-2001U222 etc, etc

Any help or pointers will be most appreciated.

:?
Grahamm
 
Posts: 10
Joined: Tue Oct 19, 2010 8:30 pm

Re: DFSORT/GROUP/JOINKEYS/ICETOOL

Postby Frank Yaeger » Tue Oct 19, 2010 11:10 pm

Your description of what you want to do is difficult to follow and doesn't appear to include the "rules" you want to use for all of the possible variations. It isn't clear if your "duplicates" refer to groups of records or individual records or what, or how groups of records in one file relate to groups in another file. Unless, you can do a better job of explaining the rules for what you want to do with a more extensive example of input and output, I can't help you.
Frank Yaeger - DFSORT Development Team (IBM) - yaeger@us.ibm.com
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
=> DFSORT/MVS is on the Web at http://www.ibm.com/storage/dfsort
User avatar
Frank Yaeger
Global moderator
 
Posts: 1082
Joined: Sat Jun 09, 2007 8:44 pm

Re: DFSORT/GROUP/JOINKEYS/ICETOOL

Postby Grahamm » Wed Oct 20, 2010 1:26 am

Ok, I hope this is better explained.
I made some data up last time hoping it would be clearer, this is a better sample.

I have 3 similar length Files all 100 bytes long, but for the sake of my test data supplied lets say they are 50 bytes long.
I need to compare and create a single output file from the 3 input files.

In each file the records are grouped together by a Header (col 1 = K) and one, or many, data records (col 1 = D).

The header record starts letter K in col 1; the key is 18 long and starts in col 14.
The data record starts letter D in col 1.
A group of records is thus identified by K in column 1 and followed by one or more records starting D in column 1.

e.g.
I have highlighted all the K headers below, and Key fields.

File1
KLQ0147LQ0147001500751430004371 176/2345/556436
D15-11-2010[0000000000003825D
KLQ0147LQ0147001500751430004992 176/2345/556436
D15-11-2010[0000000000001194B
KLQ0147LQ0147001500751430007790 176/2345/556436
D20-09-2010[0000000000000302F
D20-10-2010[0000000000000302F
D [0000000000000302F
KLQ0147LQ0147001500751430008566 176/2345/556436
D15-11-2010[0000000000000956D
KLQ0147LQ0147001500751430011146 176/2345/556436
D16-01-2010[0000000000000164H
D16-02-2010[0000000000000164H
D [0000000000000164H
KLQ0147LQ0147001500751430013774 176/2345/556436
D15-11-2010[0000000000001351G
KLQ0147LQ0147001500751430024462 176/2345/556436
D15-11-2010[0000000000000794C
KLQ0147LQ0147001500751430029716 176/2345/556436
D06-11-2010[0000000000002083I
D06-12-2010[0000000000002083I
D [0000000000002083I
KLQ0147LQ0147001503281430003126 176/2345/556436
D15-11-2010[0000000000001122G
KLQ0147LQ0147001503281430011247 176/2345/556436
D15-11-2010[0000000000001024I
KLQ0147LQ0147001503281430011462 176/2345/556436
D15-11-2010[0000000000001582I


File2
KLQ0147LQ0147001500751430007790 177/2345/556436
D20-09-2010[0000000000000302F
D20-10-2010[0000000000000302F
D [0000000000000302F
KLQ0147LQ0147001500751430008566 177/2345/556436
D15-11-2010[0000000000000956D
KLQ0147LQ0147001500751430008661 177/2345/556436
D15-11-2010[0000000000001912G
KLQ0147LQ0147001500751430013774 177/2345/556436
D15-11-2010[0000000000001351G
KLQ0147LQ0147001503281430003126 177/2345/556436
D15-11-2010[0000000000001122G
KLQ0147LQ0147001503281430011247 177/2345/556436
D15-11-2010[0000000000001024I


File3
KLQ0147LQ0147001500751430024462 179/2345/556436
D15-11-2010[0000000000000794C
KLQ0147LQ0147001503281430003126 179/2345/556436
D15-11-2010[0000000000001122G
KLQ0147LQ0147001503281430011247 179/2345/556436
D15-11-2010[0000000000001024I


Output file:-

KLQ0147LQ0147001500751430004371 176/2345/556436 (as only on File 1)
D15-11-2010[0000000000003825D
KLQ0147LQ0147001500751430004992 176/2345/556436 (as only on File 1)
D15-11-2010[0000000000001194B
KLQ0147LQ0147001500751430007790 177/2345/556436 (as on File 1 and File 2, take File 2)
D20-09-2010[0000000000000302F
D20-10-2010[0000000000000302F
D [0000000000000302F
KLQ0147LQ0147001500751430008566 176/2345/556436 (as only on File 1)
D15-11-2010[0000000000000956D
KLQ0147LQ0147001500751430008661 176/2345/556436 (as only on File 2)
D15-11-2010[0000000000001912G
KLQ0147LQ0147001500751430011146 176/2345/556436 (as only on File 1)
D16-01-2010[0000000000000164H
D16-02-2010[0000000000000164H
D [0000000000000164H
KLQ0147LQ0147001500751430013774 177/2345/556436 (as on File 1 and File 2, take File 2)
D15-11-2010[0000000000001351G
KLQ0147LQ0147001500751430029716 176/2345/556436 (as only on File 1)
D06-11-2010[0000000000002083I
D06-12-2010[0000000000002083I
D [0000000000002083I
KLQ0147LQ0147001503281430011247 176/2345/556436 (as only on File 1)
D15-11-2010[0000000000001024I


All the following were Discarded as Keys existed on File 3:

KLQ0147LQ0147001500751430024462 176/2345/556436 (deleted as on File 1 and File 3)
D15-11-2010[0000000000000794C
KLQ0147LQ0147001503281430003126 176/2345/556436 (deleted as on File 1, File 2 and File 3)
D15-11-2010[0000000000001122G
KLQ0147LQ0147001503281430011462 176/2345/556436 (deleted as on File 2 and File 3)
D15-11-2010[0000000000001582I

I do hope that makes more sense?

The only thought I have is that I may have to achieve this via 2 passes?
:?
Grahamm
 
Posts: 10
Joined: Tue Oct 19, 2010 8:30 pm

Re: DFSORT/GROUP/JOINKEYS/ICETOOL

Postby Grahamm » Wed Oct 20, 2010 1:46 am

Just to clarify further,

1) if the Header record exists in File 1 only, output File 1 group of records;
2) if the Header record exists in File 2 only, output File 2 group of records;
3) if the Header record exists in File 1 and File 2, output File 2 group of records and discard File 1 details;
4) do not output any group of records from either File 1 or File 2 if header exists in File 3 as well
Grahamm
 
Posts: 10
Joined: Tue Oct 19, 2010 8:30 pm

Re: DFSORT/GROUP/JOINKEYS/ICETOOL

Postby Frank Yaeger » Wed Oct 20, 2010 2:53 am

Your expected output seems to be inconsistent with your rules.

KLQ0147LQ0147001503281430011247 176/2345/556436 (as only on File 1)

I see this key on all three files.

KLQ0147LQ0147001503281430011462 176/2345/556436 (deleted as on File 2 and File 3)

I see this key only on file1.

Why the inconsistencies?
Frank Yaeger - DFSORT Development Team (IBM) - yaeger@us.ibm.com
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
=> DFSORT/MVS is on the Web at http://www.ibm.com/storage/dfsort
User avatar
Frank Yaeger
Global moderator
 
Posts: 1082
Joined: Sat Jun 09, 2007 8:44 pm

Re: DFSORT/GROUP/JOINKEYS/ICETOOL

Postby Grahamm » Wed Oct 20, 2010 11:58 am

Hi there, sorry you of course are completely correct, my example wasn't completely correct
:oops:

This was typing error on my side so have re-supplied files and expected output again below.

The rules remain the same, and altered test data as below


e.g.
I have highlighted all the K headers below, and Key fields.

File1
KLQ0147LQ0147001500751430004371 176/2345/556436
D15-11-2010[0000000000003825D
KLQ0147LQ0147001500751430004992 176/2345/556436
D15-11-2010[0000000000001194B
KLQ0147LQ0147001500751430007790 176/2345/556436
D20-09-2010[0000000000000302F
D20-10-2010[0000000000000302F
D [0000000000000302F
KLQ0147LQ0147001500751430008566 176/2345/556436
D15-11-2010[0000000000000956D
KLQ0147LQ0147001500751430011146 176/2345/556436
D16-01-2010[0000000000000164H
D16-02-2010[0000000000000164H
D [0000000000000164H
KLQ0147LQ0147001500751430013774 176/2345/556436
D15-11-2010[0000000000001351G
KLQ0147LQ0147001500751430024462 176/2345/556436
D15-11-2010[0000000000000794C
KLQ0147LQ0147001500751430029716 176/2345/556436
D06-11-2010[0000000000002083I
D06-12-2010[0000000000002083I
D [0000000000002083I
KLQ0147LQ0147001503281430003126 176/2345/556436
D15-11-2010[0000000000001122G
KLQ0147LQ0147001503281430011462 176/2345/556436
D15-11-2010[0000000000001582I


File2
KLQ0147LQ0147001500751430007790 177/2345/556436
D20-09-2010[0000000000000302F
D20-10-2010[0000000000000302F
D [0000000000000302F
KLQ0147LQ0147001500751430008661 177/2345/556436
D15-11-2010[0000000000001912G
KLQ0147LQ0147001500751430013774 177/2345/556436
D15-11-2010[0000000000001351G
KLQ0147LQ0147001503281430003126 177/2345/556436
D15-11-2010[0000000000001122G
KLQ0147LQ0147001503281430011247 177/2345/556436
D15-11-2010[0000000000001024I


File3
KLQ0147LQ0147001500751430024462 179/2345/556436
D15-11-2010[0000000000000794C
KLQ0147LQ0147001503281430003126 179/2345/556436
D15-11-2010[0000000000001122G
KLQ0147LQ0147001503281430011247 179/2345/556436
D15-11-2010[0000000000001024I


Output file:-

KLQ0147LQ0147001500751430004371 176/2345/556436 (as only on File 1)
D15-11-2010[0000000000003825D
KLQ0147LQ0147001500751430004992 176/2345/556436 (as only on File 1)
D15-11-2010[0000000000001194B
KLQ0147LQ0147001500751430007790 177/2345/556436 (as on File 1 and File 2, take File 2)
D20-09-2010[0000000000000302F
D20-10-2010[0000000000000302F
D [0000000000000302F
KLQ0147LQ0147001500751430008566 176/2345/556436 (as only on File 1)
D15-11-2010[0000000000000956D
KLQ0147LQ0147001500751430008661 176/2345/556436 (as only on File 2)
D15-11-2010[0000000000001912G
KLQ0147LQ0147001500751430011146 176/2345/556436 (as only on File 1)
D16-01-2010[0000000000000164H
D16-02-2010[0000000000000164H
D [0000000000000164H
KLQ0147LQ0147001500751430013774 177/2345/556436 (as on File 1 and File 2, take File 2)
D15-11-2010[0000000000001351G
KLQ0147LQ0147001500751430029716 176/2345/556436 (as only on File 1)
D06-11-2010[0000000000002083I
D06-12-2010[0000000000002083I
D [0000000000002083I
KLQ0147LQ0147001503281430011462 176/2345/556436 (as only on File 1)
D15-11-2010[0000000000001582I


All the following were Discarded as Keys existed on File 3:

KLQ0147LQ0147001500751430024462 176/2345/556436 (deleted as on File 1 and File 3)
D15-11-2010[0000000000000794C
KLQ0147LQ0147001503281430003126 176/2345/556436 (deleted as on File 1, File 2 and File 3)
D15-11-2010[0000000000001122G
KLQ0147LQ0147001503281430011247 176/2345/556436 (deleted as on File 2 and File 3)
D15-11-2010[0000000000001024I

I do hope that makes more sense?
:)
Grahamm
 
Posts: 10
Joined: Tue Oct 19, 2010 8:30 pm

Re: DFSORT/GROUP/JOINKEYS/ICETOOL

Postby Frank Yaeger » Thu Oct 21, 2010 1:37 am

Hmmm ... I believe there's a way to do this with multiple passes (not sure how many), but I think it's more complex than I have the time for since it involves matches on groups of records in three different files. Perhaps it would be better/more efficient if you wrote your own program to do this.
Frank Yaeger - DFSORT Development Team (IBM) - yaeger@us.ibm.com
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
=> DFSORT/MVS is on the Web at http://www.ibm.com/storage/dfsort
User avatar
Frank Yaeger
Global moderator
 
Posts: 1082
Joined: Sat Jun 09, 2007 8:44 pm

Re: DFSORT/GROUP/JOINKEYS/ICETOOL

Postby Grahamm » Thu Oct 21, 2010 4:21 am

Frank

Thanks for all your time and efforts.

I have been told that, on this occasion, we cannot write a program (due to management decisions - approvals, etc :x ) and that I need to find a JCL solution if possible. :?
I will keep playing with UNPAIRED JOIN's and see what I can manage, but if not then at least we tried :D

Thanks again
Grahamm
 
Posts: 10
Joined: Tue Oct 19, 2010 8:30 pm

Re: DFSORT/GROUP/JOINKEYS/ICETOOL

Postby Frank Yaeger » Thu Oct 21, 2010 6:10 am

Wow, this is more complicated than I thought because you said you can have different numbers of data records in the File1 and File2 group matches (you didn't actually show that in your example). I think I finally worked it out. See if this DFSORT/ICETOOL job does what you want:

Code: Select all
//S1    EXEC  PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG  DD SYSOUT=*
//IN1 DD DSN=...  input file1 (FB/50)
//IN2 DD DSN=...  input file2 (FB/50)
//IN3 DD DSN=...  input file3 (FB/50)
//T1 DD DSN=&&T1,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(MOD,PASS)
//X1 DD DSN=&&X1,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(,PASS)
//X2 DD DSN=&&X2,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(,PASS)
//OUT DD DSN=...  output file (FB/50)
//TOOLIN DD *
COPY FROM(IN2) TO(T1) USING(CTL1)
COPY FROM(IN1) TO(T1) USING(CTL2)
COPY JKFROM TO(X1) USING(CTL3)
SELECT FROM(X1) TO(X2) ON(51,18,CH) FIRST USING(CTL4)
COPY JKFROM TO(OUT) USING(CTL5)
//CTL1CNTL DD *
  INREC IFTHEN=(WHEN=INIT,OVERLAY=(70:C'2')),
    IFTHEN=(WHEN=GROUP,BEGIN=(1,1,CH,EQ,C'K'),
      PUSH=(51:14,18,70:70,1))
//CTL2CNTL DD *
  INREC IFTHEN=(WHEN=INIT,OVERLAY=(70:C'1')),
    IFTHEN=(WHEN=GROUP,BEGIN=(1,1,CH,EQ,C'K'),
      PUSH=(51:14,18,70:70,1))
//CTL3CNTL DD *
  JOINKEYS F1=IN3,FIELDS=(14,18,A),TASKID=A1
  JOINKEYS F2=T1,FIELDS=(51,18,A)
  JOIN UNPAIRED,F2,ONLY
  OPTION COPY
//A1F1CNTL DD *
  INCLUDE COND=(1,1,CH,EQ,C'K')
//CTL4CNTL DD *
  INCLUDE COND=(1,1,CH,EQ,C'K')
//CTL5CNTL DD *
  JOINKEYS F1=X2,FIELDS=(51,18,A,70,1,A),TASKID=A2
  JOINKEYS F2=X1,FIELDS=(51,18,A,70,1,A)
  REFORMAT FIELDS=(F2:1,50)
  OPTION COPY
/*
Frank Yaeger - DFSORT Development Team (IBM) - yaeger@us.ibm.com
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
=> DFSORT/MVS is on the Web at http://www.ibm.com/storage/dfsort
User avatar
Frank Yaeger
Global moderator
 
Posts: 1082
Joined: Sat Jun 09, 2007 8:44 pm

Re: DFSORT/GROUP/JOINKEYS/ICETOOL

Postby Grahamm » Fri Oct 22, 2010 7:27 pm

Frank

Sorry for delay getting back, serious problems to deal with at this end :(

Coded the ICETOOL solution and of course it workked 100% first time :D 8-) Thanks.

I think I need to go through ICETOOL as didn't even look in that area, I did manage to scrape some dodgy code together using SORT that incorporated a couple of UNPAIRED JOINS & a SORT in the middle but nothing on the scale of this.
I fully intend going through ICETOOL now as not 100% certain on what areas of the solution are doing, although got a vague idea :?
Also, it will improve my knowledge of the functions and better awareness of what can be used for any future requests 8-) :lol:

That reminds me, need to submit another question :oops:

Excellent, many thanks again.
Grahamm
 
Posts: 10
Joined: Tue Oct 19, 2010 8:30 pm

Next

Return to DFSORT/ICETOOL/ICEGENER

cron