Matching files with different length




Post by pulcinella » Mon Feb 02, 2009 6:29 pm

I don't know whether this question has been asked before.

I need to match two files with different record lengths. The first file has 124 positions; the second has only 2. The first file has duplicates; the second does not. The first file is ordered by position 31; the second is ordered by its first position. I want to generate a third file of 124 positions containing the records of the first file whose key matches a record in the second file.

file 1 (input), 124 positions (ordered by position 31; has duplicates)
------

wwwwwwwwwwwwwwwwWWWWWWWWWWWwww E afjahfj...
YYYYYYYYYYYYYYYYYYYYYYYYYYYYYY E yuyuyuy...
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA N xxxxxxx...
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBB N yyyyyyy...
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC N xxxxxxx...
PPPPPPPPPPPPPPPPPPPPPPPPPPPPPP U uuuuuuu...
IIIIIIIIIIIIIIIIIIIIIIIIIIIIII R wwwwwww...
QQQQQQQQQQQQQQQQQQQQQQQQQQQQQQ T aaaaaaa...
EEEEEEEEEEEEEEEEEEEEEEEEEEEEEE T rrrrrrr...
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF U ddddddd...
GGGGGGGGGGGGGGGGGGGGGGGGGGGGGG V aaaaaaa...
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHH V bbbbbbb...
DDDDDDDDDDDDDDDDDDDDDDDDDDDDDD X sssssss...


file 2 (input), 2 positions (ordered by the first position; no duplicates; position 2 is not relevant to the comparison)
------

NS
RS
VS

file 3 (output), 124 positions
------

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA N xxxxxxx...
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBB N yyyyyyy...
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC N xxxxxxx...
IIIIIIIIIIIIIIIIIIIIIIIIIIIIII R wwwwwww...
GGGGGGGGGGGGGGGGGGGGGGGGGGGGGG V aaaaaaa...
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHH V bbbbbbb...

Thanks
pulcinella
 

Re: Matching files with different length

Post by Frank Yaeger » Mon Feb 02, 2009 11:34 pm

pulcinella wrote: "The first file is ordered by position 31"

Would that be the position of the 'E' in the first record (it appears to be in position 32 so I just want to make sure)?

Is the RECFM of both files FB?

It looks like you want to select records from input file1 that have a character in position 31 matching the character in position 1 of a record in input file2. Is that correct?

What is the maximum number of records you expect in input file2 (10? 100? 1000? 10000? more?)
Frank Yaeger - DFSORT Development Team (IBM) - yaeger@us.ibm.com
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
=> DFSORT/MVS is on the Web at http://www.ibm.com/storage/dfsort

Re: Matching files with different length

Post by pulcinella » Tue Feb 03, 2009 1:16 pm

Hello Frank,

To make the sample easier to read, I separated the fields in file1 with a space. The comparison field is at position 31: if you ignore the added space, the field is at position 31.

I was wrong about the lengths: file1 has a record length of 112 (not 124); file2 has a record length of 2.

Both files are FB. Exactly: I want to select records from input file1 by comparing position 31 of file1 with position 1 of file2, and generate a third file containing the matching file1 records.

I don't know the maximum number of records, but I think it is over a million (it could be 4 million).

Thanks, and excuse me
pulcinella
 

Re: Matching files with different length

Post by Frank Yaeger » Wed Feb 04, 2009 2:06 am

pulcinella wrote: "I don't know the maximum number of records, but I think it is over a million (it could be 4 million)"

I suspect you don't really have millions of records in File2 (the file with 2 characters) which is the file I asked about.

Let me ask it a different way. Does file2 (the file with 2 characters) have duplicates for the character in position 1? That is, can there, for example, be more than one record in file2 with N in position 1 like:

NS
NS
NA
NG

or would N only appear in position 1 in one record (or not at all)?

Obviously, the maximum number of unique characters in position 1 would be 256 (X'00'-X'FF'). So the only question is whether there are duplicates of those unique characters in position 1 of file2.

Re: Matching files with different length

Post by pulcinella » Wed Feb 04, 2009 3:01 am

Hello Frank,

File 1 (112 positions) could have over a million records and has duplicates at position 31 (in my example, the first two records have the letter "E"; the third, fourth and fifth records have the letter "N").

File 2 has 2 positions and no duplicates: the value in position 1 is unique. This file has 20 or 30 records. I compare only the first position. It is not possible to find two records with the same key.

I want to generate a third file of 112 positions (the same layout as file 1). I compare position 31 of file 1 with position 1 of file 2. If they are equal, I write the file 1 record to the output and keep reading file 1 until position 31 changes (that is, until position 31 of file 1 no longer equals position 1 of file 2). When it changes, I read the next record of file 1 and the next record of file 2, and so on.

In my example, the first two records have no match in file 2 (the key "E" is not in file 2). The third, fourth and fifth records do have a match (the key "N" is in file 2).

Thank you
pulcinella
 

Re: Matching files with different length

Post by Frank Yaeger » Wed Feb 04, 2009 3:33 am

Here's a DFSORT/ICETOOL job that will do what you asked for:

//S1   EXEC  PGM=ICETOOL
//TOOLMSG   DD  SYSOUT=*
//DFSMSG    DD  SYSOUT=*
//IN1 DD DSN=...  input file1 (FB/112)
//IN2 DD DSN=...  input file2 (FB/2)
//CTL2CNTL DD DSN=&&C2,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(,PASS)
//OUT DD DSN=...  output file (FB/112)
//TOOLIN DD *
COPY FROM(IN2) USING(CTL1)
COPY FROM(IN1) TO(OUT) USING(CTL2)
/*
//CTL1CNTL DD *
  OUTFIL FNAMES=CTL2CNTL,REMOVECC,
    HEADER1=('  INCLUDE FORMAT=CH,COND=(1,1,NE,1,1,OR,'),
    BUILD=(C'  31,1,EQ,C''',1,1,C''',OR,',80:X),
    TRAILER1=('  1,1,NE,1,1)')
/*
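
To make the mechanism concrete: the first COPY reads IN2 and writes DFSORT control statements to the temporary CTL2CNTL data set, and the second COPY then uses those statements to filter IN1. Assuming IN2 holds the three sample records shown earlier in the thread (NS, RS, VS), the generated CTL2CNTL should contain something like the following, with each line padded to 80 bytes. This is a worked expansion of the job above, not output captured from a real run.

```jcl
  INCLUDE FORMAT=CH,COND=(1,1,NE,1,1,OR,
  31,1,EQ,C'N',OR,
  31,1,EQ,C'R',OR,
  31,1,EQ,C'V',OR,
  1,1,NE,1,1)
```

The 1,1,NE,1,1 conditions compare a field with itself and so are never true. They only open and close the OR chain, so that every line generated from a file2 record can end in ",OR," without special handling for the first or last record.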

Re: Matching files with different length

Post by pulcinella » Sun Feb 08, 2009 11:46 pm

Thank you very much for your help, Frank. It works perfectly.

I have another question. Following the same pattern as the previous example, I would like to compare different positions. That is:

I need to compare two files. The first has 124 positions, has duplicates, and is ordered by positions 17-18; the second has 2 positions, is ordered by positions 1-2, and has no duplicates. I need to compare positions 17-18 of file1 with positions 1-2 of file2. Both files are RECFM=FB. I want to generate a third file of 124 positions containing the file1 records whose positions 17-18 match positions 1-2 of a file2 record. File1 has more than 1 million records; file2 has no more than 99 records.

file1
xxxxxxxxxxxxxxxx10.....................
yyyyyyyyyyyyyyyy10....................
aaaaaaaaaaaaaaaa10...............
bbbbbbbbbbbbbbbb13...............
cccccccccccccccc15.......
dddddddddddddddd15...........
ffffffffffffffff16..............
gggggggggggggggg16.........
hhhhhhhhhhhhhhhh16......

file2
10
11
13
14
16

file3
xxxxxxxxxxxxxxxx10.....................
yyyyyyyyyyyyyyyy10....................
aaaaaaaaaaaaaaaa10...............
bbbbbbbbbbbbbbbb13...............
ffffffffffffffff16..............
gggggggggggggggg16.........
hhhhhhhhhhhhhhhh16......

I don't know whether this would work:

//CTL1CNTL DD *
  OUTFIL FNAMES=CTL2CNTL,REMOVECC,
    HEADER1=('  INCLUDE FORMAT=CH,COND=(1,2,NE,1,2,OR,'),
    BUILD=(C'  17,2,EQ,C''',1,2,C''',OR,',80:X),
    TRAILER1=('  1,2,NE,1,2)')
/*

Thank you very much and excuse me
pulcinella
 

Re: Matching files with different length

Post by pulcinella » Mon Feb 09, 2009 9:01 pm

Hello Frank

I tried the example the way I thought it might work, and it does. Now I would like to know whether the result can be split into two separate files: one output file with the file1 records that match, and another with the file1 records that do not match. That is:

file3 (output 1)
xxxxxxxxxxxxxxxx10.....................
yyyyyyyyyyyyyyyy10....................
aaaaaaaaaaaaaaaa10...............
bbbbbbbbbbbbbbbb13...............
ffffffffffffffff16..............
gggggggggggggggg16.........
hhhhhhhhhhhhhhhh16......

file4 (output 2)
cccccccccccccccc15.......
dddddddddddddddd15...........

Thank you and excuse me
pulcinella
 

Re: Matching files with different length

Post by Frank Yaeger » Tue Feb 10, 2009 12:37 am

pulcinella wrote: "I tried the example the way I thought it might work, and it does"


Please show me the complete job you used and I'll see if I can show you how to modify it.

Re: Matching files with different length

Post by pulcinella » Tue Feb 10, 2009 12:59 am

I used the first solution you gave me:

//S1 EXEC PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG DD SYSOUT=*
//IN1 DD DSN=... input file1 (FB/124)
//IN2 DD DSN=... input file2 (FB/2)
//CTL2CNTL DD DSN=&&C2,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(,PASS)
//OUT DD DSN=... output file (FB/124)
//TOOLIN DD *
COPY FROM(IN2) USING(CTL1)
COPY FROM(IN1) TO(OUT) USING(CTL2)
/*
//CTL1CNTL DD *
  OUTFIL FNAMES=CTL2CNTL,REMOVECC,
    HEADER1=('  INCLUDE FORMAT=CH,COND=(1,1,NE,1,1,OR,'),
    BUILD=(C'  31,1,EQ,C''',1,1,C''',OR,',80:X),
    TRAILER1=('  1,1,NE,1,1)')
/*

and I modified CTL1CNTL to this:

//CTL1CNTL DD *
  OUTFIL FNAMES=CTL2CNTL,REMOVECC,
    HEADER1=('  INCLUDE FORMAT=CH,COND=(1,2,NE,1,2,OR,'),
    BUILD=(C'  17,2,EQ,C''',1,2,C''',OR,',80:X),
    TRAILER1=('  1,2,NE,1,2)')
/*

because I need to compare positions 17-18 of file1 with positions 1-2 of file2. I tested it and it works correctly.

Now I am interested in writing a second file with the records that do not match: one output file with the file1 records that match, and another output file with the file1 records that do not.

Thank you
pulcinella
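
The reply to this last question is not part of this page. One way it might be done, as an untested sketch, is to have the first pass generate OUTFIL statements instead of an INCLUDE statement: an OUTFIL with INCLUDE= writes the matching records to one DD, and a second OUTFIL with SAVE writes every record that no other OUTFIL selected. The DD names OUT1 and OUT2 are assumptions chosen for illustration; the key position (17, length 2) and the FB/124 layout come from the posts above. The comparisons are written in the p,m,CH form rather than relying on FORMAT=CH.

```jcl
//S1   EXEC  PGM=ICETOOL
//TOOLMSG   DD  SYSOUT=*
//DFSMSG    DD  SYSOUT=*
//IN1 DD DSN=...  input file1 (FB/124)
//IN2 DD DSN=...  input file2 (FB/2)
//CTL2CNTL DD DSN=&&C2,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(,PASS)
//* OUT1 and OUT2 are hypothetical DD names used for this sketch
//OUT1 DD DSN=...  matching records (FB/124)
//OUT2 DD DSN=...  non-matching records (FB/124)
//TOOLIN DD *
* Pass 1: build the OUTFIL statements from file2
COPY FROM(IN2) USING(CTL1)
* Pass 2: split file1; the generated OUTFIL statements name the outputs
COPY FROM(IN1) USING(CTL2)
/*
//CTL1CNTL DD *
  OUTFIL FNAMES=CTL2CNTL,REMOVECC,
    HEADER1=('  OUTFIL FNAMES=OUT1,INCLUDE=(1,2,CH,NE,1,2,CH,OR,'),
    BUILD=(C'  17,2,CH,EQ,C''',1,2,C''',OR,',80:X),
    TRAILER1=('  1,2,CH,NE,1,2,CH)',/,
              '  OUTFIL FNAMES=OUT2,SAVE')
/*
```

With the sample file2 (10, 11, 13, 14, 16), the generated CTL2CNTL would hold one OUTFIL whose INCLUDE lists 17,2,CH,EQ,C'10' through 17,2,CH,EQ,C'16', followed by the OUTFIL FNAMES=OUT2,SAVE line, so the "15" records would go to OUT2 and the rest to OUT1.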
 