IBM Mainframe Forum

by **Geri** » Mon Nov 02, 2009 6:48 pm

I have to merge 2 datasets (same record layout). Each record has a key (10 Bytes) and the rest is data.
The first dataset holds some sort of base data. The second dataset delivers new data for this.

Now I want a new output that holds all records from both files, except where a key is in both files. In those cases only the record from the second dataset should be written to the output.

This gives some sort of "add new" and "update existing" records logic.

Is there any standard tool that can manage this, or do I have to write a program?

I found the merge function of DFSORT, but it seems that it can only add all records of both files to the output.

Thanks for your help.
Geri

by **Frank Yaeger** » Mon Nov 02, 2009 10:04 pm

Please show an example of the records in each input file (relevant fields only) and what you expect for output. Explain the "rules" for getting from input to output. Give the starting position, length and format of each relevant field. Give the RECFM and LRECL of the input files. If file1 can have duplicates within it, show that in your example. If file2 can have duplicates within it, show that in your example.

Also, indicate which Sort product you're using (DFSORT, Syncsort, CA-Sort).

by **Geri** » Tue Nov 03, 2009 2:10 am

Hi!

>Please show an example of the records in each input file (relevant fields only) and what you expect for output.

The record structure is fairly simple and all the same for input and output files.
Key: 10 Byte Char
Data: 367 Byte Char

In detail, each record holds calendar data: one data character for each day of a year.

e.g.


US    2008YNYNYNYN...
US    2009YNNNNNYN...
US/NYC2008NYNNYNNN...
US/NYC2009YYNNNNNY...

>Explain the "rules" for getting from input to output.

file1 is the base file with all valid calendar-records. Each key is unique.
file2 is an update-file for the calendar-data.

The output should hold the following records:
* all records which are only in file1
* all records which are only in file2
* the file2 record for every key that also exists in file1

This means file2 can bring in new keys and should update existing key records.
Which comes down to the question: how can I handle duplicates and tell the utility (if there is one) that if a duplicate occurs I always want the record from file2?

Or is the other way round better? Take all records from file2, add file1 and skip duplicates?

>Give the starting position, length and format of each
>relevant field.

start: position 1
length: 10 byte key
format: char

>Give the RECFM and LRECL of the input files.

Fixed blocked, LRECL 377 bytes
files are sorted ascending

>If file1 can have duplicates within it, show that in your example.
>If file2 can have duplicates within it, show that in your example.

no duplicates
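
If a program did turn out to be necessary, the rules above come down to a classic two-way merge. The following is only a minimal Python sketch, assuming both inputs are already sorted ascending on the 10-byte key and neither contains duplicates. The function name `merge_prefer_update` and the sample records are hypothetical, and Python compares strings by code point rather than by the EBCDIC collating sequence a mainframe sort would use.

```python
from typing import Iterable, Iterator

KEY_LEN = 10  # the key occupies positions 1-10 of each record


def merge_prefer_update(base: Iterable[str], update: Iterable[str]) -> Iterator[str]:
    """Merge two key-sorted, duplicate-free record streams.

    When a key is present in both streams, only the record from
    `update` (file2) is emitted; the `base` (file1) record is dropped.
    """
    base_it, update_it = iter(base), iter(update)
    b = next(base_it, None)
    u = next(update_it, None)
    while b is not None and u is not None:
        b_key, u_key = b[:KEY_LEN], u[:KEY_LEN]
        if b_key < u_key:        # key only in file1 so far: keep it
            yield b
            b = next(base_it, None)
        elif b_key > u_key:      # key only in file2 so far: add it
            yield u
            u = next(update_it, None)
        else:                    # key in both: file2 replaces file1
            yield u
            b = next(base_it, None)
            u = next(update_it, None)
    # One stream is exhausted; copy whatever remains of the other.
    while b is not None:
        yield b
        b = next(base_it, None)
    while u is not None:
        yield u
        u = next(update_it, None)


# Hypothetical sample data, shortened to two data characters per record.
file1 = ["US    2008YN", "US    2009YN", "US/NYC2008NY"]
file2 = ["US    2009NN", "US/NYC2009YY"]
merged = list(merge_prefer_update(file1, file2))
# merged == ["US    2008YN", "US    2009NN", "US/NYC2008NY", "US/NYC2009YY"]
```

Because each input is read exactly once and nothing is held in memory beyond the current pair of records, this approach only works when the sort order of the files matches the comparison used in the code.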

>Also, indicate which Sort product you're using (DFSORT, Syncsort, CA-Sort).

I intended to use DFSORT. But actually I am looking for any utility that can handle this. If there is a "standard" solution for this I don't need to write a new program just to reinvent the wheel.

I hope this was understandable.

Thanks in advance!
Geri

by **Frank Yaeger** » Tue Nov 03, 2009 4:20 am

Based on what you've said, you can use a DFSORT/ICETOOL job like the following to do what you want. Note that the files are concatenated in the order input file2, then input file1.


//S1      EXEC PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG  DD SYSOUT=*
//CON     DD DSN=...  input file2 (FB/377)
//        DD DSN=...  input file1 (FB/377)
//OUT     DD SYSOUT=*
//TOOLIN  DD *
SELECT FROM(CON) TO(OUT) ON(1,10,CH) FIRST
/*
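
For readers who want to see why the concatenation order matters, here is a rough Python model of what the job above is expected to do. It is a sketch under stated assumptions, not a description of DFSORT internals: it assumes the records are sorted on the 10-byte key with equal-key records kept in their input order, and that FIRST keeps only the first record for each key. The function name `select_first` and the sample records are hypothetical.

```python
from typing import List

KEY_LENGTH = 10  # ON(1,10,CH): key in positions 1-10, character format


def select_first(concatenated: List[str]) -> List[str]:
    """Model of 'sort by key, keep the first record of each key'."""
    # sorted() is stable, so records with equal keys stay in input order.
    ordered = sorted(concatenated, key=lambda rec: rec[:KEY_LENGTH])
    kept: List[str] = []
    last_key = None
    for rec in ordered:
        key = rec[:KEY_LENGTH]
        if key != last_key:      # first record seen for this key
            kept.append(rec)
            last_key = key
    return kept


# Hypothetical sample data, shortened to two data characters per record.
file1 = ["US    2008YN", "US    2009YN", "US/NYC2008NY"]  # base
file2 = ["US    2009NN", "US/NYC2009YY"]                  # updates

# file2 first, then file1: the update record wins for a shared key.
result = select_first(file2 + file1)
# result == ["US    2008YN", "US    2009NN", "US/NYC2008NY", "US/NYC2009YY"]
```

Swapping the order to `file1 + file2` in this model would keep the base record `US    2009YN` instead, which is the opposite of what was asked for.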

If you're not familiar with DFSORT and DFSORT's ICETOOL, I'd suggest reading through "z/OS DFSORT: Getting Started". It's an excellent tutorial, with lots of examples, that will show you how to use DFSORT, DFSORT's ICETOOL and DFSORT Symbols. You can access it online, along with all of the other DFSORT books, from:

http://www.ibm.com/support/docview.wss? ... g3T7000080

by **Geri** » Tue Nov 03, 2009 1:15 pm

Many thanks for your help! This works great!
I didn't realize the SELECT function is the one to do the job.

Greetings from Austria
Geri

Merge of Datasets
