Hello,
A few questions before we look at the solution:
a. Do you want the records left in their original order when the ID is added, rather than sorted? The output you've shown retains the original order of the records.
b. Can a group key appear again further down the file? If so, how do you want that handled? For example:
630774221
630774221
963495850
963495850
963495850
345695561
678609548
678609548
678609548
630774221 --> here this appears again
630774221 --> here this appears again
918367402
279702180
c. You've mentioned that the input can contain billions of records, but the unique identifiers you've shown are only 4 bytes long, which allows a maximum of 9999 unique identifiers.
The solution is fairly straightforward as long as the complexities mentioned above do not apply: you need to group the records and PUSH an ID onto each record in the group. DFSORT allows a zoned decimal ID of up to 15 bytes to be pushed, so the maximum value is 999,999,999,999,999:
//SORTIN DD *
630774221
630774221
963495850
963495850
963495850
345695561
678609548
678609548
678609548
918367402
279702180
/*
//SORTOUT DD SYSOUT=*
//SYSIN DD *
  SORT FIELDS=COPY
  INREC IFTHEN=(WHEN=GROUP,KEYBEGIN=(1,9),
                PUSH=(11:ID=15))
/*
Output:
630774221 000000000000001
630774221 000000000000001
963495850 000000000000002
963495850 000000000000002
963495850 000000000000002
345695561 000000000000003
678609548 000000000000004
678609548 000000000000004
678609548 000000000000004
918367402 000000000000005
279702180 000000000000006
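
To make questions (b) and (c) concrete, here is a rough Python sketch of what the KEYBEGIN/PUSH logic above does. It is not DFSORT itself, only an emulation for illustration: the function name push_group_ids and its parameters are made up for this example, and the ValueError on overflow is an assumption of the sketch, not a statement about how DFSORT behaves when the ID field fills up. The point it shows is that a new group starts whenever the key differs from the previous record's key, so a key that reappears later gets a new ID rather than its old one.

```python
def push_group_ids(records, key_start=1, key_len=9, id_col=11, id_width=15):
    """Roughly emulate KEYBEGIN=(key_start,key_len) with PUSH=(id_col:ID=id_width).

    Hypothetical helper for illustration only. Columns are 1-based, as in
    DFSORT control statements.
    """
    out = []
    prev_key = None
    group_id = 0
    for rec in records:
        key = rec[key_start - 1:key_start - 1 + key_len]
        # A change in key value starts a new group, even if the key was seen before.
        if key != prev_key:
            group_id += 1
            prev_key = key
        # Assumption of this sketch: refuse IDs that do not fit in id_width digits.
        if group_id >= 10 ** id_width:
            raise ValueError("group ID does not fit in %d digits" % id_width)
        # Overlay the zero-padded ID at id_col, padding short records with blanks.
        base = rec.ljust(id_col - 1 + id_width)
        ident = str(group_id).zfill(id_width)
        out.append(base[:id_col - 1] + ident + base[id_col - 1 + id_width:])
    return out


if __name__ == "__main__":
    # Input from question (b), where 630774221 appears a second time.
    sample = [
        "630774221", "630774221",
        "963495850", "963495850", "963495850",
        "345695561",
        "678609548", "678609548", "678609548",
        "630774221", "630774221",
        "918367402",
        "279702180",
    ]
    for line in push_group_ids(sample):
        print(line)
```

With the input from question (b), the second run of 630774221 comes out with ID 000000000000005 in this sketch, not 000000000000001. If you need the same key to always carry the same ID, the records would have to be brought together first (for example by sorting on the key), which is why question (a) matters.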