I was going through my old emails when I came across an issue (more aptly, an observation) that we ran into while tweaking the DFSORT installation defaults for optimum memory usage during the batch window, for a particular program using an internal sort at one of the sites; this was a year or so back. I'm posting it here out of curiosity; it might just help straighten out a few kinks in my understanding of DFSORT.
We had a batch COBOL program that took a KSDS as input and performed an internal sort on a set of fields from the file; based on the sorted data, it segregated the records and then built a report. The KSDS key was the first 22 characters. The record length was around 300, with all records fixed-length. The o/p report had a record length of 150 (the o/p being a sequential file). The sort was on the KSDS key plus a few other fields at positions after column 100 (I don't remember the exact specs, my apologies). The report could have duplicate keys; by key, I mean the entire chunk of data on which the SORT was performed, for example:
SORT FIELDS=(1,36,CH,A)
The number of records in the i/p file would usually range from 1 million to 6 million on peak days.
Coming to the observation now: we noticed on several occasions that whenever the i/p had between 1 million and 2.5 million records, the sorted records appeared in the same order no matter how many times the program was rerun (reruns happened when the business needed the reports regenerated). When the i/p record count was larger, the records were still correctly sorted on the SORT key, but records sharing the same key (that is, differing only in the remaining columns) were arranged differently on every run. Please see the sample o/p below, which assumes the sort was run on the first 6 columns. This is only a sample scenario, as I do not have the actual production file with me; the keys in the i/p have been duplicated to illustrate the rearrangement:
I/P
123456FGDABCE
123456ABCDEFG
123456EFGDABC
123456CDEFGAB
Outputs:
----------
RUN1
123456ABCDEFG
123456CDEFGAB
123456EFGDABC
123456FGDABCE
RUN2
123456CDEFGAB
123456ABCDEFG
123456EFGDABC
123456FGDABCE
RUN3
123456ABCDEFG
123456EFGDABC
123456CDEFGAB
123456FGDABCE
Notice how the arrangement of the data after the sort key changes from run to run; this happened only when the number of records in the i/p file was fairly large, on the order of 3 million or more.
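To make the observation concrete, here is a minimal Python sketch. It is only an illustration and says nothing about how DFSORT works internally. It reuses the sample records above, assumes the sort key is the first 6 columns, and checks that all three runs are valid sorts of the same i/p even though they differ from one another. The names `key_of`, `is_sorted_on_key` and `same_records` are made up for this example.

```python
# Illustration only: sample data copied from the post, not production data.
# Assumption: the sort key is columns 1-6, as in the sample scenario.
KEY_LEN = 6

input_records = ["123456FGDABCE", "123456ABCDEFG", "123456EFGDABC", "123456CDEFGAB"]

runs = {
    "RUN1": ["123456ABCDEFG", "123456CDEFGAB", "123456EFGDABC", "123456FGDABCE"],
    "RUN2": ["123456CDEFGAB", "123456ABCDEFG", "123456EFGDABC", "123456FGDABCE"],
    "RUN3": ["123456ABCDEFG", "123456EFGDABC", "123456CDEFGAB", "123456FGDABCE"],
}


def key_of(record):
    """Return the part of the record that the sort compares (hypothetical helper)."""
    return record[:KEY_LEN]


def is_sorted_on_key(records):
    """True if the keys are in ascending order, ignoring the rest of each record."""
    keys = [key_of(r) for r in records]
    return keys == sorted(keys)


def same_records(a, b):
    """True if both lists hold exactly the same records, in any order."""
    return sorted(a) == sorted(b)


# Every run is a correct sort of the same input as far as the key is concerned.
all_valid = all(
    is_sorted_on_key(r) and same_records(r, input_records) for r in runs.values()
)

# Yet the runs are not identical to each other.
distinct_orders = len({tuple(r) for r in runs.values()})

# Python's sorted() is stable, so records with equal keys keep their input order.
stable_order = sorted(input_records, key=key_of)

# Comparing the whole record leaves no ties, so only one order is possible.
full_record_order = sorted(input_records)

print(all_valid, distinct_orders)
print(stable_order)
print(full_record_order)
```

In this sketch, sorting on the whole record instead of just the key leaves no ties between records, so only one o/p order is possible (for the sample data it happens to match RUN1).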
I'm curious why SORT would produce different record arrangements for the same i/p and the same sorting parameters, if SORT is, in the end, running the same sorting algorithm each time?
I'm not sure whether anyone else has made similar observations; but if you have seen this happen before and can share some insight, my understanding might just get better.
Cheers!