How to use ASCII condition in DFSORT




Postby Prasanna G » Tue Sep 20, 2022 10:15 pm

Hi Team

My requirement is to do the following:

1. Take one long sequential record from a mainframe file.
2. Read one character at a time, if > ASCII 127, ignore, else keep.
3. Extract each sequence of contiguous ASCII characters as a separate word.
4. Reconstruct all the words into a sentence with spaces in between and write the output.
5. The final output will be a dataset of lines with only ASCII words.
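
Steps 2 to 4 can be sketched in C roughly as follows. This is only an illustration of the requirement as stated, not a tested production routine: `keep_ascii_words` is a made-up name, the caller is assumed to have stripped any trailing newline, and the output buffer is assumed to be at least as large as the input (including the terminator).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy every run of bytes <= 127 from `in` to `out`,
 * joining the runs with single spaces. `out` needs strlen(in) + 1 bytes.
 * Returns the length of the resulting string. */
size_t keep_ascii_words(const char *in, char *out)
{
    const unsigned char *p = (const unsigned char *) in;
    char *o = out;

    assert(in != NULL && out != NULL);
    while (*p != '\0') {
        while (*p >= 0x80) p++;            /* step 2: ignore bytes > 127     */
        if (*p == '\0') break;             /* nothing left but ignored bytes */
        if (o != out) *o++ = ' ';          /* step 4: one space between words */
        while (*p != '\0' && *p < 0x80)    /* step 3: copy one ASCII run     */
            *o++ = (char) *p++;
    }
    *o = '\0';
    return (size_t) (o - out);
}
```

The bytes are read through an `unsigned char` pointer so that the comparison with 0x80 behaves the same whether plain `char` is signed or unsigned on the platform.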

Is it possible to achieve this using DFSORT?

Thank You
Regards
Prasanna G.

Re: How to use ASCII condition in DFSORT

Postby sergeyken » Tue Sep 20, 2022 11:26 pm

IMHO, in that case REXX would be a better solution than any SORT tool.

SORT has a very limited set of options for working with single bytes.

In the time needed to build such a sophisticated solution in SORT, one could create 5 to 10 similar solutions in REXX.
IMHO.
Javas and Pythons come and go, but JCL and SORT stay forever.

Re: How to use ASCII condition in DFSORT

Postby Prasanna G » Wed Sep 21, 2022 4:58 am

sergeyken wrote:IMHO, in that case REXX would be a better solution than any SORT tool.

SORT has a very limited set of options for working with single bytes.

In the time needed to build such a sophisticated solution in SORT, one could create 5 to 10 similar solutions in REXX.
IMHO.


Hi Sergeyken

The files that I am going to deal with will have millions of records. Hence I thought processing them with REXX would be time-consuming and inefficient.
Could any REXX or SORT gurus please share their suggestions?

Thank You
Regards
Prasanna G.

Re: How to use ASCII condition in DFSORT

Postby prino » Wed Sep 21, 2022 1:39 pm

Prasanna G wrote:Could any REXX or SORT gurus please share their suggestions?

Square pegs don't fit in round holes. Use the right tool: write a program in PL/I (or, if you're so inclined, COBOL).

And by the way, mainframe files are usually in EBCDIC, and they're called datasets!
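
The EBCDIC point matters for the "> 127" test. Here is a small sketch, assuming the usual EBCDIC code points of IBM-037 / IBM-1047, where the letters of "HELLO" are 0xC8 0xC5 0xD3 0xD3 0xD6. The name `count_above_127` is made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* The word "HELLO" in ASCII and in EBCDIC (IBM-037 / IBM-1047 values). */
static const unsigned char hello_ascii[]  = { 0x48, 0x45, 0x4C, 0x4C, 0x4F };
static const unsigned char hello_ebcdic[] = { 0xC8, 0xC5, 0xD3, 0xD3, 0xD6 };

/* Hypothetical helper: how many bytes of a record have a value above 127? */
size_t count_above_127(const unsigned char *rec, size_t len)
{
    size_t n = 0;

    assert(rec != NULL);
    for (size_t i = 0; i < len; i++)
        if (rec[i] > 127)
            n++;
    return n;
}
```

On the ASCII bytes the count is 0. On the EBCDIC bytes every letter counts as "above 127", so a byte-value rule written with ASCII in mind would discard ordinary EBCDIC text unless the data is converted first.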
Robert AH Prins
robert.ah.prins @ the.17+Gb.Google thingy

Re: How to use ASCII condition in DFSORT

Postby sergeyken » Wed Sep 21, 2022 4:55 pm

Prasanna G wrote:2. Read one character at a time, if > ASCII 127, ignore, else keep.

Is it possible to achieve this using DFSORT?

This is definitely impossible in SORT.
It works with records of datasets, never with individual characters in files.

Re: How to use ASCII condition in DFSORT

Postby sergeyken » Wed Sep 21, 2022 4:57 pm

Prasanna G wrote:The files that I am going to deal with will have millions of records. Hence I thought processing them with REXX would be time-consuming and inefficient.

If so, I would recommend either C/C++, or Assembler.

Re: How to use ASCII condition in DFSORT

Postby sergeyken » Wed Sep 21, 2022 5:37 pm

Something like this, without unimportant details

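#include <fstream>
#include <iostream>
#include <string>
using namespace std;
 
ifstream infile;
ofstream outfile;
string inrec, outline;   // "inline" is a reserved word in C++, so the input buffer is called inrec
. . . . . . . . . .

// Note: as written, runs of bytes >= 0x80 are kept and bytes < 0x80 act as separators;
// swap the two comparisons to keep the runs of 7-bit bytes instead.
while ( getline( infile, inrec ) ) {
   outline = "";
   size_t len = inrec.length();
   for ( size_t i = 0, j = 0; i < len; i = j ) {      // continue after the word just copied
      // skip bytes below 0x80 (compared as unsigned, because plain char may be signed)
      while ( i < len && (unsigned char) inrec[i] < 0x80 ) i++;
      // find the end of the run of bytes >= 0x80 that starts at i
      for ( j = i; j < len && (unsigned char) inrec[j] >= 0x80; ) j++ ;
      if (i < len)
         outline += inrec.substr( i, j - i ) + " " ;   // the word length is j - i
   }
   if (outline.length() > 0)
      outfile << outline << '\n';                      // '\n' avoids the flush that endl forces
}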

. . . . . . . . . .
 

Re: How to use ASCII condition in DFSORT

Postby sergeyken » Wed Sep 21, 2022 5:51 pm

1) My mistake: the length passed to substr() should be j - i, not j - i + 1.

2) If performance is a real issue, I'd recommend switching to plain C, without C++ classes.

Re: How to use ASCII condition in DFSORT

Postby Prasanna G » Wed Sep 21, 2022 6:17 pm

Thanks, Sergeyken. I will try that out.

Re: How to use ASCII condition in DFSORT

Postby sergeyken » Wed Sep 21, 2022 7:24 pm

In C, it may be like this

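#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LINE 1000
 
FILE *infile, *outfile;
/* "inline" is a keyword in C99, so the input buffer is called inrec */
/* outline gets one extra byte for the space that follows a full-length word */
unsigned char inrec[MAX_LINE], outline[MAX_LINE + 1];
. . . . . . . . . .

if ( NULL == (infile = fopen( "........", "r" ) ) ) exit(100);
if ( NULL == (outfile = fopen( ".......", "w" )) ) exit(200);

/* test the result of fgets() itself; <stdio.h> has feof(), not eof() */
while ( fgets( (char *) inrec, sizeof(inrec), infile ) != NULL ) {
   unsigned char *ichar, *jchar, *ochar;
   outline[0] = '\0';
   /* as written, runs of bytes >= 0x80 are kept; bytes below 0x80 separate them */
   for ( ichar = inrec, ochar = outline; *ichar != '\0'; ichar = jchar ) {
      while( *ichar != '\0' && *ichar < 0x80 ) ichar++;   /* skip bytes below 0x80 */
      for (jchar = ichar; *jchar >= 0x80; ) jchar++ ;     /* '\0' is below 0x80, so it ends the run too */
      if (*ichar != '\0') {
         size_t word_size = (size_t) (jchar - ichar);
         memcpy( ochar, ichar, word_size );               /* copy the word ...             */
         ochar += word_size;
         *ochar++ = ' ';                                  /* ... add a space ...           */
         *ochar = '\0';                                   /* ... and keep it terminated    */
      }
   }
   if ( outline[0] != '\0' ) {
      fputs( (const char *) outline, outfile );
      fputc( '\n', outfile );    // because fputs() does not add EOL after the line...
   }
}

fclose(infile);
fclose(outfile);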

. . . . . . . . . .
 


Here, using char * pointers instead of array indexes, and instead of line-scanning functions such as strlen() and strcat(), can significantly improve performance.
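
To illustrate the pointer remark, here is a minimal sketch of appending words through a remembered end pointer. The name `append_word` is made up, and the caller is assumed to guarantee that the buffer has room for the word, the space and the terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: write `len` bytes of `word` plus one space at `end`
 * and return the new end of the string. Because the caller remembers the
 * end, nothing is rescanned; strcat() would search for the terminator
 * from the start of the buffer on every call. */
char *append_word(char *end, const char *word, size_t len)
{
    assert(end != NULL && word != NULL);
    memcpy(end, word, len);   /* copy exactly len bytes, no terminator needed in word */
    end += len;
    *end++ = ' ';             /* separator after the word */
    *end = '\0';              /* keep the buffer a valid C string */
    return end;
}
```

With strcat() the cost of each append grows with the length already written, so a long record with many words is scanned again and again. With the end pointer each byte is touched once.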
