FTP tape dataset in chunks




Postby Prasanna G » Wed Apr 22, 2015 11:36 am

Hi

My requirement is to FTP a mainframe tape dataset to a Linux server. The tape dataset has 400 million records. Is it possible to send the data in chunks, say the first 100,000 records as one chunk and the next 100,000 records as another? The files at the receiving end would be named Acct.Part.1, Acct.Part.2, and so on, one per chunk, to enable parallel processing. Also, is there an option to specify a wait period as input? This would be the time the process on the mainframe waits before sending the next chunk.
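(For illustration, a minimal sketch of the chunking idea in Python, run against a disk copy of the extract. The file names, chunk size, and wait parameter mirror the question; none of this is an existing mainframe utility.)

```python
import time

def split_into_chunks(in_path, out_prefix, records_per_chunk, wait_seconds=0):
    """Split a line-oriented extract into numbered part files.

    Writes out_prefix + ".1", out_prefix + ".2", ... and optionally
    sleeps wait_seconds between parts, mimicking a pause before the
    next chunk is sent. Returns the list of part-file names written.
    """
    part = 0
    written = []
    with open(in_path, "r") as src:
        while True:
            # Take up to records_per_chunk lines; zip stops at EOF.
            lines = [line for _, line in zip(range(records_per_chunk), src)]
            if not lines:
                break
            part += 1
            out_name = f"{out_prefix}.{part}"
            with open(out_name, "w") as dst:
                dst.writelines(lines)
            written.append(out_name)
            if wait_seconds:
                time.sleep(wait_seconds)
    return written
```

Each part file could then be sent by its own FTP job; the wait between parts is just a sleep here, where a real setup would more likely use the scheduler.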

Kindly let me know if my requirement is not clear.

Thanks
Prasanna G.
Re: FTP tape dataset in chunks

Postby enrico-sorichetti » Wed Apr 22, 2015 12:40 pm

Why not speak to both your LOCAL and REMOTE location network support people? We do not know about your organisation's setup; they do.

Four concurrent FTP transfers might put a severe strain on the communication line to the remote site.
cheers
enrico
When I tell somebody to RTFM or STFW I usually have the page open in another tab/window of my browser,
so that I am sure that the information requested can be reached with a very small effort

Re: FTP tape dataset in chunks

Postby steve-myers » Wed Apr 22, 2015 2:44 pm

You are asking two conceptually independent questions here.
  • Can FTP, all by its little self, separate a file transfer into "groups"?

    No, it can't.
  • Can an FTP client process more than one file at a time?

    No, it can't. The client is basically a one-process, one-thread function. A server, by definition, is a multi-threaded function that can communicate with many FTP clients at a time.
There is no conceptual issue with running several FTP clients at a time all talking to a single FTP server.
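(A minimal sketch of the several-clients idea, assuming Python's ftplib is available on the sending side. The host, credentials, and part names are placeholders; each worker opens its own connection, since one FTP client handles one transfer at a time.)

```python
from concurrent.futures import ThreadPoolExecutor
from ftplib import FTP

def send_all(part_names, send_one, max_workers=4):
    """Run several transfers concurrently, one client per part.

    send_one(part_name) performs a single transfer and returns the
    part name on success; a failure propagates out of the pool, so
    the caller knows which part to rerun. Results keep input order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(send_one, part_names))

def ftp_send(host, user, password, part_name):
    """One independent FTP client session for one part file."""
    with FTP(host) as ftp:
        ftp.login(user, password)
        with open(part_name, "rb") as f:
            ftp.storbinary(f"STOR {part_name}", f)
    return part_name
```

Usage would be along the lines of `send_all(parts, lambda p: ftp_send("linuxhost", "user", "pw", p))`, with `max_workers` kept small for the reason enrico gives above about straining the line.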

I have tried, in the past, the equivalent of your idea, and it never seemed to work very well.

Breaking a large data transmission into "chunks" is not a bad idea, so that if there is a transmission problem with one "chunk" only the failing "chunk" needs to be rerun, not the entire transmission. In my experience, FTP on modern communication links rarely fails, so it's simply not worth the hassle.

In a past life I did customer support for a large ISV. From time to time our clients had to send us fairly large files. While we did see quite a few bad transmissions, they were almost always caused by an error on the client's side: trying to send binary data as a text file, for example, or sending a VB data set in binary. I don't recall any transfers aborted by a bad communication link. I was the local "expert" in trying to sort out these messes. About half the time I was able to put Humpty Dumpty back together; I could usually deduce, for example, whether a file represented SYS1.DUMPxx type data, or whether it represented a file compacted by TRSMAIN or AMATERSE. My first two examples could not be corrected and had to be resent.

Re: FTP tape dataset in chunks

Postby Prasanna G » Wed Apr 22, 2015 3:17 pm

Hi Steve

Thanks for your inputs.

Re: FTP tape dataset in chunks

Postby BillyBoyo » Wed Apr 22, 2015 3:50 pm

You are suggesting 100,000 records per transfer. That's 10 transfers per million records, and you have 400 million records, so that is 4,000 transfers.

Mmm.... I suspect that would require automation.

What size are the records? Fixed- or variable-length? If they average 80 bytes per record, that'd only be 8 MB per transfer, and I suspect the start-up/tear-down overhead would outweigh any possible savings from parallelism.

Even doing a million records at a time, you'd end up with 400 transfers, each of which would need to be verified automatically.

Also bear in mind that you don't want to process the tape file multiple times. 4,000 datasets is (probably) more than you can have as output DDs from one step.
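(Worked through, the sizing arithmetic above; the 80-byte figure is the average record length assumed in the post, not a known property of the dataset.)

```python
total_records = 400_000_000
records_per_chunk = 100_000
avg_record_bytes = 80  # assumed average from the post

transfers = total_records // records_per_chunk       # 10 per million records
chunk_bytes = records_per_chunk * avg_record_bytes   # payload per transfer
transfers_at_a_million = total_records // 1_000_000  # the larger-chunk variant
```

That gives 4,000 transfers of about 8 MB each, or still 400 transfers at a million records per chunk.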

How often are you transferring this data?

I'd "top'n'tail" the multiple files with a header and a trailer. Header to contain logical file-name, environment-name, business/data-date, and sequence number (if the sequence of the data is important). Trailer to include, at minimum, a record count.

Overall file header and trailer: same type of information in the header, with the trailer also giving the number of individual files.

Process on the receiver then has to check all that when putting the data back together.

Without that, I'd not want to touch the received data with a barge pole.
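(A sketch of the header/trailer idea on the receiving side; the record layout here, pipe-delimited HDR/TRL lines, is invented for illustration, not a standard.)

```python
def wrap_chunk(records, logical_name, seq_no, business_date):
    """Top'n'tail one chunk's records with a header and a trailer.

    Header carries the logical file name, sequence number, and
    business date; trailer carries the record count.
    """
    header = f"HDR|{logical_name}|{seq_no}|{business_date}"
    trailer = f"TRL|{len(records)}"
    return [header] + list(records) + [trailer]

def check_chunk(lines, expected_seq):
    """Verify one received chunk before reassembly; return its records."""
    head, *records, tail = lines
    tag, name, seq, date = head.split("|")
    assert tag == "HDR" and int(seq) == expected_seq, "header mismatch"
    ttag, count = tail.split("|")
    assert ttag == "TRL" and int(count) == len(records), "record count mismatch"
    return records
```

The receiver would run `check_chunk` on every part in sequence-number order, and only concatenate the returned records once all parts pass.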
