input file delimited by comma



Support for OS/VS COBOL, VS COBOL II, COBOL for OS/390 & VM and Enterprise COBOL for z/OS

Postby Manju Venkat » Tue Mar 15, 2011 7:30 pm

I have an input file that is delimited by commas. Some fields in the input file are enclosed in double quotes, and those fields can themselves contain commas. Commas inside a quoted field must not be treated as delimiters, and the double quotes have to be removed in my output file. Please suggest some logic.
Input file format:
aaaaaa,bbbbb,"ccc,ccc",ddd,ee,"ff,ff",

The output file format should be as follows:

aaaaaa bbbbb ccc,ccc ddd ee ff,ff

Can anyone please suggest the code?
Manju Venkat
 

Re: input file delimited by comma

Postby prino » Tue Mar 15, 2011 7:44 pm

Scan the record and remember when you're in a quote-delimited field.
Robert AH Prins
robert.ah.prins @ the.17+Gb.Google thingy

Re: input file delimited by comma

Postby BillyBoyo » Tue Mar 15, 2011 7:49 pm

Do you really want to remove all your field delimiters and leave it as one big string? If there is any possibility of blanks in any of the fields, you won't be able to tell what is what if you just remove the quotes and unquoted commas.
BillyBoyo
Global moderator
 

Re: input file delimited by comma

Postby Manju Venkat » Tue Mar 15, 2011 7:54 pm

Yes. I want to remove the delimiting commas, but commas within double quotes should not be removed. That is the requirement.

But the output file is fixed format. That means if any field has spaces, spaces will be moved to the whole length of the field. So I think there will be no problem in the output file.

Re: input file delimited by comma

Postby BillyBoyo » Tue Mar 15, 2011 10:32 pm

OK then. Seems odd. Any chance of getting the file created not as a "CSV" but just as a plain text file, like the output you are going to have to create otherwise? Then you'd just have the stuff you want, without the quotes to protect the embedded commas. Text, with the actual commas, and nothing else. Saves you writing the program.

Re: input file delimited by comma

Postby BillyBoyo » Tue Mar 15, 2011 10:52 pm

So, you want a loop, with one subscript/index for the input and one for the output. You need a flag to ignore a comma, and a flag for whether the next quote is an opening or a closing one (set to expect an opening quote to start with). Look at each character on the input:

If it is a quote and the quote flag is expecting an opening quote, turn off the comma flag, set the quote flag to expect a closing quote, and ignore the input byte.
If it is a quote and the quote flag is expecting a closing quote, turn on the comma flag, set the quote flag to expect an opening quote, and ignore the input byte.
If it is a comma and the ignore-comma flag is on, ignore the input byte.
Otherwise, copy the byte to the output.

You will, of course, have to determine the end of your record, but you have given no details on that, so I assume you can handle it.
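
A minimal sketch of the loop described above, written as a self-contained COBOL program. The program name, the data names, the 80-byte record length and the hard-coded sample record are all assumptions for illustration; a real program would read the record from the input file and know its actual length. Two choices differ from the description: the comma flag and the quote flag are folded into a single state (the comma is only a delimiter while an opening quote is expected), and an unquoted comma is replaced by a space instead of being dropped, so that the result has the shape of the sample output in the first post.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CSVCOPY.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical names; 80 bytes is an assumed record length.
       01  WS-IN-REC            PIC X(80).
       01  WS-OUT-REC           PIC X(80) VALUE SPACES.
       01  WS-IN-IDX            PIC S9(4) COMP.
       01  WS-OUT-IDX           PIC S9(4) COMP VALUE 0.
       01  WS-IN-LEN            PIC S9(4) COMP VALUE 80.
      * 'O' = next quote opens a field, 'C' = next quote closes it.
       01  WS-QUOTE-STATE       PIC X VALUE 'O'.
           88 EXPECT-OPEN-QUOTE       VALUE 'O'.
           88 EXPECT-CLOSE-QUOTE      VALUE 'C'.
       PROCEDURE DIVISION.
           MOVE 'aaaaaa,bbbbb,"ccc,ccc",ddd,ee,"ff,ff",' TO WS-IN-REC
           PERFORM VARYING WS-IN-IDX FROM 1 BY 1
                   UNTIL WS-IN-IDX > WS-IN-LEN
              EVALUATE TRUE
      *          A quote only toggles the state; it is not copied.
                 WHEN WS-IN-REC(WS-IN-IDX:1) = '"'
                    IF EXPECT-OPEN-QUOTE
                       SET EXPECT-CLOSE-QUOTE TO TRUE
                    ELSE
                       SET EXPECT-OPEN-QUOTE TO TRUE
                    END-IF
      *          A comma outside quotes becomes a space.
                 WHEN WS-IN-REC(WS-IN-IDX:1) = ','
                      AND EXPECT-OPEN-QUOTE
                    ADD 1 TO WS-OUT-IDX
                    MOVE SPACE TO WS-OUT-REC(WS-OUT-IDX:1)
      *          Everything else, quoted commas included, is copied.
                 WHEN OTHER
                    ADD 1 TO WS-OUT-IDX
                    MOVE WS-IN-REC(WS-IN-IDX:1)
                      TO WS-OUT-REC(WS-OUT-IDX:1)
              END-EVALUATE
           END-PERFORM
           DISPLAY WS-OUT-REC
           STOP RUN.
```

To drop unquoted commas entirely, as the description literally says, remove the two statements under the comma case and replace them with CONTINUE. Note that this sketch still assumes quotes only ever appear as field delimiters.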

Re: input file delimited by comma

Postby Quasar » Tue Mar 15, 2011 11:41 pm

Reminds me of a class on automata theory and Turing machines. My professors used to say "create an automaton to accept all strings that start with a, end with b and contain...". I surely miss those days, when I learnt about parsers.
Quasar Chunawala,
Software Engineer, Lives at Borivali, Mumbai

Re: input file delimited by comma

Postby BillyBoyo » Wed Mar 16, 2011 5:51 am

Manju Venkat wrote: Yes. I want to remove the delimiting commas, but commas within double quotes should not be removed. That is the requirement.

But the output file is fixed format. That means if any field has spaces, spaces will be moved to the whole length of the field. So I think there will be no problem in the output file.


Reading this again, it is still not clear to me. "Yes" is the answer to my question, meaning that you are not concerned about losing the distinction between a field and a string of characters. Then, in the second paragraph, you talk about fields.

For fields, I would suggest a different method.

Anyway, the code outline as provided can still give you problems. Presumably this has come from some user application which can "export" a CSV. The problem is, if there is a human putting the data in, are they restricted to only messing you about with embedded commas? What about embedded quotes? Such a thing would mess up that code, whether or not it occurs in a field bounded by quotes.

Without full knowledge of the possible inputs, the program code is more complicated. If a quote can occur in the data (if it is user typing, it will occur unless prevented), you also have to know that only quotes in an expected delimiting position should be treated as delimiters. Except, what about a necessary quote in the first position? And then, if you are looking for quotes as delimiters (so, something like ", or ,"), what if one of those combinations occurs in the text? So in the end, you have to start from the beginning and make fields, also start from the end of the line and make other fields, see if they are the same, and decide what to do if not.

As I have said, much simpler just to get a "text" file exported instead of the CSV, if at all possible. If not, you need a full specification of the possible data. Then maybe we can see again.
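
One way to cope with embedded quotes, under a strong assumption that nothing in this thread confirms: the exporting application doubles any quote that appears inside a quoted field, which is the convention described in RFC 4180. If the export does not follow that convention, this sketch does not apply and the full specification of the data is still needed. The program name, data names, 80-byte record length and sample record are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CSVQUOT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical names and an assumed 80-byte record length.
       01  WS-IN-REC            PIC X(80).
       01  WS-OUT-REC           PIC X(80) VALUE SPACES.
       01  WS-IN-IDX            PIC S9(4) COMP VALUE 1.
       01  WS-OUT-IDX           PIC S9(4) COMP VALUE 0.
       01  WS-IN-LEN            PIC S9(4) COMP VALUE 80.
       01  WS-OUT-CHAR          PIC X.
       01  WS-IN-QUOTES-FLAG    PIC X VALUE 'N'.
           88 IN-QUOTES               VALUE 'Y'.
           88 NOT-IN-QUOTES           VALUE 'N'.
       01  WS-EMIT-FLAG         PIC X.
           88 EMIT-CHAR               VALUE 'Y'.
           88 SKIP-CHAR               VALUE 'N'.
       PROCEDURE DIVISION.
           MOVE 'aa,"say ""hi"", ok",bb' TO WS-IN-REC
           PERFORM UNTIL WS-IN-IDX > WS-IN-LEN
              MOVE WS-IN-REC(WS-IN-IDX:1) TO WS-OUT-CHAR
              SET EMIT-CHAR TO TRUE
              EVALUATE TRUE
                 WHEN WS-OUT-CHAR = '"' AND NOT-IN-QUOTES
      *             Opening delimiter: drop it.
                    SET IN-QUOTES TO TRUE
                    SET SKIP-CHAR TO TRUE
                 WHEN WS-OUT-CHAR = '"'
      *             Inside quotes: a doubled quote is data, a
      *             single one closes the field.
                    PERFORM CHECK-DOUBLED-QUOTE
                 WHEN WS-OUT-CHAR = ',' AND NOT-IN-QUOTES
      *             Delimiting comma: written as a space.
                    MOVE SPACE TO WS-OUT-CHAR
              END-EVALUATE
              IF EMIT-CHAR
                 ADD 1 TO WS-OUT-IDX
                 MOVE WS-OUT-CHAR TO WS-OUT-REC(WS-OUT-IDX:1)
              END-IF
              ADD 1 TO WS-IN-IDX
           END-PERFORM
           DISPLAY WS-OUT-REC
           STOP RUN.
       CHECK-DOUBLED-QUOTE.
      *    Look ahead one byte, but never past the record end.
           IF WS-IN-IDX < WS-IN-LEN
              IF WS-IN-REC(WS-IN-IDX + 1:1) = '"'
      *          Doubled quote: emit one, skip the second.
                 ADD 1 TO WS-IN-IDX
              ELSE
                 SET NOT-IN-QUOTES TO TRUE
                 SET SKIP-CHAR TO TRUE
              END-IF
           ELSE
              SET NOT-IN-QUOTES TO TRUE
              SET SKIP-CHAR TO TRUE
           END-IF.
```

The look-ahead is kept in nested IFs rather than a single AND condition so that the byte after the record end is never referenced. A stray quote in an unquoted field would still be misread as an opening delimiter, which is exactly the ambiguity described above.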

Re: input file delimited by comma

Postby Quasar » Wed Mar 16, 2011 9:03 am

Hi -

Here's a code snippet that may help.

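       IDENTIFICATION DIVISION.
       PROGRAM-ID. CSVSCAN.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Sample record. Apostrophes stand in for the double quotes.
       01  WS-STRING                        PIC X(38)
           VALUE "aaaaaa,bbbbb,'ccc,ccc',ddd,ee,'ff,ff',".
       01  WS-I                             PIC S9(04) COMP-3
                                            VALUE 0.
       01  WS-CHARACTER                     PIC X.
      * Holds an apostrophe while inside a quoted field.
       01  WS-QUOTE-FLAG                    PIC X VALUE SPACE.
           88 QUOTE-ON                      VALUE "'".

       PROCEDURE DIVISION.
           PERFORM VARYING WS-I FROM 1 BY 1 UNTIL WS-I > 38
              MOVE WS-STRING(WS-I:1) TO WS-CHARACTER

              IF QUOTE-ON
      *          Closing quote: blank it and leave the quoted field.
                 IF WS-CHARACTER = "'"
                    MOVE SPACES TO WS-CHARACTER
                                   WS-QUOTE-FLAG
                 END-IF
              ELSE
      *          Outside quotes: a comma is a delimiter.
                 IF WS-CHARACTER = ","
                    MOVE SPACES TO WS-CHARACTER
                 END-IF
      *          Opening quote: blank it and enter the quoted field.
                 IF WS-CHARACTER = "'"
                    MOVE SPACES TO WS-CHARACTER
                    SET QUOTE-ON TO TRUE
                 END-IF
              END-IF
      *       Store the (possibly blanked) character back.
              MOVE WS-CHARACTER TO WS-STRING(WS-I:1)
           END-PERFORM
           DISPLAY WS-STRING
           STOP RUN.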


Thank you very much.

Re: input file delimited by comma

Postby Quasar » Wed Mar 16, 2011 9:05 am

Note that QUOTE-ON also has to be turned off when the second (closing) quote is encountered; I forgot that possibility at first. The first branch becomes:
IF QUOTE-ON
   IF WS-CHARACTER = "'"
      MOVE SPACES TO WS-CHARACTER
                     WS-QUOTE-FLAG
   END-IF
