Martijn

Reputation: 31

COBOL generating XML-file with CDATA

I am trying to create an XML file in COBOL using the XML GENERATE statement. So far so good. But this particular XML needs to contain a separate XML file within it, so I want to wrap that file in a CDATA section. Is there a way to do this in COBOL with the GENERATE statement?

Here is an example.

   01    request.
         06    route.
         11    name                  PIC  X(030).
         11    version               PIC  9(004).
         06    question.
         11    IDENT                 PIC  9(009).
         11    xmlFileName           PIC  X(006).
         11    xmlFileInh            PIC  X(5000).

xmlFileInh needs to be filled with another XML file. This can be plain XML or a SOAP request.

Something like this:

<?xml version="1.0" encoding="UTF-8"?>
<request>
  <route>
    <name>serviceRequest</name>
    <version>1</version>
  </route>
  <question>
    <IDENT>111111111</IDENT>
    <xmlFileName>FILE-1</xmlFileName>
    <xmlFileInh>
       <![CDATA[<?xml version="1.0" encoding="UTF-8"?><SOAP-ENV:Envelope.....<SOAP-ENV:Envelope]]>
    </xmlFileInh>
  </question>
</request>

I have tried to STRING "<![CDATA[" and "]]>" around the incoming XML file and then move the result to xmlFileInh. This does something, but it turns all XML special characters into escape sequences that I don't want in my XML file. The GENERATE statement gives CDATA no special treatment.

< becomes   &lt;
> becomes   &gt;
" becomes   &quot;
' becomes   &apos;
& becomes   &amp;

I also tried giving xmlFileInh a different PICTURE, even type XML. That produces a lot of new kinds of tags in my XML (name-length, data-length, etc.), but none that I want.

Does anyone have a solution?

Thanks in advance, Martijn.

Upvotes: 3

Views: 1327

Answers (3)

Martijn

Reputation: 31

After reading the answer from @FredTheFlinstone I knew that was exactly what my situation needed. The generated XML, with the embedded XML inside, is PARSEd by another COBOL program. So I used that solution without adding the CDATA markers at the start and end of the embedded XML.

There are some extra things to consider here (in my case):

The XML to be put in XMLFILEINH comes from MQ in UTF-8. The variables in REQUEST are in WORKING-STORAGE, so they are EBCDIC. The GENERATE needs to produce the REQUEST XML in UTF-8, so I added ENCODING 1208. The GENERATE needs all source fields in EBCDIC, so I first have to convert the input with the functions NATIONAL-OF and DISPLAY-OF.

Also make sure to initialize the trailing characters in XMLFILEINH to spaces. Only trailing spaces are removed by the GENERATE statement. Obvious, but good to know.
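
A minimal sketch of the conversion and generate steps described above, not my production code. MQ-UTF8-BUFFER and MQ-DATA-LEN are hypothetical names, and the sample payload stands in for a real MQ message. REQUEST, XMLFILEINH, XML-DOC and XML-CHAR-CNT follow the layout used in this thread. DISPLAY-OF without a second argument converts to the CCSID of the CODEPAGE compiler option, which is assumed here to be the EBCDIC code page of the program.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLUTF8.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical input area: UTF-8 bytes as received from MQ.
       01  MQ-UTF8-BUFFER            PIC X(5000).
       01  MQ-DATA-LEN               PIC S9(9) BINARY.
       01  REQUEST.
           06 QUESTION.
              11 XMLFILEINH          PIC X(5000).
       01  XML-DOC                   PIC X(32000).
       01  XML-CHAR-CNT              PIC S9(9) BINARY.
       PROCEDURE DIVISION.
      * Stand-in for the MQ GET: build 8 bytes of UTF-8 sample data.
           MOVE FUNCTION DISPLAY-OF(
                FUNCTION NATIONAL-OF('<a>x</a>'), 1208)
             TO MQ-UTF8-BUFFER
           MOVE 8 TO MQ-DATA-LEN
      * UTF-8 (CCSID 1208) -> UTF-16 -> EBCDIC. Only the bytes that
      * were received are converted; the MOVE pads the remainder of
      * XMLFILEINH with spaces, which XML GENERATE strips.
           MOVE FUNCTION DISPLAY-OF(
                FUNCTION NATIONAL-OF(
                   MQ-UTF8-BUFFER(1:MQ-DATA-LEN), 1208))
             TO XMLFILEINH
      * Generate the outer document as UTF-8.
           XML GENERATE XML-DOC FROM REQUEST
               COUNT IN XML-CHAR-CNT
               WITH ENCODING 1208
               ON EXCEPTION
                  DISPLAY 'XML GENERATE error: ' XML-CODE
           END-XML
           GOBACK.
```

Converting only the first MQ-DATA-LEN bytes keeps whatever sits behind the message in the buffer (low-values, old data) out of XMLFILEINH, which is the trailing-character point made above.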

Last, about the underscores in tag names starting with XML: I have no clue. I think it's because of the name 'XML' in them? This was just a trial request to clarify my question here. The names I use to generate my real REQUEST don't contain XML, and there are no underscores.

If my request had to go outside the mainframe COBOL environment, I might have had to use the other option given here by @cschneid. I will also pass the message on to our technicians who deal with IBM.

Although, because escape characters seem to be standard XML usage, parsers on other platforms may deal with them the same way. But that leaves the question of why to use CDATA at all... It has to be useful for something.

Anyway, thanks for the answers! It got my problem solved.

Upvotes: 0

FredTheFlinstone

Reputation: 41

You may not need to use CDATA at all. XML GENERATE will take the content of XMLFILEINH and escape the special characters (as you have indicated). The resulting XML, when viewed with a simple text editor, will show escape sequences, which is not what you want. However, when you use XML PARSE to process it, the escaped characters are replaced with their original contents again. Also, most XML-aware viewers (e.g. Microsoft Edge, among others) will display the content as you expect, without the escape sequences.

Here is an example IBM Enterprise COBOL 6.2 program illustrating my point:

  IDENTIFICATION DIVISION.
  PROGRAM-ID. XML5.
  DATA DIVISION.
  WORKING-STORAGE SECTION.

  01  REQUEST.
      06 ROUTE.
        11 NAME                  PIC  X(030).
        11 VERSION               PIC  9(004).
      06 QUESTION.
        11 IDENT                 PIC  9(009).
        11 XMLFILENAME           PIC  X(006).
        11 XMLFILEINH            PIC  X(5000).


  01  XML-DOC                    PIC X(5000).
  01  XML-IDX                    PIC S9(9) BINARY.
  01  XML-CHAR-CNT               PIC S9(9) BINARY.

  PROCEDURE DIVISION.
  MAINLINE SECTION.
      MOVE 'serviceRequest' TO NAME
      MOVE 1                TO VERSION
      MOVE 111111111        TO IDENT
      MOVE 'FILE-1'         TO XMLFILENAME
      MOVE '<?xml version="1.0" encoding="UTF-8"?><SOAP-ENV:Envelop
 -         'e.....<SOAP-ENV:Envelope>'
        TO XMLFILEINH

      INITIALIZE XML-DOC
      XML GENERATE XML-DOC FROM REQUEST COUNT IN XML-CHAR-CNT
      PERFORM VARYING XML-IDX FROM 1 BY 80
                UNTIL XML-IDX > XML-CHAR-CNT
         DISPLAY XML-DOC (XML-IDX : 80)
      END-PERFORM

      XML PARSE XML-DOC PROCESSING PROCEDURE XML-HANDLER
          ON EXCEPTION
             DISPLAY 'XML Error: ' XML-CODE
             GOBACK
          NOT ON EXCEPTION
             DISPLAY 'ALL DONE.'
      END-XML
      GOBACK
      .

  XML-HANDLER.
      DISPLAY XML-EVENT (1:22) ':' XML-TEXT
      .

The output is:

<REQUEST><ROUTE><NAME>serviceRequest</NAME><VERSION>1</VERSION></ROUTE><QUESTION
><IDENT>111111111</IDENT><_XMLFILENAME>FILE-1</_XMLFILENAME><_XMLFILEINH>&lt;?xm
l version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;&lt;SOAP-ENV:Envelope..
...&lt;SOAP-ENV:Envelope&gt;</_XMLFILEINH></QUESTION></REQUEST>
START-OF-DOCUMENT     :
START-OF-ELEMENT      :REQUEST
START-OF-ELEMENT      :ROUTE
START-OF-ELEMENT      :NAME
CONTENT-CHARACTERS    :serviceRequest
END-OF-ELEMENT        :NAME
START-OF-ELEMENT      :VERSION
CONTENT-CHARACTERS    :1
END-OF-ELEMENT        :VERSION
END-OF-ELEMENT        :ROUTE
START-OF-ELEMENT      :QUESTION
START-OF-ELEMENT      :IDENT
CONTENT-CHARACTERS    :111111111
END-OF-ELEMENT        :IDENT
START-OF-ELEMENT      :_XMLFILENAME
CONTENT-CHARACTERS    :FILE-1
END-OF-ELEMENT        :_XMLFILENAME
START-OF-ELEMENT      :_XMLFILEINH
CONTENT-CHARACTERS    :<?xml version="1.0" encoding="UTF-8"?><SOAP-ENV:Envelope.....<SOAP-ENV:Envelope>
END-OF-ELEMENT        :_XMLFILEINH
END-OF-ELEMENT        :QUESTION
END-OF-ELEMENT        :REQUEST
END-OF-DOCUMENT       :
ALL DONE.

Note the escaping of special characters in the "raw" dump of the generated XML; upon completion of XML PARSE they are restored to what was given to XML GENERATE. This is normal XML processing. Character escaping such as this may protect you from code-page conversion problems when transmitting the generated XML. When using CDATA there is a possibility of corruption when the document has to be converted from one code page to another and there is no direct mapping for a given character (not likely, but possible).

What I find interesting here, and can't explain, is why the generated XML tag names beginning with XML are prefixed with an underscore.

Final note: If the content of COBOL variable XMLFILEINH contained the sequence </_XMLFILEINH> somewhere, one might think that it would prematurely end the <_XMLFILEINH> element in the resulting XML. It doesn't, because the opening and closing delimiters < and > are escaped on GENERATE.
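
To check the point of the final note in isolation, a small sketch along these lines could be compiled and run. WRAPPER, PAYLOAD, XML-OUT and XML-LEN are illustrative names and are not part of the program above. The content deliberately contains text that looks like the element's own closing tag.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLESC.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Illustrative names only; sizes are arbitrary.
       01  WRAPPER.
           05 PAYLOAD                PIC X(40).
       01  XML-OUT                   PIC X(200).
       01  XML-LEN                   PIC S9(9) BINARY.
       PROCEDURE DIVISION.
      * Content that contains a look-alike closing tag.
           MOVE 'before</PAYLOAD>after' TO PAYLOAD
           XML GENERATE XML-OUT FROM WRAPPER
               COUNT IN XML-LEN
               ON EXCEPTION
                  DISPLAY 'XML GENERATE error: ' XML-CODE
                  GOBACK
           END-XML
      * Show only the generated characters, not the padding.
           DISPLAY XML-OUT(1:XML-LEN)
           GOBACK.
```

If the escaping works as described, the displayed document should contain the embedded < and > as &lt; and &gt;, so only the closing tag produced by GENERATE itself ends the element.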

Upvotes: 4

cschneid

Reputation: 10775

IBM's Enterprise COBOL does not currently have any options to deal with generating CDATA.

To solve your problem, you could leave xmlFileInh unpopulated, XML GENERATE into SOME-BUFFER then...

UNSTRING 
  SOME-BUFFER 
  DELIMITED '<xmlFileInh>' OR '</xmlFileInh>' 
  INTO 
    FIRST-PART  COUNT IN FIRST-PART-COUNT
      DELIMITER IN FIRST-DELIMITER 
    SECOND-PART
      DELIMITER IN SECOND-DELIMITER 
    THIRD-PART  COUNT IN THIRD-PART-COUNT
END-UNSTRING

...then...

STRING 
    FIRST-PART(1:FIRST-PART-COUNT)   DELIMITED SIZE
    FIRST-DELIMITER                  DELIMITED SPACE
    CDATA-CONTENT                    DELIMITED ']]>'
    ']]>'                            DELIMITED SIZE
    SECOND-DELIMITER                 DELIMITED SPACE
    THIRD-PART(1:THIRD-PART-COUNT)   DELIMITED SIZE
  INTO FINAL-DESTINATION
END-STRING

...which I've just freehanded, so no guarantees. It's also aesthetically displeasing and someone should submit an RFE to IBM to handle CDATA in XML GENERATE.
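
The fragments above use data items that are not declared in this answer. One possible set of WORKING-STORAGE definitions is sketched below; every size is an assumption chosen for illustration. CDATA-CONTENT is assumed to hold the complete CDATA section ('<![CDATA[', the embedded XML, then ']]>'), left-justified and padded with spaces, since the STRING copies it up to the first ']]>' and then appends the terminator itself.

```cobol
       WORKING-STORAGE SECTION.
      * All sizes are assumptions; adjust them to the real message.
       01  SOME-BUFFER               PIC X(8000).
       01  FIRST-PART                PIC X(8000).
       01  FIRST-PART-COUNT          PIC S9(9) BINARY.
      * Wide enough for the longer delimiter, '</xmlFileInh>'.
       01  FIRST-DELIMITER           PIC X(13).
       01  SECOND-PART               PIC X(8000).
       01  SECOND-DELIMITER          PIC X(13).
       01  THIRD-PART                PIC X(8000).
       01  THIRD-PART-COUNT          PIC S9(9) BINARY.
      * Assumed layout: '<![CDATA[' + embedded XML + ']]>' + spaces.
       01  CDATA-CONTENT             PIC X(5100).
      * Must hold the generated XML plus the CDATA section.
       01  FINAL-DESTINATION         PIC X(16000).
```

STRING only overwrites the positions it fills, so it is worth doing MOVE SPACES TO FINAL-DESTINATION before the STRING statement. THIRD-PART-COUNT will also include any trailing spaces left in SOME-BUFFER, which is why FINAL-DESTINATION is sized generously here.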

Upvotes: 1
