pat

Reputation: 57

Convert EBCDIC data file to ASCII

I have a variable-length EBCDIC data file.
It contains binary data (COMP), packed-decimal (COMP-3), display-numeric (PIC 9), and string (PIC X) fields.
How can I convert it to ASCII using a language such as Java, Perl, or COBOL?

Upvotes: 1

Views: 3001

Answers (3)

Bruce Martin

Reputation: 10553

Yes, it is possible (for Java, look at the JRecord project). JRecord can use a Cobol copybook to read a file in Java. It can also use an XML description.

Warnings

Before doing anything else

  1. Make sure the file is still EBCDIC! If the file has been through an EBCDIC-to-ASCII conversion, the comp/comp-3 fields will be corrupted.
  2. You describe the file as being variable length. Does that mean it has variable-length records? If so, make sure the RDW option was used to transfer the file!

If you do not get the file transfer done correctly, you will end up wasting a lot of time and then redoing what you have already done.
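To show what the RDW option leaves in the transferred file, here is a minimal Java sketch that splits a buffer into records. It assumes each record starts with a 4-byte Record Descriptor Word: a 2-byte big-endian length that includes the RDW itself, followed by 2 reserved bytes. It also assumes the file has no block descriptor words and fits in memory. The class and method names (`RdwSplitter`, `splitRecords`) are made up for this example.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

final class RdwSplitter {
    /** Splits RDW-prefixed data into raw record payloads (still EBCDIC/binary). */
    static List<byte[]> splitRecords(byte[] file) {
        ByteBuffer buf = ByteBuffer.wrap(file); // ByteBuffer is big-endian by default
        List<byte[]> records = new ArrayList<>();
        while (buf.remaining() >= 4) {
            int length = buf.getShort() & 0xFFFF; // record length, RDW included
            buf.getShort();                        // skip the two reserved bytes
            int payloadLength = length - 4;
            if (payloadLength < 0 || payloadLength > buf.remaining()) {
                throw new IllegalArgumentException(
                        "Bad RDW length " + length + " at offset " + (buf.position() - 4));
            }
            byte[] payload = new byte[payloadLength];
            buf.get(payload);
            records.add(payload);
        }
        if (buf.hasRemaining()) {
            throw new IllegalArgumentException("Trailing bytes after last record");
        }
        return records;
    }
}
```

If the lengths read this way look like nonsense, the file was probably transferred without RDWs or was converted to ASCII on the way.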

Cobol Copybook

I am assuming you can get a Cobol copybook. If so:

  1. Try editing the file using the RecordEditor. There is an outdated answer here. I will try to provide a more up-to-date answer.
  2. You can generate skeleton Java/JRecord code in the RecordEditor. See "How do you generate java~jrecord code for a Cobol copybook".

JRecord Project

The JRecord project will read/write Cobol data files using

  • A Cobol Copybook
  • An XML File Description
  • File Schema defined in Java

There are 3 sub-projects that can be used to convert simple Cobol files to CSV, XML, or JSON files. More complicated files need Java/JRecord.

JRecord CodeGen

JRecord CodeGen will generate sample Java/JRecord code to Read/Write Cobol files (including Mainframe Cobol) from a Cobol Copybook. JRecord CodeGen is used by the RecordEditor to generate Java/JRecord programs.

Upvotes: 1

Gilbert Le Blanc

Reputation: 51565

First of all, you need to verify that the file is still EBCDIC. Some file transfer programs automatically convert the EBCDIC to ASCII, which corrupts the COMP and COMP-3 fields. You can verify this visually by looking for space characters in the alphanumeric fields. EBCDIC space is x'40'. ASCII space is x'20'.
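The visual check can also be approximated in code. This is only a rough heuristic sketch: it counts x'40' and x'20' bytes in a sample of the file and guesses EBCDIC when x'40' dominates. It assumes the sample contains space-padded text fields; the class name `EbcdicCheck` is hypothetical.

```java
final class EbcdicCheck {
    /** Heuristic: true if the sample has more EBCDIC spaces (0x40) than ASCII spaces (0x20). */
    static boolean looksLikeEbcdic(byte[] sample) {
        int ebcdicSpaces = 0;
        int asciiSpaces = 0;
        for (byte b : sample) {
            if (b == 0x40) {
                ebcdicSpaces++;
            } else if (b == 0x20) {
                asciiSpaces++;
            }
        }
        return ebcdicSpaces > asciiSpaces;
    }
}
```

A binary field can contain either byte value by chance, so treat the result as a hint and still look at a hex dump of a few records.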

Assuming that the file is EBCDIC, you have to write three separate conversions: COMP to ASCII, COMP-3 to ASCII, and alphanumeric to alphanumeric. Each field in the record has to be extracted and converted separately.

The alphanumeric to alphanumeric conversion is basically a lookup table for the alphabet and digits. You look up the EBCDIC character and replace it with an ASCII character.
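In Java the lookup table does not have to be written by hand, because the JDK ships EBCDIC charsets. The sketch below assumes the data uses code page 037 (`IBM037`); other mainframe code pages such as IBM1047 or IBM500 map some characters differently, so check which one your system uses. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EbcdicText {
    // Assumption: the file was written with EBCDIC code page 037.
    private static final Charset EBCDIC = Charset.forName("IBM037");

    /** Decodes one PIC X field from a record into a Java String. */
    static String decode(byte[] record, int offset, int length) {
        return new String(record, offset, length, EBCDIC);
    }

    /** Converts one PIC X field to ASCII bytes; unmappable characters become '?'. */
    static byte[] toAscii(byte[] record, int offset, int length) {
        return decode(record, offset, length).getBytes(StandardCharsets.US_ASCII);
    }
}
```

Apply this only to the byte ranges that hold character data, never to the whole record.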

COMP is a binary format. It's usually 2, 4, or 8 bytes, depending on the number of digits in the PIC clause, and on the mainframe it is stored big-endian.
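A COMP field can be read as a plain binary integer. This sketch assumes a signed field (PIC S9) in big-endian two's complement with a length of 2, 4, or 8 bytes; the class name `CompField` is hypothetical.

```java
import java.nio.ByteBuffer;

final class CompField {
    /** Decodes a signed big-endian COMP field of 2, 4, or 8 bytes. */
    static long decode(byte[] record, int offset, int length) {
        // wrap() positions the buffer at offset; ByteBuffer is big-endian by default
        ByteBuffer buf = ByteBuffer.wrap(record, offset, length);
        switch (length) {
            case 2:
                return buf.getShort();
            case 4:
                return buf.getInt();
            case 8:
                return buf.getLong();
            default:
                throw new IllegalArgumentException("Unsupported COMP length: " + length);
        }
    }
}
```

Once decoded, the value can be written to the ASCII output with `Long.toString`. Any implied decimal point from the copybook (for example `PIC S9(7)V99`) still has to be applied separately.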

COMP-3 is a packed decimal format with a sign nibble. Each byte holds two decimal digits, one per 4-bit nibble, except the last byte, which holds one digit followed by the sign. For example, 124 looks like x'124F'. The last nibble is the sign field: x'C' is positive, x'D' is negative, and x'F' is unsigned (treated as positive).
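Unpacking a COMP-3 field comes down to walking the nibbles. The sketch below assumes the field has at most 18 digits (so it fits in a `long`), that the sign is in the low-order nibble of the last byte, and that x'D' (and the less common x'B') mean negative while any other sign nibble means positive. The `scale` parameter is the number of implied decimal places from the copybook; `PackedDecimal` is a hypothetical name.

```java
import java.math.BigDecimal;

final class PackedDecimal {
    /** Decodes a COMP-3 field; scale is the number of implied decimal places. */
    static BigDecimal decode(byte[] record, int offset, int length, int scale) {
        long value = 0;
        for (int i = 0; i < length; i++) {
            int b = record[offset + i] & 0xFF;
            value = value * 10 + digit(b >>> 4);  // high nibble is always a digit
            if (i < length - 1) {
                value = value * 10 + digit(b & 0x0F);
            } else {
                int sign = b & 0x0F;              // low nibble of last byte is the sign
                if (sign < 0x0A) {
                    throw new IllegalArgumentException("Invalid sign nibble: " + sign);
                }
                if (sign == 0x0D || sign == 0x0B) {
                    value = -value;
                }
            }
        }
        return BigDecimal.valueOf(value, scale);
    }

    private static int digit(int nibble) {
        if (nibble > 9) {
            throw new IllegalArgumentException("Invalid digit nibble: " + nibble);
        }
        return nibble;
    }
}
```

With the example from the text, `PackedDecimal.decode(new byte[] {0x12, 0x4F}, 0, 2, 0)` yields 124. An invalid-nibble exception is a good sign that the file went through a character conversion during transfer.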

Upvotes: 2

phunsoft

Reputation: 2745

You need to treat each field by itself. You must not apply character conversion to the COMP fields. You convert the character fields with the method provided by the language of choice.

Upvotes: 1
