Reputation: 57
I have an EBCDIC data file which is variable length.
The file contains binary data (comp), packed-decimal (comp-3), display-numeric (pic (9)), and string (pic (x)) fields.
How can I convert it to ASCII using a language such as Java, Perl, or COBOL?
Upvotes: 1
Views: 3001
Reputation: 10553
Yes, it is possible (for Java, look at the JRecord project). JRecord can use a Cobol copybook to read a file in Java. It can also use an Xml description.
Before doing anything else
If you do not get the file transfer done correctly, you will end up wasting a lot of time and then redoing what you have already done.
I am assuming you can get a Cobol copybook. If so
The JRecord project will read/write Cobol data files using
There are three sub-projects that can be used to convert simple Cobol files to CSV, Xml or Json files. More complicated files need Java/JRecord.
JRecord CodeGen will generate sample Java/JRecord code to Read/Write Cobol files (including Mainframe Cobol) from a Cobol Copybook. JRecord CodeGen is used by the RecordEditor to generate Java/JRecord programs.
Upvotes: 1
Reputation: 51565
First of all, you need to verify that the file is still EBCDIC. Some file transfer programs automatically convert the EBCDIC to ASCII, which corrupts the COMP and COMP-3 fields. You can verify this visually by looking for space characters in the alphanumeric fields. EBCDIC space is x'40'. ASCII space is x'20'.
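The space check described above can be roughly automated. This is a minimal sketch, not a reliable detector: the class and method names are made up for illustration, and it assumes the sample comes from alphanumeric (text) portions of the file, since x'40' is also `@` in ASCII and can occur by chance inside binary fields.

```java
/** Hypothetical helper: guesses whether a byte sample is still EBCDIC. */
final class EncodingSniffer {

    /**
     * Heuristic only: returns true when EBCDIC spaces (x'40')
     * outnumber ASCII spaces (x'20') in the sample.
     */
    static boolean looksLikeEbcdic(byte[] sample) {
        int ebcdicSpaces = 0;
        int asciiSpaces = 0;
        for (byte b : sample) {
            int v = b & 0xFF;          // treat the byte as unsigned
            if (v == 0x40) {
                ebcdicSpaces++;
            } else if (v == 0x20) {
                asciiSpaces++;
            }
        }
        return ebcdicSpaces > asciiSpaces;
    }
}
```

If the result is false for a file that should be EBCDIC, the transfer probably translated the data and the COMP and COMP-3 fields can no longer be trusted.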
Assuming that the file is EBCDIC, you have to write three separate conversions: COMP to ASCII, COMP-3 to ASCII, and alphanumeric to alphanumeric. Each field in the record has to be extracted and converted separately.
The alphanumeric to alphanumeric conversion is basically a lookup table for the alphabet and digits. You look up the EBCDIC character and replace it with an ASCII character.
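A lookup table of that kind might look like the sketch below. It covers only space, letters, and digits, which share the same code points across the common EBCDIC code pages; punctuation differs between code pages and is mapped to `?` here. The class name is hypothetical. In practice, Java's built-in charsets (for example `Charset.forName("IBM037")` on most JDKs) can do the full translation, assuming you know which code page the file uses.

```java
import java.util.Arrays;

/** Hypothetical helper: translates an EBCDIC alphanumeric field to a String. */
final class EbcdicText {
    private static final char[] TABLE = new char[256];

    static {
        Arrays.fill(TABLE, '?');       // anything not mapped below
        TABLE[0x40] = ' ';             // EBCDIC space
        fill(0xC1, "ABCDEFGHI");
        fill(0xD1, "JKLMNOPQR");
        fill(0xE2, "STUVWXYZ");
        fill(0x81, "abcdefghi");
        fill(0x91, "jklmnopqr");
        fill(0xA2, "stuvwxyz");
        fill(0xF0, "0123456789");
    }

    // Maps consecutive EBCDIC code points, starting at 'start', to 'chars'.
    private static void fill(int start, String chars) {
        for (int i = 0; i < chars.length(); i++) {
            TABLE[start + i] = chars.charAt(i);
        }
    }

    static String toAscii(byte[] field) {
        StringBuilder sb = new StringBuilder(field.length);
        for (byte b : field) {
            sb.append(TABLE[b & 0xFF]);
        }
        return sb.toString();
    }
}
```

Only PIC X fields (and unsigned display-numeric fields) should be passed through this table; binary and packed fields need their own decoding.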
COMP is a binary format, stored big-endian on the mainframe. It's usually 2, 4, or 8 bytes, depending on the number of digits in the PIC clause.
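Decoding a COMP field is mostly a matter of reading the bytes as an integer. The sketch below assumes a signed field stored as big-endian two's complement, which is how IBM mainframes store it; an unsigned field (no `S` in the PIC clause) or an implied decimal point would need extra handling. The class name is hypothetical.

```java
import java.math.BigInteger;

/** Hypothetical helper: decodes a signed, big-endian COMP (binary) field. */
final class CompField {

    static long decode(byte[] field) {
        if (field.length == 0 || field.length > 8) {
            throw new IllegalArgumentException("expected 1 to 8 bytes, got " + field.length);
        }
        // BigInteger(byte[]) reads big-endian two's complement, so the sign is preserved.
        return new BigInteger(field).longValue();
    }
}
```

The resulting number can then be written out as ASCII digits with `Long.toString`.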
COMP-3 is a packed decimal format: each byte holds two decimal digits, one per half-byte (nibble), except the last byte, whose low-order nibble is the sign. For example, 124 looks like x'124F'. In the sign nibble, x'C' is positive, x'D' is negative, and x'F' is unsigned (treated as positive).
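Unpacking a COMP-3 field can be sketched as follows. The code assumes the usual IBM layout: two digits per byte, with the low-order nibble of the last byte as the sign (x'C' and x'F' non-negative, x'D' negative; x'A', x'E' and x'B' are accepted as alternate positive and negative codes). The `scale` parameter, the number of implied decimal places from the copybook, and the class name are illustrative assumptions. A `long` accumulator limits this to 18 digits.

```java
import java.math.BigDecimal;

/** Hypothetical helper: unpacks a COMP-3 (packed decimal) field. */
final class PackedDecimal {

    static BigDecimal unpack(byte[] field, int scale) {
        long value = 0;
        for (int i = 0; i < field.length; i++) {
            int hi = (field[i] >> 4) & 0x0F;
            int lo = field[i] & 0x0F;
            value = value * 10 + digit(hi);
            if (i < field.length - 1) {
                value = value * 10 + digit(lo);   // ordinary digit nibble
            } else if (lo == 0x0D || lo == 0x0B) {
                value = -value;                   // negative sign nibble
            } else if (lo < 0x0A) {
                throw new IllegalArgumentException("invalid sign nibble: " + lo);
            }
        }
        return BigDecimal.valueOf(value, scale);
    }

    private static int digit(int nibble) {
        if (nibble > 9) {
            throw new IllegalArgumentException("invalid digit nibble: " + nibble);
        }
        return nibble;
    }
}
```

For the example above, `PackedDecimal.unpack(new byte[] {0x12, 0x4F}, 0)` yields 124, and the same bytes ending in x'4D' yield -124. An invalid-nibble exception is often a sign that the file was translated to ASCII during transfer.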
Upvotes: 2
Reputation: 2745
You need to treat each field by itself. You must not run the COMP fields through a character-set conversion, because they are binary data, not text. You convert the character fields with the method provided by the language of your choice.
Upvotes: 1