Ahmed Sharaf

Reputation: 29

Cobol data files

First, let me apologize if the information here is incomplete. That is not laziness on my part; I simply don't know the details of COBOL.

My firm has assigned me to extract our old financial data from files that are read by COBOL programs and load it into our Oracle database. I cannot read these files as normal text, and I don't know how to convert them to normal text.

According to the COBOL source, each block contains 7 records and each record is 73 characters.

The files are very large, about 3 GB each on average. How can I open them as normal text?

Here are the ENVIRONMENT DIVISION and the FILE SECTION from the source:

000220 ENVIRONMENT DIVISION.
000230 CONFIGURATION SECTION.
000240 SOURCE-COMPUTER. NCR-3000.
000250 OBJECT-COMPUTER. NCR-3000.
000260 INPUT-OUTPUT SECTION.
000270 FILE-CONTROL.
000280     SELECT DQ-HIMVT-A      ASSIGN TO DISC
000290                            ORGANIZATION INDEXED
000300                            ACCESS MODE DYNAMIC
000310                            RECORD KEY CLE-A.
000320*
000330 DATA DIVISION.
000340 FILE SECTION.
000350 FD  DQ-HIMVT-A             BLOCK CONTAINS 7 RECORDS   
000360                            RECORD CONTAINS 73 CHARACTERS   
000370                            LABEL RECORD STANDARD
000380                            DATA RECORD IS HIMVT-A.   
000390 01  HIMVT-A. 
000400     02  CLE-A.
000410         03  ENT-A       PIC 99.
000420         03  NUCPT-A     PIC 9(13)     COMP-6.
000430         03  DEV-A       PIC XXX.
000440         03  DATOP-A     PIC 9(7)      COMP-6. 
000450         03  SIG-A       PIC 9.  
000460         03  FORC-A      PIC 9.
000470         03  DATVAL-A    PIC 9(7)      COMP-6.
000480         03  NUMOP-A     PIC 9(9)      COMP-6.  
000490         03  MT-A        PIC 9(12)V999 COMP-6. 
000500     02  FILLER          PIC X(8).
000510     02  TYPCPT-A        PIC 9(3)      COMP-6.
000520     02  LIBOP-A         PIC X(15).
000530     02  SOLD-A          PIC S9(12)V999 COMP-3. 
000540     02  DATTRAIT-A      PIC 9(7) COMP-6.
000550     02  FILLER          PIC X.

Here is a sample of the file when opened in Notepad++:

RMKF I I 0 ** ƒ ’ *B9 *B9 ’ ’ ÿ # "c *B9 Þ #01 EGP %10 % ƒ 21 $ '10 ' (@P )€ 010 0 0 EGP $21 $ %11 $ (EGP $21 $ %11 $ 7EGP $21 $ %11 $ FEGP $21 $ %11 $ UEGP $21 $ %11 $ ` ÿÿÿÿÿÿÿÿÿÿÿÿÿÿ >01 ÔEGP %10 % ÔƒÖ 21Â NO. 0 ÄÕ

I also found this file, which they call a copybook. I don't know how it is related:

000100*
000200****     CINVDAT - ZONE DE TRAVAIL     ****
000300*******************************************
000400****
000500*
000600 01  INVDATRAV.
000700     03  INVZON1         PIC 99. 
000800     03  INVZON2         PIC 99.
000900     03  INVZON3         PIC 99.
001000 01  INVZONI             PIC 99.
001100 01  INVDATE             PIC 9(6).
001200 01  INVCAL              PIC 9.
001300*

Regards

Upvotes: 1

Views: 2700

Answers (2)

YJ_Chan

Reputation: 23

I'm not sure which system you are using. In my experience on the AS400, COBOL data files are stored in EBCDIC, so they cannot be opened directly in a text editor; the editor only shows what looks like random characters. You have to convert the data to ASCII before you export it. On the AS400, I use CHGTOPCD with the file/member name to copy it to a directory and export it; after that, the text shows correctly. I'm not sure whether this information helps you.
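If the text fields do turn out to be EBCDIC, the conversion can also be done off the host. The following is only a minimal Python sketch: the code page `cp037` is an assumption (the real code page depends on the source system), and `ebcdic_to_text` and `looks_like_ascii` are hypothetical helper names. Note that the sample in the question shows a readable "EGP", which suggests the text fields may already be ASCII, so it is worth checking before converting anything.

```python
# ASSUMPTION: cp037 (a common US/Canada EBCDIC code page) is only a guess;
# the correct code page depends on the system that wrote the file.
EBCDIC_CODEC = "cp037"


def ebcdic_to_text(raw: bytes, codec: str = EBCDIC_CODEC) -> str:
    """Decode the bytes of a character (PIC X / display) field from EBCDIC."""
    return raw.decode(codec)


def looks_like_ascii(raw: bytes) -> bool:
    """Rough check: True if every byte is a printable ASCII character."""
    return all(0x20 <= b <= 0x7E for b in raw)


if __name__ == "__main__":
    sample = bytes([0xC5, 0xC7, 0xD7])   # the letters E, G, P in cp037
    print(looks_like_ascii(sample))      # False, so probably not ASCII
    print(ebcdic_to_text(sample))
```

Only character fields should go through a conversion like this. Packed or binary fields (the COMP-3 and COMP-6 items in the FD) are not text, and a character-set translation would corrupt them.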

Upvotes: 2

Bill Woodger

Reputation: 13076

You may be able to locate a service which can do the extract for you. If you go this route, ensure that they have all the information you can provide (which must include the data-definitions under the FD) and agree to only pay on verified receipt of the data.

An alternative is to talk to Micro Focus about a short-term license for a COBOL compiler which can understand the indexed-file format (again, this must be guaranteed). You then write one simple program per file whose data you need to extract. The advantage here is that you don't need to know what COMP-3 and COMP-6 represent, as the conversion to a "text" number is done without anyone having to think about it: on the output definition, you remove all references to COMP-anything (and to plain COMP, if there happen to be any).

A further alternative is to sit down with a hex editor, knowledge of the data, and work out how to abstract the index information away from the data (all the data records are a known, fixed, length, 73 bytes in your example).
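As a sanity check for the hex-editor route, the field sizes in the FD can be added up and compared with the declared record length. The sketch below is Python and rests on two assumptions that are not confirmed anywhere in the question: that COMP-6 stores two digits per byte with no sign, and that COMP-3 stores two digits per byte plus a trailing sign nibble. Under those assumptions the fields add up to exactly 73 bytes, which is at least consistent with the FD. The names `FILLER-1` and `FILLER-2`, and all function names, are made up for illustration.

```python
RECORD_LENGTH = 73  # from "RECORD CONTAINS 73 CHARACTERS" in the FD


def comp6_size(digits: int) -> int:
    """ASSUMPTION: COMP-6 is unsigned packed, two digits per byte."""
    return (digits + 1) // 2


def comp3_size(digits: int) -> int:
    """ASSUMPTION: COMP-3 is packed with one extra nibble for the sign."""
    return (digits + 2) // 2


# (field name, size in bytes), in the order given under the FD.
FIELDS = [
    ("ENT-A", 2),
    ("NUCPT-A", comp6_size(13)),
    ("DEV-A", 3),
    ("DATOP-A", comp6_size(7)),
    ("SIG-A", 1),
    ("FORC-A", 1),
    ("DATVAL-A", comp6_size(7)),
    ("NUMOP-A", comp6_size(9)),
    ("MT-A", comp6_size(15)),
    ("FILLER-1", 8),
    ("TYPCPT-A", comp6_size(3)),
    ("LIBOP-A", 15),
    ("SOLD-A", comp3_size(15)),
    ("DATTRAIT-A", comp6_size(7)),
    ("FILLER-2", 1),
]


def field_offsets(fields):
    """Map each field name to its (start, end) byte offsets in the record."""
    offsets, position = {}, 0
    for name, size in fields:
        offsets[name] = (position, position + size)
        position += size
    return offsets


OFFSETS = field_offsets(FIELDS)


def split_record(record: bytes) -> dict:
    """Slice one 73-byte data record into its raw, still undecoded fields."""
    if len(record) != RECORD_LENGTH:
        raise ValueError(f"expected {RECORD_LENGTH} bytes, got {len(record)}")
    return {name: record[start:end] for name, (start, end) in OFFSETS.items()}


def iter_records(data: bytes):
    """Yield 73-byte records from a buffer that holds ONLY data records.

    Separating the index information from the data still has to be
    worked out first; this function does not attempt it.
    """
    for start in range(0, len(data) - RECORD_LENGTH + 1, RECORD_LENGTH):
        yield data[start:start + RECORD_LENGTH]


if __name__ == "__main__":
    print(sum(size for _, size in FIELDS))  # total bytes under the assumptions
    print(OFFSETS["LIBOP-A"])               # where the 15-char label would sit
```

If the readable text in a hex dump (for example the three-letter DEV-A value) does not land at the offsets this predicts, one of the size assumptions is wrong.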

Then, using your preferred language that can handle non-delimited (fixed-length) binary records, work out what COMP-3, COMP-6, and any other COMP- (or COMP) fields mean and convert them. They will likely be packed-decimal, Binary Coded Decimal (BCD) or "some type of binary", given that Standard COBOL has binary fields limited by decimal values (to the size of the PICture clause).
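To make that concrete, here is a rough Python sketch of decoding the two kinds of packed field. It assumes, without confirmation from the compiler documentation, that COMP-6 is unsigned BCD (two digits per byte) and that COMP-3 is packed decimal with the sign in the last nibble (0xD or 0xB for negative). The function names are hypothetical, and the `scale` argument stands for the number of digits after the `V` in the PICture.

```python
from decimal import Decimal


def decode_comp6(raw: bytes, scale: int = 0) -> Decimal:
    """ASSUMPTION: COMP-6 is unsigned BCD, two decimal digits per byte."""
    digits = []
    for byte in raw:
        high, low = byte >> 4, byte & 0x0F
        if high > 9 or low > 9:
            raise ValueError(f"not a BCD byte: {byte:#04x}")
        digits.append(str(high))
        digits.append(str(low))
    return Decimal(int("".join(digits))).scaleb(-scale)


def decode_comp3(raw: bytes, scale: int = 0) -> Decimal:
    """ASSUMPTION: COMP-3 is packed decimal, sign in the final nibble."""
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    sign = nibbles.pop()                      # last nibble carries the sign
    if sign < 0x0A or any(n > 9 for n in nibbles):
        raise ValueError(f"not packed decimal: {raw.hex()}")
    value = int("".join(str(n) for n in nibbles))
    if sign in (0x0B, 0x0D):                  # conventional negative signs
        value = -value
    return Decimal(value).scaleb(-scale)


if __name__ == "__main__":
    # MT-A is PIC 9(12)V999 COMP-6, so it would be decoded with scale=3.
    print(decode_comp6(bytes.fromhex("0000000000012345"), scale=3))
    # SOLD-A is PIC S9(12)V999 COMP-3, also with scale=3.
    print(decode_comp3(bytes.fromhex("000000000012345D"), scale=3))
```

Both functions raise `ValueError` on bytes that do not fit the assumed format, which is useful here: if real records fail to decode, either the assumed format or the assumed field offsets are wrong.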

In the first and second alternatives, there is a greater expectation of a reliable extract. The third may be the "cheapest", but estimates of the time needed to complete it are harder to stick to.

Of the first two, cost is the likely determinant (assuming you are not going to use COBOL going forward). If you yourself have to write some COBOL programs, don't worry about that: they are very, very simple, and once you have done one, you simply "clone" it.

Upvotes: 6
