Reputation: 349
In COBOL, I am reading a line sequential file line by line until EOF, something like this:
read bank-file
    at end move 'Y' to end-of-bank
end-read
The lines vary in length from 40 to 80 characters, and I need to know how many characters are on each line. A line can end with spaces, which I need to count too, so I can't derive the length from the contents of the variable in the program. Does the READ statement return a value giving the number of characters in the line just read (up to the CRLF)?
Upvotes: 1
Views: 3746
Reputation: 1
Just in case you still don't know how many bytes you have, try this:
The wonderful thing about COBOL on Unix/Linux/PCs is that, for the most part, the runtime does not check the file structure. It assumes you were bright enough to tell the program what the file is, and in the case of a complicated file, such as an MF COBOL file with an embedded B-tree index, the file header does the rest.
My first exposure to MF COBOL had users ending up with corrupt files all the time, and we needed a way to find out quickly what was wrong. I leveraged this fact and parsed the files looking for certain features, such as an x'0A' (UNIX) or a CR/LF, which would tell us that someone had FTP'd a file from a PC to Linux using binary transfer. It did exactly what we had hoped, and we eventually released it as an end-user utility.
Based on this, you COULD just tell the program that the file has 1-byte records and read each byte as binary sequential. This lets you count the bytes as they go by. Change the file definition to BINARY SEQUENTIAL with a record of PIC X(01). Since you state that the record terminator is CR/LF, you will need a 2-byte field for pattern recognition, and you will need to reduce the byte count by the size of the delimiters.
    SELECT SOME-FILE
        ASSIGN TO "someFile.txt"
        ORGANIZATION IS BINARY SEQUENTIAL.

DATA DIVISION.
FILE SECTION.
FD  SOME-FILE.
01  SOME-BYTE            PIC X(01).

WORKING-STORAGE SECTION.
01  PATTERN-BUFFER.
    05  PB-01            PIC X(01) VALUE SPACE.
    05  PB-02            PIC X(01) VALUE SPACE.
01  BYTE-COUNT           PIC 9(9) VALUE ZERO.
01  END-OF-SOME-FILE     PIC X(01) VALUE "N".

PROCEDURE DIVISION.
MAIN.
    OPEN INPUT SOME-FILE
    PERFORM UNTIL END-OF-SOME-FILE = "Y"
*>      Each READ returns one 1-byte record.
        READ SOME-FILE
            AT END
                MOVE "Y" TO END-OF-SOME-FILE
            NOT AT END
                ADD 1 TO BYTE-COUNT
*>              Keep the previous and the current byte for pattern matching.
                MOVE PB-02 TO PB-01
                MOVE SOME-BYTE TO PB-02
                IF PATTERN-BUFFER = X'0D0A'
*>                  CR/LF is a delimiter, not data.
                    SUBTRACT 2 FROM BYTE-COUNT
                ELSE
*>                  Null flag byte; see the note below.
                    IF PB-01 = X'00' AND PB-02 < X'20'
                        SUBTRACT 1 FROM BYTE-COUNT
                    END-IF
                END-IF
        END-READ
    END-PERFORM
    CLOSE SOME-FILE
    DISPLAY "BYTE-COUNT: " BYTE-COUNT
    STOP RUN.
MF COBOL can optionally do two things to LINE SEQUENTIAL files that can mess with your count.
The first is to remove all trailing blanks, but according to your spec this should be fine, since you want the number of bytes actually stored.
The second is flagging characters that may, in certain conditions, be misinterpreted. This is especially true of carriage control characters, which may look like a binary integer value. If MF COBOL sees a value less than the ASCII value of a space, it places a binary 0 in a flag byte before it. This flag byte takes up space in the file but is not data. It is a file structure marker and would not normally appear in your output count, but because we made the file binary sequential, it is not being removed from the read at runtime. So if you see a LOW-VALUE (x'00') followed by a character with a value less than x'20', reduce your output byte count by 1.
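Since the question asks for the length of each line rather than a total, the same byte-by-byte idea can be adapted to report a count per line. The following is only a sketch under assumptions: it reuses the file definition from above, assumes every line ends in CR/LF, assumes your compiler accepts ORGANIZATION IS BINARY SEQUENTIAL and `*>` comments, and introduces the hypothetical names BYTELEN, LINE-BYTE-COUNT and COUNT-ONE-BYTE for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. BYTELEN.

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT SOME-FILE
               ASSIGN TO "someFile.txt"
               ORGANIZATION IS BINARY SEQUENTIAL.

       DATA DIVISION.
       FILE SECTION.
       FD  SOME-FILE.
       01  SOME-BYTE               PIC X(01).

       WORKING-STORAGE SECTION.
       01  PATTERN-BUFFER.
           05  PB-01               PIC X(01) VALUE SPACE.
           05  PB-02               PIC X(01) VALUE SPACE.
      *> LINE-BYTE-COUNT is a hypothetical name, not from the answer.
       01  LINE-BYTE-COUNT         PIC 9(9) VALUE ZERO.
       01  END-OF-SOME-FILE        PIC X(01) VALUE "N".

       PROCEDURE DIVISION.
       MAIN.
           OPEN INPUT SOME-FILE
           PERFORM UNTIL END-OF-SOME-FILE = "Y"
               READ SOME-FILE
                   AT END
                       MOVE "Y" TO END-OF-SOME-FILE
                   NOT AT END
                       PERFORM COUNT-ONE-BYTE
               END-READ
           END-PERFORM
      *> A final line without a CR/LF terminator is reported here.
           IF LINE-BYTE-COUNT > 0
               DISPLAY LINE-BYTE-COUNT
           END-IF
           CLOSE SOME-FILE
           STOP RUN.

       COUNT-ONE-BYTE.
           ADD 1 TO LINE-BYTE-COUNT
           MOVE PB-02 TO PB-01
           MOVE SOME-BYTE TO PB-02
           IF PATTERN-BUFFER = X'0D0A'
      *>       The two terminator bytes are not part of the line.
               SUBTRACT 2 FROM LINE-BYTE-COUNT
               DISPLAY LINE-BYTE-COUNT
               MOVE ZERO TO LINE-BYTE-COUNT
           ELSE
               IF PB-01 = X'00' AND PB-02 < X'20'
      *>           Drop the null flag byte described above.
                   SUBTRACT 1 FROM LINE-BYTE-COUNT
               END-IF
           END-IF.
```

Trailing spaces are counted like any other byte, because nothing is stripped when the file is read one byte at a time.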
Upvotes: 0
Reputation: 22977
As mentioned in the comments, it actually is possible to get the number of characters (bytes) read, using the RECORD VARYING DEPENDING ON clause:
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
    SELECT SOME-FILE
        ASSIGN TO "someFile.txt"
        ORGANIZATION IS LINE SEQUENTIAL.

DATA DIVISION.
FILE SECTION.
FD  SOME-FILE
    RECORD VARYING 40 TO 80 DEPENDING ON SOME-LINE-LENGTH.
01  SOME-LINE            PIC X(80).

WORKING-STORAGE SECTION.
77  SOME-LINE-LENGTH     PIC 9(3).
Now, for each read, the record length is stored in SOME-LINE-LENGTH:
READ SOME-FILE NEXT RECORD
DISPLAY SOME-LINE-LENGTH
I don't know exactly which vendors support it (possibly almost all), but at least it works with ACUCOBOL.
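Put together, the fragments above could form a complete program along these lines. This is a minimal sketch in fixed format, not a tested reference implementation: the program name LINELEN, the FILE STATUS item SOME-FILE-STATUS and the END-OF-SOME-FILE flag are hypothetical additions for illustration, and how a compiler treats lines shorter than 40 or longer than 80 bytes is something to check against your vendor's documentation.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LINELEN.

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT SOME-FILE
               ASSIGN TO "someFile.txt"
               ORGANIZATION IS LINE SEQUENTIAL
               FILE STATUS IS SOME-FILE-STATUS.

       DATA DIVISION.
       FILE SECTION.
       FD  SOME-FILE
           RECORD VARYING 40 TO 80 DEPENDING ON SOME-LINE-LENGTH.
       01  SOME-LINE               PIC X(80).

       WORKING-STORAGE SECTION.
       77  SOME-LINE-LENGTH        PIC 9(3).
      *> The two items below are hypothetical additions.
       77  SOME-FILE-STATUS        PIC X(2).
       77  END-OF-SOME-FILE        PIC X(1) VALUE "N".

       PROCEDURE DIVISION.
       MAIN.
           OPEN INPUT SOME-FILE
           IF SOME-FILE-STATUS NOT = "00"
               DISPLAY "OPEN FAILED, STATUS " SOME-FILE-STATUS
               STOP RUN
           END-IF
           PERFORM UNTIL END-OF-SOME-FILE = "Y"
               READ SOME-FILE NEXT RECORD
                   AT END
                       MOVE "Y" TO END-OF-SOME-FILE
                   NOT AT END
      *>               Length of the line just read, set by the READ.
                       DISPLAY SOME-LINE-LENGTH
               END-READ
           END-PERFORM
           CLOSE SOME-FILE
           STOP RUN.
```

Because the length comes from the READ itself rather than from the record contents, trailing spaces stored in the file should be included in the count, provided the runtime is not configured to strip them.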
As far as I know, there is no feedback on the number of bytes read by the execution of the READ statement. Apparently, bytes are stored directly into the record described by the file descriptor in your FILE SECTION.
However, you can calculate the number of bytes read by counting the number of characters written to the record.
First, initialize the file record to LOW-VALUES. Then read the next record, which moves the bytes read into the record. When the number of bytes read is smaller than the record size, the bytes at the end of the record are left unchanged.
MOVE LOW-VALUES TO YOUR-RECORD
READ YOUR-FILE NEXT RECORD
*> Scan backwards from the record size (80 assumed here) to the last byte written.
PERFORM VARYING SOME-COUNTER FROM 80 BY -1 UNTIL SOME-COUNTER < 1
    IF YOUR-RECORD(SOME-COUNTER : 1) NOT = LOW-VALUES
        EXIT PERFORM
    END-IF
END-PERFORM
SOME-COUNTER will contain the line length, assuming no NUL values are present in the file.
I guess this will be time-consuming when the number of lines is large, but at least you get your line lengths.
As Bill Woodger already mentioned, since you didn't provide additional details, I had to make some assumptions.
I'm running MicroFocus ACUCOBOL-GT on Windows 10 myself.
Upvotes: 1