estimate the size of a binary file

Question

I am very confused about this. I dump out a 3D array of size 16000*4*2, all elements of which are DOUBLE PRECISION, and I thought I should have gotten a file of size: 16000*4*2*8 bytes/dp = 1,024,000 bytes. But I keep getting 2,048,000 bytes.

And I tested with a simple test program:

PROGRAM testprog
  IMPLICIT NONE
  DOUBLE PRECISION :: x=0.0D0
  INTEGER :: i

  OPEN(UNIT=128,FILE='try.out',FORM='UNFORMATTED',ACCESS='SEQUENTIAL')

  DO i=1,16000*4*2
    WRITE(128) x
  ENDDO

  CLOSE(128)

ENDPROGRAM testprog

And run it with these commands:

gfortran f.f90 -o a 
./a 
ls -als try.out

and what I get is

2000 -rw-r--r-- 1 jipuwang umstudents 2048000 Dec 17 20:16 try.out

I cannot make sense of it. One double precision uses 8 bytes, right?

I did something else, if someone could also help me understand this:

PROGRAM testprog
  IMPLICIT NONE
  DOUBLE PRECISION :: x=0.0D0
  INTEGER :: i

  OPEN(UNIT=128,FILE='try.out')

  DO i=1,2
     WRITE(128,*) x
  ENDDO

  CLOSE(128)

ENDPROGRAM testprog

It gives me a file of size 54 bytes.

IanH · Accepted Answer

An unformatted sequential file, which is what you are using, is a record-oriented file format. For every record, some book-keeping fields are typically written to the file so that the processor can navigate from record to record. The details vary from compiler to compiler, but there is a reasonable amount of consistency across compilers.

Because you invoke WRITE 16000*4*2 times, you are writing 16000*4*2 records. You will incur 16000*4*2 lots of per-record overhead. Eight bytes of overhead (four bytes of length leading and trailing the record data) per record is typical.
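As a rough check of that arithmetic, here is a minimal sketch that estimates the file size from the record count. The names `nrec`, `data_bytes` and `marker_bytes` are only illustrative, and the 8-byte value size and 4-byte markers are assumptions about a typical compiler setup, not guarantees.

```fortran
PROGRAM estimate
  IMPLICIT NONE
  INTEGER, PARAMETER :: nrec = 16000*4*2   ! one record per WRITE statement
  INTEGER, PARAMETER :: data_bytes = 8     ! assumed size of one double precision value
  INTEGER, PARAMETER :: marker_bytes = 4   ! assumed size of each record-length marker
  INTEGER :: total

  ! Each record carries its data plus a leading and a trailing length marker.
  total = nrec * (data_bytes + 2*marker_bytes)
  PRINT *, 'estimated file size in bytes:', total
ENDPROGRAM estimate
```

Under those assumptions the estimate is 128000 * 16 = 2048000 bytes, which matches the size reported by ls in the question.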

If that overhead is problematic, then consider writing more values per record. You could even put the entire data set in a single record.
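A minimal sketch of the single-record approach, assuming the data lives in an array shaped like the one described in the question. The program name, the array name `a` and the file name `try_one.out` are made up for illustration. NEWUNIT= and INQUIRE(SIZE=) need a Fortran 2003/2008 capable compiler.

```fortran
PROGRAM onerecord
  IMPLICIT NONE
  DOUBLE PRECISION :: a(16000,4,2)
  INTEGER :: u, fsize

  a = 0.0D0

  OPEN(NEWUNIT=u, FILE='try_one.out', FORM='UNFORMATTED', &
       ACCESS='SEQUENTIAL', STATUS='REPLACE')
  WRITE(u) a          ! one WRITE statement, so the whole array is one record
  CLOSE(u)

  ! SIZE= reports the file size in file storage units (usually bytes).
  INQUIRE(FILE='try_one.out', SIZE=fsize)
  PRINT *, 'file size in bytes:', fsize
ENDPROGRAM onerecord
```

With eight-byte values and eight bytes of per-record overhead, you would expect this to report something close to 1,024,008 bytes: the 1,024,000 bytes of data plus one set of record markers. The exact figure still depends on the compiler.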

A double precision variable typically occupies eight bytes, though it will vary by platform, compiler and compile options.
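If you want to confirm the size on your own platform, a small sketch along these lines should do it. STORAGE_SIZE is a Fortran 2008 intrinsic that returns the size in bits, so this assumes a reasonably recent compiler.

```fortran
PROGRAM dpsize
  IMPLICIT NONE
  DOUBLE PRECISION :: x = 0.0D0

  ! STORAGE_SIZE returns the storage size in bits; divide by 8 for bytes.
  PRINT *, 'bytes per double precision:', STORAGE_SIZE(x) / 8
ENDPROGRAM dpsize
```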
