user2008151314

Reputation: 680

Estimate the size of a binary file

I am very confused about this. I dump out a 3D array of size 16000*4*2, all elements of which are DOUBLE PRECISION, and I thought I should have gotten a file of size: 16000*4*2*8 bytes/dp = 1,024,000 bytes. But I keep getting 2,048,000 bytes.

And I tested with a simple test program:

PROGRAM testprog
  IMPLICIT NONE
  DOUBLE PRECISION :: x=0.0D0
  INTEGER :: i

  OPEN(UNIT=128,FILE='try.out',FORM='UNFORMATTED',ACCESS='SEQUENTIAL')

  DO i=1,16000*4*2
    WRITE(128) x
  ENDDO

  CLOSE(128)

ENDPROGRAM testprog

I ran it with these commands:

gfortran f.f90 -o a 
./a 
ls -als try.out

and what I get is

2000 -rw-r--r-- 1 jipuwang umstudents 2048000 Dec 17 20:16 try.out

I cannot make sense of it. One double precision value uses 8 bytes, right?

I also tried something else, and would appreciate help understanding it:

PROGRAM testprog
  IMPLICIT NONE
  DOUBLE PRECISION :: x=0.0D0
  INTEGER :: i

  OPEN(UNIT=128,FILE='try.out')

  DO i=1,2
     WRITE(128,*) x
  ENDDO

  CLOSE(128)

ENDPROGRAM testprog

It gives me a file of size 54 bytes.

Upvotes: 1

Views: 861

Answers (2)

deepak

Reputation: 2075

You want to dump an array, so I am going to assume you have a 3D array. Your program then becomes:

PROGRAM testprog
  IMPLICIT NONE
  DOUBLE PRECISION :: x(16000,4,2)

  x = 0.0D0   ! give the array defined values before writing it

  OPEN(UNIT=128,FILE='try.out',FORM='UNFORMATTED',ACCESS='SEQUENTIAL')
  WRITE(128) x   ! the whole array goes into a single record
  CLOSE(128)

ENDPROGRAM testprog

Now the file will be 4 + 16000*4*2*8 + 4 = 1,024,008 bytes long. The first and the last 4 bytes are record markers: they mark the start and the end of the single record that holds the array.
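One way to check that figure is to ask the runtime for the file size once the file is closed. This is only a sketch: it assumes a compiler that supports the Fortran 2003 SIZE= specifier of INQUIRE and the STORAGE_SIZE intrinsic, and it assumes 4-byte record markers (the gfortran default). The program and variable names are made up for illustration.

```fortran
PROGRAM checksize
  IMPLICIT NONE
  DOUBLE PRECISION :: x(16000,4,2)
  INTEGER :: nbytes, expected

  x = 0.0D0

  ! Write the whole array as a single unformatted record.
  OPEN(UNIT=128,FILE='try.out',FORM='UNFORMATTED',ACCESS='SEQUENTIAL', &
       STATUS='REPLACE')
  WRITE(128) x
  CLOSE(128)

  ! SIZE= returns the file size in file storage units (bytes on most systems).
  INQUIRE(FILE='try.out',SIZE=nbytes)

  ! Data bytes plus an ASSUMED 4-byte marker before and after the record.
  expected = SIZE(x)*STORAGE_SIZE(x)/8 + 4 + 4

  PRINT *, 'actual size:  ', nbytes
  PRINT *, 'expected size:', expected
END PROGRAM checksize
```

If the two numbers differ, the compiler is probably using a different marker size or a different size for DOUBLE PRECISION than assumed here.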

Upvotes: 2

IanH

Reputation: 21431

An unformatted sequential file, which is what you are using, is a record-oriented file format. For every record, some bookkeeping fields are typically written to the file so that the processor can navigate from record to record. Details vary from compiler to compiler, but there is a reasonable amount of consistency across compilers.

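Because you invoke WRITE 16000*4*2 times, you are writing 16000*4*2 = 128,000 records, and you incur the per-record overhead 128,000 times. Eight bytes of overhead per record (a four-byte length field before and after the record data) is typical. With 8 bytes of data plus 8 bytes of overhead, each record takes 16 bytes, which gives 128,000 * 16 = 2,048,000 bytes: exactly the file size you observed.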
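To see the bookkeeping fields directly, you can write a few one-value records and then reopen the same file as a raw byte stream. The sketch below assumes the common layout of a 4-byte length, the data, then a 4-byte length (what gfortran writes by default); other compilers or options may lay records out differently. The file name records.out and the variable names are illustrative.

```fortran
PROGRAM peekmarker
  USE, INTRINSIC :: ISO_FORTRAN_ENV, ONLY: INT32
  IMPLICIT NONE
  DOUBLE PRECISION :: x = 0.0D0, y
  INTEGER(INT32) :: lead, trail
  INTEGER :: i

  ! Write three one-value records, as the loop in the question does.
  OPEN(UNIT=128,FILE='records.out',FORM='UNFORMATTED',ACCESS='SEQUENTIAL', &
       STATUS='REPLACE')
  DO i = 1, 3
    WRITE(128) x
  ENDDO
  CLOSE(128)

  ! Reopen the same file as a raw byte stream and read the first record
  ! by hand. ASSUMED layout: 4-byte marker, 8-byte value, 4-byte marker.
  OPEN(UNIT=128,FILE='records.out',FORM='UNFORMATTED',ACCESS='STREAM', &
       STATUS='OLD')
  READ(128) lead, y, trail
  CLOSE(128)

  PRINT *, 'leading marker: ', lead
  PRINT *, 'trailing marker:', trail
END PROGRAM peekmarker
```

Under that assumption both markers should print 8, the length in bytes of the record data they enclose.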

If that overhead is problematic, then consider writing more values per record. You could even put the entire data set in a single record.
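As a middle ground between one value per record and one record for everything, you could write one record per column of the array. This is a minimal sketch, assuming the 16000*4*2 array from the question and 8 bytes of overhead per record; the file name chunked.out is made up.

```fortran
PROGRAM chunked
  IMPLICIT NONE
  DOUBLE PRECISION :: x(16000,4,2)
  INTEGER :: j, k, nbytes

  x = 0.0D0

  OPEN(UNIT=128,FILE='chunked.out',FORM='UNFORMATTED',ACCESS='SEQUENTIAL', &
       STATUS='REPLACE')
  ! One record per column x(:,j,k): 4*2 = 8 records of 16000 values each.
  DO k = 1, 2
    DO j = 1, 4
      WRITE(128) x(:,j,k)
    ENDDO
  ENDDO
  CLOSE(128)

  ! SIZE= is a Fortran 2003 specifier; it reports bytes on most systems.
  INQUIRE(FILE='chunked.out',SIZE=nbytes)
  PRINT *, 'file size:', nbytes
END PROGRAM chunked
```

With 8 records the overhead should shrink to 8 * 8 = 64 bytes on top of the 1,024,000 bytes of data, and each column can still be read back on its own with one READ per record.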

A double precision variable typically occupies eight bytes, though it will vary by platform, compiler and compile options.
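If you want to confirm the size on your own platform rather than assume it, the STORAGE_SIZE intrinsic (Fortran 2008) reports the size of a type in bits. A small sketch, with illustrative names:

```fortran
PROGRAM dpsize
  IMPLICIT NONE
  DOUBLE PRECISION :: x
  REAL :: r

  ! STORAGE_SIZE returns the size in bits of a scalar of the given type;
  ! only the type of its argument matters, not its value.
  PRINT *, 'double precision bytes:', STORAGE_SIZE(x)/8
  PRINT *, 'default real bytes:    ', STORAGE_SIZE(r)/8
END PROGRAM dpsize
```

Compile it with the same options as your real program, since options such as gfortran's -fdefault-real-8 change the sizes of the default kinds.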

Upvotes: 8
