Nils

Reputation: 1

Read textfile with characters in legacy Fortran

In a project wrapping legacy Fortran code in R, a text file is read by the subroutine "rfort". A working, simplified version of the subroutine looks as follows:

      SUBROUTINE rfort()
      implicit none

      INTEGER I,IX,IY
      DIMENSION IX(10),IY(10)
      CHARACTER*6 NAME(10)

      OPEN(UNIT=8,FILE='TEST.DAT',STATUS='OLD')
      OPEN(UNIT=9,FILE='RESULT.DAT',STATUS='UNKNOWN')

      DO I=1,10
        READ(8,1020)IX(I),IY(I),NAME(I)
 1020   FORMAT(8X,2I8,A6)
        WRITE(9,1030)IX(I),IY(I),NAME(I)
 1030   FORMAT(8X,2I8,A6)
      ENDDO
      CLOSE (8)
      CLOSE (9)
      END

The text-file ("TEST.DAT") consists of four variables: a row identifier (ignored), two integer variables ("IX", "IY") and one character variable ("NAME").

       1     395    1232 1084
       2     415    1242 1024
       3     433    1253 125
       4     409    1204 1256
       5     427    1217 105
       6     446    1226 1253
       7     489    1239 1254
       8     560    1255 1260a
       9     720    1270 1067
      10     726    1293 1078d

While the subroutine compiles fine (on MacOS 10.11.6, R 3.5.0) with

R CMD SHLIB rfort.f

and can also be loaded in R with

dyn.load("rfort.so")

and runs without error with

.Fortran("rfort")

it strangely reads only the integer columns, as "RESULT.DAT" shows. The character column is ignored, whatever I try. The very same code works as expected as a stand-alone Fortran programme (compiled with gfortran 6.1.0), so I suspect it has something to do with the formatting. However, I am at my wits' end, so any help is appreciated!

Upvotes: 0

Views: 398

Answers (2)

Matt P

Reputation: 2367

In your example, it appears you want the output file to consist of the final 3 columns from TEST.DAT, but the output is not what you expect to see. You have two choices: 1) change the spacing in TEST.DAT to match the format statements, or 2) change the format statements to match the spacing in TEST.DAT.

Let's look at your format statements. The 1020 format says to skip the first 8 columns, read 2 integer types from the next 16 columns (8 columns for each int), and then a character type from the next 6 columns. For example, line 10 from TEST.DAT is read as follows:

TEST.DAT (line 10) with spacing illustrated:
       |       |       |     |
123456781234567812345678123456
  10     726    1293 1078d

As you can see, the value '726' is read into IX(10), but '1293107' is read into IY(10) (the embedded blank in columns 17-24 is ignored), and '8d' is read into NAME(10). Awesome, right, but not what you expected! Then, when the output is printed, numbers are right-aligned by default, while characters are left-aligned by default, so the last two columns in RESULT.DAT are printed without a blank space between them:

RESULT.DAT (line 10) with spacing illustrated:
       |       |       |     |
123456781234567812345678123456
             726 12931078d    
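
To check this slicing yourself, here is a minimal free-form sketch (not part of the original code) that reads the same line from a character variable instead of a file. The program name and the literal line are assumptions based on the spacing illustrated above; BN is added only to make the default blank handling explicit.

```fortran
program slice_line10
  implicit none
  character(len=30) :: line
  character(len=6)  :: name
  integer :: ix, iy

  ! Line 10 with the spacing illustrated above (assumed; check your real file).
  line = '  10     726    1293 1078d'

  ! Show the raw columns each edit descriptor consumes.
  print '(3A)', '8X, cols  1-8 : [', line(1:8),   ']'
  print '(3A)', 'I8, cols  9-16: [', line(9:16),  ']'
  print '(3A)', 'I8, cols 17-24: [', line(17:24), ']'
  print '(3A)', 'A6, cols 25-30: [', line(25:30), ']'

  ! Same edit descriptors as format 1020; BN spells out that blanks
  ! inside a numeric field are ignored.
  read(line, '(BN,8X,2I8,A6)') ix, iy, name

  print '(A,I0)', 'IX   = ', ix            ! 726
  print '(A,I0)', 'IY   = ', iy            ! 1293107
  print '(3A)',   'NAME = [', name, ']'    ! [8d    ]
end program slice_line10
```

If your real TEST.DAT is spaced differently from the line assumed here, the bracketed column dumps will show where each field actually falls.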

Here's my recommendation: change your read format so that it is much more forgiving and flexible. Simply replace the 1020 format label with *, which means that each item on the line (comma- or space-separated) is transferred in order into the corresponding variable in your I/O list. This is called list-directed input. Note that because the row number becomes part of the input list, you will need to declare an integer dummy_val (at the top of the subroutine), which you can then ignore. Now read each line using:

read(8, *) dummy_val, IX(i), IY(i), NAME(i)

You can do the same thing for your write statement: write(9,*) IX(i), IY(i), NAME(i), which will use a reasonable default field width and guarantee that a blank space exists between each item in the I/O list. If you want more control over how the output is formatted, continue to use a format statement, but change it so that a certain number of spaces are guaranteed to be placed between each item:

write(9, "(4x,I8,I8,1x,A6)") IX(i), IY(i), NAME(i)
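
Putting both suggestions together, a self-contained sketch could look like the following. It reads two sample lines from character variables rather than from unit 8, so it runs without TEST.DAT; the program name, the lines array and the sample spacing are assumptions for illustration only.

```fortran
program list_directed_demo
  implicit none
  character(len=40) :: lines(2)
  character(len=6)  :: name
  integer :: i, dummy_val, ix, iy

  ! Two sample records; the exact spacing no longer matters.
  lines(1) = '   9     720    1270 1067'
  lines(2) = '  10     726    1293 1078d'

  do i = 1, 2
    ! List-directed input: items are split on blanks (or commas).
    ! dummy_val absorbs the row identifier and is then ignored.
    read(lines(i), *) dummy_val, ix, iy, name

    ! Explicit format with 1X so the last two columns cannot touch.
    write(*, '(4X,I8,I8,1X,A6)') ix, iy, name
  end do
end program list_directed_demo
```

One caveat with this approach: list-directed input expects every item to be present on each line, and an unquoted character value must not contain blanks, commas or slashes. If NAME can be empty in your data, a fixed format matched to the real column positions is the safer choice.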

Upvotes: 0

Steve Lionel

Reputation: 7267

I think the 8X in your READ format should be 4X. Let's look at the first input line (I have added column numbers):

         1         2
1234567890123456789012345
   1     395    1232 1084

The format is 8X,2I8,A6. We skip columns 1-8, then read the first integer from columns 9-16, which is b395bbbb (b denotes a blank), and the second from columns 17-24, which is 1232b108. As you can see, part of the character data is being read as the second integer. The default of BLANK='NULL' means that embedded blanks are ignored (I assume you are not using a FORTRAN 66 compiler!), so the second integer becomes 1232108.
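
As a quick way to compare the two skips, here is a small free-form sketch that reads the line shown above from a character variable with both 8X and 4X. The program name and the literal line are assumptions taken from the spacing in this answer, and BN only makes the default blank handling explicit.

```fortran
program compare_skips
  implicit none
  character(len=30) :: line
  character(len=6)  :: name
  integer :: ix, iy

  ! First input line with the column positions shown above (assumed).
  line = '   1     395    1232 1084'

  ! Original format: the second I8 field swallows part of the name.
  read(line, '(BN,8X,2I8,A6)') ix, iy, name
  print '(A,I0,A,I0,3A)', '8X: IX=', ix, ' IY=', iy, ' NAME=[', name, ']'
  ! 8X: IX=395 IY=1232108 NAME=[4     ]

  ! Suggested format: each field lines up with its column.
  read(line, '(BN,4X,2I8,A6)') ix, iy, name
  print '(A,I0,A,I0,3A)', '4X: IX=', ix, ' IY=', iy, ' NAME=[', name, ']'
  ! 4X: IX=395 IY=1232 NAME=[ 1084 ]
end program compare_skips
```

Note that with 4X the name field still carries the leading blank from column 21; ADJUSTL(name) would remove it if that matters.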

Why you say it seems to work with gfortran, I don't know. Nor do I know why the results should be different depending on how you invoke the subroutine.

Upvotes: 1
