nanda

Reputation: 63

Reading a line containing comma-separated floats from file using Fortran

I have a text file containing comma-separated numbers like so:

757.76019287, 759.72045898, 760.97259521, 763.45477295, 765.99475098, 770.2713623

It is not known how many of these numbers are present in this file; it varies but is limited to a few hundred numbers.

The objective is to:

  1. Open this file (say, customwav.txt), find out how many numbers it contains, and store that count in an integer n.

  2. Allocate memory for these numbers into an array --> I already have subroutines that do this for me.

  3. Read the line of numbers into this allocated array.

What is the best way to do 1 and 3 in Fortran?

Upvotes: 2

Views: 2638

Answers (3)

High Performance Mark

Reputation: 78334

OP tells us the file contains a few hundred real numbers, neatly separated by commas. Here's a simple approach to reading those into an array of the right size. This is indifferent to the number of lines in the file.

First, declare an allocatable array, and an integer to receive the I/O status when the end of file is reached later:

real, dimension(:), allocatable :: numbers
integer :: ios

... allocate the array, with some overhead. Yes, I'm just going to read however many numbers are in the file in one gulp, I'm not going to try to figure out how many there are.

allocate(numbers(1000))

... set every value to a guard value whose utility will be obvious later; this does assume that the file won't contain the chosen guard value

numbers  = -huge(1.0)

... read the numbers; I assume the file is already open on unit inunit

read(inunit,*,iostat=ios) numbers

... at this point ios has a non-zero value, because the end of file was reached before all 1000 elements were filled. For the simple case outlined in the question there's no need to do anything with it: we've been told that there are only a few hundred numbers. And finally

numbers = pack(numbers, numbers>-huge(1.0))

to reallocate numbers to the right size. This relies on automatic reallocation on assignment, a Fortran 2003 feature.
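
Put together, the fragments above might look like the following minimal sketch. The program name, the `guard` parameter, and opening `customwav.txt` with `newunit=` are assumptions added for illustration (the answer itself assumes the file is already open). One caveat: the Fortran standard says input items become undefined when an end-of-file condition occurs, so this relies on the compiler leaving the already-read elements intact, which common compilers do in practice.

```fortran
program read_with_guard
  implicit none
  real, allocatable :: numbers(:)
  real, parameter   :: guard = -huge(1.0)   ! assumed never to occur in the file
  integer :: inunit, ios

  ! Assumption: the file name from the question; newunit= needs Fortran 2008.
  open(newunit=inunit, file='customwav.txt', status='old', action='read')

  allocate(numbers(1000))     ! generous upper bound on the count
  numbers = guard             ! mark every slot as "not read"

  ! List-directed read; hits end of file before the array is full,
  ! so ios comes back non-zero and the program carries on.
  read(inunit, *, iostat=ios) numbers
  close(inunit)

  ! Keep only the slots that were overwritten (reallocation on assignment).
  numbers = pack(numbers, numbers > guard)

  print *, 'n =', size(numbers)
  print *, numbers
end program read_with_guard
```

After the `pack`, `size(numbers)` plays the role of the integer n asked for in the question.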

Upvotes: 1

jme52

Reputation: 1123

Assuming your file only has one line, as it seems:

  1. If you know the format of the numbers, you can use non-advancing I/O; e.g., if every number occupies 12 characters, 8 of them decimal places, and the numbers are separated by a comma followed by a space, with no comma and space after the last number:

        integer, parameter :: DP = selected_real_kind(15,300)
        real(kind=DP) :: x
        real(kind=DP), allocatable :: xall(:)
        integer :: u
        integer :: n
        character(len=2) :: c
    
        open(newunit=u, file='customwav.txt',status='old',action='read')
    
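        n = 0
        do
          ! read one fixed-width number without advancing to the next record
          read(u, '(f12.8)', advance='no') x
          n = n + 1
          ! read the 2-character separator; end of record means we are done
          read(u, '(a2)', advance='no', eor=100) c
          if (c .ne. ', ') STOP 'unknown format'
        end do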
    
    100 write(*,*) n
        allocate(xall(n))
    
        rewind(u)
        read(u, *) xall
        close(u)
    
        write(*,*) xall
    
  2. If you don't know the format, or the format changes in non-regular ways, a lazy (and inefficient) method would be to try to read the whole array at one time. The following dumb code tries this by increasing the size one by one, but you could do bisection.

    integer, parameter :: DP = selected_real_kind(15,300)
    integer, parameter :: MaxN = 1000
    real(kind=DP), dimension(MaxN) :: x
    integer :: u
    integer :: n
    integer :: error
    
    open(newunit=u, file='customwav.txt',status='old',action='read')
    
    error = 0
    n = 0
    do while (error .eq. 0 .and. n .lt. MaxN)
      ! try to read one more value than is known to be present
      read(u, *, iostat=error) x(1:n+1)
      if (error .eq. 0) then
         n = n + 1
         rewind(u)
      end if
    end do
    
    write(*,*) n
    
    rewind(u)
    read(u, *) x(1:n)
    close(u)
    
    write(*,*) x(1:n)
    
  3. If non-Fortran tools are allowed, you can count the number of commas in the (single-line) file with the following shell command

    $ grep -o ',' customwav.txt | wc -l
    

    so the number of floats is that number possibly plus one (depending on the format). For multi-line files, you can obtain the list of counts of commas per line with

    $ f=customwav.txt
    $ for lin in $(seq $(wc -l < "$f"))
    > do
    >   sed -n "${lin}p" "$f" | grep -o ',' | wc -l
    > done
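
The comma-counting idea can also be done inside Fortran, without shell tools. The following is only a sketch under some assumptions that are not in the answer: the file holds a single non-empty line with no trailing comma, and that line fits into a buffer of `MaxLen` characters (a hypothetical bound chosen here). It reads the line into a character variable, counts the commas, allocates the array, and then does a list-directed read from the buffer itself.

```fortran
program count_commas
  implicit none
  integer, parameter :: DP = selected_real_kind(15,300)
  integer, parameter :: MaxLen = 16384     ! assumed upper bound on line length
  character(len=MaxLen) :: line
  real(kind=DP), allocatable :: xall(:)
  integer :: u, n, i

  open(newunit=u, file='customwav.txt', status='old', action='read')
  read(u, '(a)') line                      ! whole line into the buffer
  close(u)

  ! number of values = number of commas + 1 (no trailing comma assumed)
  n = 1
  do i = 1, len_trim(line)
    if (line(i:i) == ',') n = n + 1
  end do

  allocate(xall(n))
  read(line, *) xall                       ! internal list-directed read

  write(*,*) n
  write(*,*) xall
end program count_commas
```

This touches the file only once, at the cost of having to pick a buffer length up front.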
    

Upvotes: 2

johncampbell

Reputation: 203

My approach for an unknown file is to first open it using stream I/O, read one character at a time, and count the occurrences of each character in the file in an array count_characters(0:255).

This can tell you a lot about what to expect, such as:

LF indicates number of lines in file
CR indicates DOS file rather than unix file format
.  can indicate real numbers
,  can indicate csv file format
; / : can indicate other delimiters
presence of non-numeric characters indicates non-numeric information
E or e can indicate scientific format
/ or : can indicate date/time info

The sum count_<lf> + count_, gives an estimate of how many numbers are in the file.

The advantage of this approach is that it identifies possible unusual data to be recovered. It is probably best as a standalone utility, as the interpretation can be difficult to code.
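
A minimal sketch of such a character census, assuming the file name from the question and one-byte characters; the program name and the choice of which counts to print are illustrative, not part of the answer:

```fortran
program char_histogram
  implicit none
  integer :: count_characters(0:255)
  integer :: u, ios, code
  character(len=1) :: ch

  count_characters = 0

  ! Stream access lets us read the file byte by byte, ignoring record structure.
  open(newunit=u, file='customwav.txt', access='stream', form='unformatted', &
       status='old', action='read')
  do
    read(u, iostat=ios) ch
    if (ios /= 0) exit                     ! end of file (or a read error)
    code = iand(ichar(ch), 255)            ! keep the index inside 0..255
    count_characters(code) = count_characters(code) + 1
  end do
  close(u)

  write(*,*) 'LF    :', count_characters(10)
  write(*,*) 'CR    :', count_characters(13)
  write(*,*) 'comma :', count_characters(ichar(','))
  write(*,*) 'dot   :', count_characters(ichar('.'))
  write(*,*) 'estimated numbers:', count_characters(10) + count_characters(ichar(','))
end program char_histogram
```

For a single comma-separated line ending in a line feed, the last figure equals the number of values, which could then be used to allocate the array.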

Upvotes: 1
