Fernando Brito Lopes

Reputation: 113

How to extract substring of Fortran string array using index of position?

I have two files. The first has two columns, and I want to extract a substring from its second column. The second has a single column holding the character positions used to subset that string. The two files look like this:

File 1: file.txt

 1 123456789
 2 123456789
 3 123456789
 4 123456789
 5 123456789
 6 123456789
 7 123456789
 8 123456789
 9 123456789
10 123456789

File 2: index.txt

1
3
5
7

In this example, the second column of file.txt holds 9 digits with no spaces between them. I would like to subset it based on the positions listed in index.txt.

I wrote the following Fortran program, which can subset the digits, but I don't know how to collapse them together so that they are written to a file with no spaces between them.

Fortran file: subsetFile.f90

program subsetfile
  implicit none
  integer :: io,tmp,n,m,s,i,ind
  integer, dimension (:), allocatable :: vec, idx
  character(len=1000) :: arr
  character(len=1000) :: fn, fnpos
  print*, "File name:"
  read*, fn
  print*, "Position file name:"
  read*, fnpos
  open(unit=100, file=fnpos, status='old', action='read')
  n = 0
  do
    read(100,*,iostat=io)
    if (io/=0) exit
    n = n + 1
  end do
  close(unit=100)
  allocate (idx(n))
  open(unit=101, file=fnpos, status='old', action='read')
  do i=1,n
    read(101,*) idx(i)
  end do
  close(unit=101)
  s = n + 1
  open(unit=102, file=fn, status='old', action='read')
  n = 0
  do
    read(102,*,iostat=io)
    if (io/=0) exit
    n = n + 1
  end do
  close(unit=102)
  open(unit=103, file=fn, status='old', action='read')
  do
    read(103,*) tmp, arr
    m = len_trim(arr)
    exit
  end do
  close(unit=103)
  allocate (vec(m))
  open(unit=104, file = fn, status = 'old', action = 'read')
  open(unit=105, file = 'output.txt', status = 'replace')  
  do i=1,n
    read(104,*) ind, arr
    read(arr,'(*(i1))') vec
    write(105, *) ind, vec(idx)
  end do
  close(unit=104)
  close(unit=105)
  deallocate (idx, vec)
end program subsetfile

The following is the output I get when I run the code:

           1           1           3           5           7
           2           1           3           5           7
           3           1           3           5           7
           4           1           3           5           7
           5           1           3           5           7
           6           1           3           5           7
           7           1           3           5           7
           8           1           3           5           7
           9           1           3           5           7
          10           1           3           5           7

The following is the desired output:

 1 1357
 2 1357
 3 1357
 4 1357
 5 1357
 6 1357
 7 1357
 8 1357
 9 1357
10 1357

Does anyone know how I can write a file in that format, with only two columns?

Thank you

Upvotes: 0

Views: 830

Answers (2)

Federico Perini

Reputation: 1416

In Fortran, the format for formatted I/O can be given as a character string, so you have complete freedom in how you specify it.

In most cases your format never changes, so you can treat it as a PARAMETER in your program. In fact, you can specify such a constant format in three ways, with a fourth option for formats that change at runtime:

  1. Inside the read/write statement:
write(unit,'(xxxxx)')
  2. As a format label:
write(unit,100)
100 format(xxxxx)
  3. As a parameter string:
character(len=*), parameter :: myFmt = "(xxxxx)"
write(unit,myFmt)
  4. As a non-parameter string.

Note that ways 1) and 3) both just use a constant character string. Similarly, if your format may vary at runtime, just create an appropriate format string every time you need to use it, for example:
program test_formatString
        implicit none

        ! Copy user data
        integer, parameter :: vec(*) = [1,2,3,4,5,6,7,8,9,0]
        integer, parameter :: idx(*) = [1,3,5,7]

        integer :: n

        n = 12

        ! Test variable sizes of the first column vs the index columns
        write(*,myWidthFmt(2,1)) n, vec(idx)
        write(*,myWidthFmt(4,2)) n, vec(idx)
        write(*,myWidthFmt(3,3)) n, vec(idx)

        contains

        ! Function to create a format string
        character(len=15) function myWidthFmt(indWidth,vecWidth) result(fmt)
           integer, intent(in) :: indWidth,vecWidth
           write(fmt,1) min(indWidth,99),min(vecWidth,99)
           1 format('(i',i2,',1x,*(i',i2,'))')
        end function myWidthFmt

end program test_formatString
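
As a variation on the same idea, here is a minimal sketch that builds the format string with the i0 descriptor, so the generated string contains no embedded blanks. The module name fmt_builder and the helper make_fmt are hypothetical names chosen for illustration, and the single-digit second column (i1) is assumed from the question.

```fortran
module fmt_builder
  implicit none
contains
  ! Hypothetical helper: build "(i<w>,1x,*(i1))" for a first-column width w.
  function make_fmt(indWidth) result(fmt)
    integer, intent(in) :: indWidth
    character(len=:), allocatable :: fmt
    character(len=32) :: buf
    ! i0 writes the width with no padding; max() guards against widths below 1
    write(buf, '(a,i0,a)') '(i', max(indWidth, 1), ',1x,*(i1))'
    fmt = trim(buf)
  end function make_fmt
end module fmt_builder

program demo_runtime_format
  use fmt_builder
  implicit none
  integer, parameter :: vec(*) = [1,2,3,4,5,6,7,8,9,0]
  integer, parameter :: idx(*) = [1,3,5,7]

  write(*, '(a)') make_fmt(2)          ! show the generated format string
  write(*, make_fmt(2)) 12, vec(idx)   ! use it as the format of a write
end program demo_runtime_format
```

Because the format is an ordinary character expression, the function result can be passed straight to write without storing it in a variable first.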

Upvotes: 1

You should use an explicit format for the output, not the list-directed format (*). You are already using the i1 descriptor for the read; you can use it for the write as well.

write(105, '(i0,5x,*(i1))') ind, vec(idx)

If the vec members may be larger than 9 and occupy more than one digit, use i0 instead. Adjust the other parameters as needed (e.g. a fixed number of characters for the first number, or the number of spaces between the columns):

write(105, '(i10,1x,*(i1))') ind, vec(idx)
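
To see the effect without the input files, here is a small self-contained sketch that applies an explicit format to the sample data from the question. The module row_format and the helper format_row are hypothetical names introduced for illustration, and the width of 2 for the first column is an assumption taken from the desired output shown in the question.

```fortran
module row_format
  implicit none
contains
  ! Hypothetical helper: format one output row as "<ind> <selected digits>".
  function format_row(ind, vec, idx) result(line)
    integer, intent(in) :: ind      ! row label (first column)
    integer, intent(in) :: vec(:)   ! single digits read from the second column
    integer, intent(in) :: idx(:)   ! positions to keep
    character(len=:), allocatable :: line
    character(len=2 + 1 + size(idx)) :: buf
    ! i2   : label right-justified in 2 characters (assumes ind <= 99)
    ! 1x   : one blank between the columns
    ! *(i1): each selected digit in exactly one character, no separators
    write(buf, '(i2,1x,*(i1))') ind, vec(idx)
    line = buf
  end function format_row
end module row_format

program demo_explicit_format
  use row_format
  implicit none
  integer, parameter :: vec(9) = [1,2,3,4,5,6,7,8,9]
  integer, parameter :: idx(4) = [1,3,5,7]
  integer :: i

  do i = 1, 10
    write(*, '(a)') format_row(i, vec, idx)
  end do
end program demo_explicit_format
```

In the original program the same format can go directly into the write(105, ...) statement; the helper here only exists so the row can be built in a string and checked.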

Upvotes: 2
