Reputation: 5835
Though Fortran is case insensitive, I observe the same is not the case when strings are compared using if
check. Of course, the user should be aware of this.
However, Fortran being case insensitive, subconsciously I expect the same to apply for string checks as well. What's the efficient way to achieve case insensitive string comparison?
Upvotes: 2
Views: 2028
Reputation: 60058
This is a problem that is commonly solved in many programming languages, see, e.g., Case insensitive string comparison How to do case insensitive string comparison? Case-insensitive string comparison in C++ How can I do a case insensitive string comparison?
The usual way is the same, convert your strings all to lowercase or all to uppercase and compare these converted strings. Some languages have library functions, in some way you have to do it yourselves. If you are staying with ASCII characters, you can just subtract or add 32 to small case or to upper case characters respectively. I you are using some extended character sets like Unicode, you will need some library that can do the conversion for you (e.g., for converting from Ď to ď, or from Щ to щ,these really should be left to a specific library and they will appear in real world data string processing).
As to the subconscious that strings might be case insensitive in Fortran - there is no support for the standard for that and it would also be very impractical and difficult to do in general (e.g., for things like UTF8). It is much easier to add the in-sensitivity as an additional feature to your program rather than the other way,
Upvotes: 2
Reputation: 78344
Here's another approach which might appeal, let the compiler figure out the character codes and do all the arithmetic:
MODULE strings
IMPLICIT NONE
PRIVATE
PUBLIC :: to_upper
CHARACTER(len=26), PARAMETER :: uca = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
CHARACTER(len=26), PARAMETER :: lca = 'abcdefghijklmnopqrstuvwxyz'
CONTAINS
PURE CHARACTER FUNCTION to_upper_ch(ch)
CHARACTER, INTENT(in) :: ch
INTEGER :: ix
to_upper_ch = ch
ix = INDEX(lca,ch)
IF (ix /= 0) to_upper_ch = uca(ix:ix)
END FUNCTION to_upper_ch
PURE FUNCTION to_upper(str) RESULT(rslt)
CHARACTER(len=*), INTENT(in) :: str
CHARACTER(len=len(str)) :: rslt
INTEGER :: ix
DO ix = 1, LEN(str)
rslt(ix:ix) = to_upper_ch(str(ix:ix))
END DO
END FUNCTION to_upper
END MODULE strings
... and if you're thinking this is a bit laborious, two functions to convert a string to upper case, you're probably right, but this is a selection from a strings module I wrote where it makes sense to structure the code this way. As to this approach's efficiency compared with @King's code, test and measure if it matters to you.
Upvotes: 4
Reputation: 1835
As @Vladimir says in his answer, string case-sensitvity does matter in Fortran for string comparison. Moreover, trailing blanks are ignored in string comparison in Fortran, that is, "SKPS"=="SKPS "
evaluates to .true.
(the same does NOT hold for leading blanks). Here is an efficient implementation of what you need:
module String_mod
implicit none
public
contains
pure function getLowerCase(string) result(output)
! convert string to lower-case
character(*), intent(in) :: string
integer, parameter :: DUC = ichar('A') - ichar('a')
character(len(string)) :: output
character :: ch
integer :: i
do i = 1,len(string)
ch = string(i:i)
if (ch>='A' .and. ch<='Z') ch = char(ichar(ch)-DUC)
output(i:i) = ch
end do
end function getLowerCase
pure function getUpperCase(string) result(output)
! convert string to upper-case
character(*), intent(in) :: string
integer, parameter :: DUC = ichar('A') - ichar('a')
character(len(string)) :: output
character :: ch
integer :: i
do i = 1,len(string)
ch = string(i:i)
if (ch>='a' .and. ch<='z') ch = char(ichar(ch)+DUC)
output(i:i) = ch
end do
end function getUpperCase
end module String_mod
program stringComparison
use, intrinsic :: iso_fortran_env, only: output_unit
use String_mod, only: getLowerCase
implicit none
character(:), allocatable :: thisString, thatString
thisString = "STACKOVERFLOW"
thatString = "StackOverflow"
if (getLowerCase(thisString)==getLowerCase(thatString)) then
write(output_unit,"(*(g0,:,' '))") "Ignoring case-sensitivity, the strings are the same:" &
, '"'//thisString//'"', "==", '"'//thatString//'"'
else
write(output_unit,"(*(g0,:,' '))") "strings are NOT the same:", '"'//thisString//'"', "/=", '"'//thatString//'"'
end if
! Be mindful of trailing white-space characters. They do not matter in string comparison.
thisString = "STACKOVERFLOW"
thatString = "StackOverflow "
if (getLowerCase(thisString)==getLowerCase(thatString)) then
write(output_unit,"(*(g0,:,' '))") "Ignoring case-sensitivity, the strings are the same:" &
, '"'//thisString//'"', "==", '"'//thatString//'"'
else
write(output_unit,"(*(g0,:,' '))") "strings are NOT the same:", '"'//thisString//'"', "/=", '"'//thatString//'"'
end if
! Be mindful of leading white-space characters. They are important in string comparison.
thisString = "STACKOVERFLOW"
thatString = " StackOverflow"
if (getLowerCase(thisString)==getLowerCase(thatString)) then
write(output_unit,"(*(g0,:,' '))") "Ignoring case-sensitivity, the strings are the same:" &
, '"'//thisString//'"', "==", '"'//thatString//'"'
else
write(output_unit,"(*(g0,:,' '))") "strings are NOT the same:", '"'//thisString//'"', "/=", '"'//thatString//'"'
end if
end program stringComparison
Compiling and running the above code gives:
Ignoring case-sensitivity, the strings are the same: "STACKOVERFLOW" == "StackOverflow"
Ignoring case-sensitivity, the strings are the same: "STACKOVERFLOW" == "StackOverflow "
strings are NOT the same: "STACKOVERFLOW" /= " StackOverflow"
If this is a performance-critical section of the code, you could likely achieve even better performance to by inlining the getLowerCase()
function and modifying it to directly compare the ASCII character codes of the letters. But is it really worth it? It depends on the specifics of your problem.
You can test the above code here: https://www.tutorialspoint.com/compile_fortran_online.php
Upvotes: 2