digbyterrell

Reputation: 3619

Case-insensitive string comparison in Julia

I'm sure this has a simple answer, but how does one compare two strings while ignoring case in Julia? I've hacked together a rather inelegant solution:

# Compare two strings of the same type after lowercasing both.
function case_insensitive_match(a::S, b::S) where {S<:AbstractString}
    lowercase(a) == lowercase(b)
end

There must be a better way!

Upvotes: 6

Views: 3154

Answers (1)

Michael Ohlrogge

Reputation: 10990

Efficiency Issues

The method you have chosen will work well in most settings, and you are unlikely to find anything much more efficient. The reason is that uppercase and lowercase letters are stored as different characters with different bit patterns; a character has no separate "capitalization" field that a comparison could simply ignore. Fortunately, for ASCII letters the two cases differ in only a single bit, so the conversion is simple and cheap. See this SO post for background:

How do uppercase and lowercase letters differ by only one bit?
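To make the single-bit point concrete, here is a minimal sketch for ASCII letters only. The helper name `ascii_lower` is made up for this illustration and is not part of Julia; the built-in `lowercase` is what you would normally call.

```julia
# ASCII 'A' is 0x41 and 'a' is 0x61: the two differ only in the 0x20 bit.
upper = UInt8('A')
lower = UInt8('a')
case_bit = xor(upper, lower)   # 0x20

# Hypothetical helper (ASCII only): setting the 0x20 bit lowercases a capital letter.
ascii_lower(c::Char) = 'A' <= c <= 'Z' ? Char(UInt8(c) | 0x20) : c

ascii_lower('Q')   # 'q'
ascii_lower('q')   # 'q' (already lowercase, returned unchanged)
```

This only holds for the 26 ASCII letters; other scripts need the table-driven conversion that `lowercase` performs.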

Accuracy Issues

In most settings, your method will also give accurate results. But with some non-ASCII text, such as capital vs. lowercase Greek letters, it can fail. For that, you would be better off with the normalize function (Unicode.normalize in the Unicode standard library as of Julia 1.x; see the docs for details) with the casefold option:

using Unicode
Unicode.normalize("ad", casefold=true)
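Wrapped up as a comparison function, this could look like the following sketch. The name `casefold_equal` is an assumption made for this example; it assumes Julia 1.x, where `normalize` lives in the `Unicode` standard library and is not exported.

```julia
using Unicode  # standard library; provides Unicode.normalize

# Hypothetical helper: compare two strings after Unicode case folding.
function casefold_equal(a::AbstractString, b::AbstractString)
    Unicode.normalize(a, casefold=true) == Unicode.normalize(b, casefold=true)
end

casefold_equal("Julia", "JULIA")    # true
casefold_equal("Julia", "Python")   # false
```

Unlike the version in the question, the two arguments here do not have to be of the same string type, so a `String` can be compared with a `SubString`.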

See this SO post, written in the context of Python, which covers the pertinent issues, so they are not repeated here:

How do I do a case-insensitive string comparison?

Because that post deals with the underlying Unicode issues rather than anything specific to Python, it applies to Julia as well.

See also this Julia Github discussion for additional background and specific examples of places where lowercase() can fail:

https://github.com/JuliaLang/julia/issues/7848
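As one illustration of the kind of failure meant here, consider the Greek final sigma. The strings below are chosen for this sketch and are not taken from the linked discussion. `lowercase` converts each character on its own, so a capital sigma becomes 'σ' while an existing final sigma 'ς' is left alone; case folding maps both to 'σ'.

```julia
using Unicode

a = "ΣΑΣ"   # all capitals
b = "σας"   # lowercase, ending in final sigma 'ς' (U+03C2)

# Per-character lowercasing gives "σασ" for `a` but leaves `b` as "σας".
lowercase(a) == lowercase(b)   # false

# Case folding maps both 'Σ' and 'ς' to 'σ', so the strings compare equal.
Unicode.normalize(a, casefold=true) == Unicode.normalize(b, casefold=true)   # true
```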

Upvotes: 7
