Reputation: 22254
"strasse".Equals("STRAße",StringComparison.InvariantCultureIgnoreCase)
This returns true, which is correct. Unfortunately, when I store these values in PostgreSQL, it does not treat them as equal in a case-insensitive match (for example, with ~*). I've also tested with citext.
So one solution would be to pre-fold the case, storing strasse for either of these values in another column. I could then index and search on that column for matches.
I've been looking for a way to fold case in C# for a while and haven't been able to find one. Obviously that knowledge is in there somewhere, because it can compare these strings properly; I just can't find where to get it from.
One solution would be to spawn a perl process, perl -E "binmode STDOUT, ':utf8'; binmode STDIN, ':utf8'; while (<>) { print fc }", set the C# side of the process to UTF-8 for those pipes as well, and just send the text through perl to fold the case. But there has to be a better way than that.
Upvotes: 8
Views: 643
Reputation: 17350
This could be more than you need, but from Unicode Technical Report #36:
The Unicode property [NFKC_Casefold] can be used to get a combined casefolding, normalization, and removal of default-ignorable code points.
This is implemented in the ICU library wrapper for .NET. A call would look like this:
Icu.Normalization.Normalizer2.GetNFKCCasefoldInstance().Normalize(mystring)
A good overview: Truths programmers should know about case
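To make the call above a little more concrete, here is a minimal sketch that wraps it in a helper for building a folded search key. It assumes the icu-dotnet package and native ICU binaries are available; the class name IcuCaseFold, the method Fold, and the Wrapper.Init()/Wrapper.Cleanup() calls around it are my assumptions about how the wrapper is set up, not something confirmed in this answer.

```csharp
using System;
using Icu;                 // assumed: icu-dotnet package
using Icu.Normalization;   // assumed: namespace of Normalizer2

public static class IcuCaseFold
{
    // Hypothetical helper: returns the NFKC_Casefold form of the input,
    // intended for storing in a separate, indexed "folded" column.
    public static string Fold(string input)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));
        return Normalizer2.GetNFKCCasefoldInstance().Normalize(input);
    }

    public static void Main()
    {
        // Assumption: the wrapper needs explicit init/cleanup of native ICU.
        Wrapper.Init();
        try
        {
            // Compare the folded keys of the two spellings from the question.
            Console.WriteLine(Fold("STRAße") == Fold("strasse"));
        }
        finally
        {
            Wrapper.Cleanup();
        }
    }
}
```

By the Unicode definition of NFKC_Casefold, ß folds to ss, so both spellings would be expected to produce the same key; it is still worth checking against the ICU version you actually deploy.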
Upvotes: 0
Reputation: 1710
Looking through the sources I eventually found that most of this implementation is in a set of classes called CompareInfo.
You can find these at github.com/dotnet/runtime
That led me to the page ".NET globalization and ICU", which gives some clues about the inner workings of the .NET culture handling.
It seems that .NET relies entirely on native libraries for everything except ordinal operations.
From this I would assume that the .NET Framework is probably using NLS from Win32. For that, there is the FoldStringW function, which looks promising.
For ICU, there is documentation on Case Mappings, and I found the u_strFoldCase function.
Upvotes: 0
Reputation: 71169
There is string.Normalize(), which takes a NormalizationForm parameter. Michael Kaplan goes into detail on this. He claims it does a better job than FoldStringW.
It does not, however, normalize the case to upper or lower; it only folds to the canonical form. I would suggest you just apply ToUpper or ToLower afterwards.
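A minimal sketch of that suggestion, using only the base class library. The class name CaseKey, the method Fold, and the choice of NormalizationForm.FormKC with ToLowerInvariant are my assumptions for illustration. One caveat: as far as I know, ToLowerInvariant and ToUpperInvariant apply per-character case mappings, so ß is not expanded to ss. Check that "STRAße" and "strasse" really produce the same key before relying on this for the case in the question.

```csharp
using System;
using System.Text;

public static class CaseKey
{
    // Hypothetical helper: builds a comparison key by applying compatibility
    // normalization (NFKC) and then an invariant-culture lowercase mapping.
    // This is normalization plus lowercasing, not full Unicode case folding.
    public static string Fold(string input)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));
        return input.Normalize(NormalizationForm.FormKC).ToLowerInvariant();
    }
}
```

The key would be computed once on write and stored in its own column, for example CaseKey.Fold(name), and the same function applied to the search term on read.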
Upvotes: 0