tzema

Reputation: 461

Regex does not recognize some special characters (German letters) in R

I am trying to detect strings containing special characters such as ä, ü, ö and ß.

I have a list of allowed characters, and I use it like this to detect any string that contains anything other than these:

 grepl("[^0-9a-zA-Z$%^*&]","aaüh")

However, this returns FALSE, so it fails to detect the ü.

How can I make it explicit that only Latin characters are allowed?

Upvotes: 2

Views: 97

Answers (1)

Kat

Reputation: 18734

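You have to convert the string first. I used the base R function iconv to re-encode it from UTF-8 to ASCII. With the substitution mode "Unicode", every character that cannot be represented in ASCII is replaced by a <U+xxxx> escape, so "aaüh" becomes "aa<U+00FC>h" in this example. The characters <, + and > in that escape are not in the allowed set, which is why grepl then returns TRUE.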

# Re-encode UTF-8 to ASCII; non-ASCII characters become <U+xxxx> escapes
gimme <- function(val) {iconv(val, from = "UTF-8", to = "ASCII", sub = "Unicode")}

grepl("[^0-9a-zA-Z$%^*&]", gimme("aaüh"))
# [1] TRUE 
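
If you would rather not depend on how the regex engine interprets a-z (the ?regex help page notes that character ranges are locale- and implementation-dependent), a range-free check is another option. The following is only a minimal sketch: the helper name has_disallowed and the variable allowed are made up for illustration, and it assumes the input can be converted to UTF-8. It compares each character's Unicode code point against the explicit allow-list.

```r
# Explicit allow-list as Unicode code points (no regex ranges involved).
allowed <- utf8ToInt(
  "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ$%^*&"
)

# Hypothetical helper: TRUE for each string that contains at least one
# character outside the allow-list.
has_disallowed <- function(x) {
  vapply(
    enc2utf8(x),                                   # utf8ToInt expects UTF-8 input
    function(s) any(!(utf8ToInt(s) %in% allowed)),
    logical(1),
    USE.NAMES = FALSE
  )
}

has_disallowed(c("aa\u00fch", "abc$1"))
# [1]  TRUE FALSE
```

Here "\u00fc" is just ü written as an escape so the example does not depend on the encoding of the script file.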

Upvotes: 2
