Briefbreaddd

Reputation: 379

How to match French accented characters (UTF-8) against their unaccented base letter in R using str_detect?

I need a pattern written with a plain letter to also match its accented variants. For example, "e" should match "é", "è" and "ê" in Canadian French (UTF-8) text.

 library(tidyverse)

 Sys.setlocale(locale = "fr_CA.UTF-8")
 a <- c("Léger", "leger")

 str_detect(a, regex("leger", ignore_case = TRUE))
 ## [1] FALSE  TRUE

 str_detect(a, coll("leger", ignore_case = TRUE, locale = "fra"))
 ## [1] FALSE  TRUE

Both calls should return TRUE TRUE, but the accented "Léger" is not matched.

Upvotes: 2

Views: 623

Answers (1)

MrFlick

Reputation: 206401

You can transliterate the input strings to plain ASCII with iconv() and then run the match on the result. For example:

str_detect(iconv(a, to = "ASCII//TRANSLIT"), regex("leger", ignore_case = TRUE))
# [1] TRUE TRUE
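
The result of iconv() with //TRANSLIT depends on the platform's iconv implementation, so the transliterated strings may differ between systems. As a possible alternative (not part of the original answer), here is a minimal sketch that assumes the stringi package is installed. It shows two options: transliterating with ICU's "Latin-ASCII" transform, and matching with a collator at primary strength, which ignores both accents and case. The variable names are only illustrative.

```r
library(stringi)
library(stringr)

# "\u00e9" is "é"; the escape keeps the example independent of file encoding
a <- c("L\u00e9ger", "leger")

# Option 1: strip accents with ICU's Latin-ASCII transform, then match as usual
a_ascii <- stri_trans_general(a, "Latin-ASCII")
matches_translit <- str_detect(a_ascii, regex("leger", ignore_case = TRUE))
print(matches_translit)
# [1] TRUE TRUE

# Option 2: collation-based matching at primary strength (strength = 1),
# which treats accent and case differences as insignificant
matches_coll <- stri_detect_coll(a, "leger", strength = 1)
print(matches_coll)
# [1] TRUE TRUE
```

Option 1 changes the strings before matching, so it suits cases where the ASCII form is also wanted for other steps. Option 2 leaves the data untouched and only relaxes the comparison.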

Upvotes: 2
