Michael Cantrall

Reputation: 323

R grepl not giving desired result loading CSV

I don't know what I could be overlooking here, but I am importing a CSV file with a bunch of names into a data.frame. When I pull a value from the data frame and run grepl against it, there is no match. If I take that same value and manually create a string, it matches fine. Any help would be appreciated.

I obviously can't give you the CSV or the data source, so I have tried to include all the code below.

After a further look, it seems the string from the data frame no longer contains a regular space:

> Parks[1,2]
[1] "Abraham Lincoln Birthplace National Historical Park"
> typeof(Parks[1,2])
[1] "character"
> grepl(" ", Parks[1,2], fixed = TRUE)
[1] FALSE
> grepl("National Historical Park", Parks[1,2])
[1] FALSE
> grepl("National", Parks[1,2], fixed = TRUE)
[1] TRUE
> grepl("National Historical Park", "Abraham Lincoln Birthplace National Historical Park")
[1] TRUE

> grepl(" ", "Abraham Lincoln Birthplace National Historical Park")
[1] TRUE

Upvotes: 0

Views: 131

Answers (1)

Michael Cantrall

Reputation: 323

The blank spaces were not regular ASCII spaces but Unicode \u2022 characters. Replacing every character outside the printable ASCII range with a regular space before calling grepl gives the desired result.

> Code <- Parks[1,2]
> Code <- gsub('[^\x20-\x7E]', ' ', Code)
> grepl(" ", Code, fixed = TRUE)
[1] TRUE
> grepl("National Historical Park", Code)
[1] TRUE
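
If you want to confirm which character is actually hiding in the data, printing the code points can help. This is only a minimal sketch: the `park` string is a made-up sample, and U+00A0 (no-break space) stands in for whatever character the real CSV contains, so treat both as assumptions.

```r
# Hypothetical sample: the "spaces" here are U+00A0 (no-break space), chosen
# only for illustration; the real file may contain a different code point.
park <- "Abraham\u00A0Lincoln\u00A0Birthplace"

# Collect the code points of every character outside printable ASCII.
cp <- utf8ToInt(park)
non_ascii <- cp[cp < 0x20 | cp > 0x7E]
print(sprintf("U+%04X", unique(non_ascii)))

# Replace those characters with ordinary spaces, then test again.
clean <- gsub("[^\x20-\x7E]", " ", park)
print(grepl(" ", clean, fixed = TRUE))
```

Since gsub is vectorised, the same call should also work on a whole column (for example `Parks[[2]]`) rather than one cell at a time. Note that it replaces every non-ASCII character, so accented letters in names would be turned into spaces as well.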

Upvotes: 0
