I don't know what I'm overlooking here. I am importing a CSV file containing a list of names into a data.frame. When I pull a value from the data frame and run grepl against it, there is no match. If I type that same value manually as a string literal, it matches fine. Any help would be appreciated.
I obviously can't share the CSV or the data source, so I have included all the relevant code below.
On closer inspection, it seems the string from the data frame no longer contains a space:
> Parks[1,2]
[1] "Abraham Lincoln Birthplace National Historical Park"
> typeof(Parks[1,2])
[1] "character"
> grepl(" ", Parks[1,2], fixed = TRUE)
[1] FALSE
> grepl("National Historical Park", Parks[1,2])
[1] FALSE
> grepl("National", Parks[1,2], fixed = TRUE)
[1] TRUE
> grepl("National Historical Park", "Abraham Lincoln Birthplace National Historical Park")
[1] TRUE
> grepl(" ", "Abraham Lincoln Birthplace National Historical Park")
[1] TRUE
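A quick way to see why a character that prints as a space does not match `" "` is to inspect the code points of the value. The sketch below does not use the original CSV. It builds a stand-in string `x` (a hypothetical name) with U+00A0, a no-break space, in place of ordinary spaces, purely as an assumption about what such data could contain.

```r
# Stand-in for Parks[1,2]: the separators here are U+00A0 (no-break space),
# chosen only for illustration; the real data may contain a different character.
x <- "Abraham\u00A0Lincoln\u00A0Birthplace"

# Prints like a normal string in most consoles, yet contains no ASCII space.
grepl(" ", x, fixed = TRUE)   # FALSE

# Code points of every character; an ASCII space would show up as 32.
cp <- utf8ToInt(x)

# Keep only the non-ASCII code points and show them in hex.
non_ascii <- unique(cp[cp > 127])
sprintf("U+%04X", non_ascii)  # "U+00A0"
```

Running the same `utf8ToInt()` check on the real `Parks[1,2]` would show which character is actually standing in for the space.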
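It turned out that the blank spaces in the data were not ordinary ASCII spaces but a Unicode character (I recorded it as \u2022, although U+2022 is the bullet, so it was probably a look-alike space such as \u2002 or \u00A0). Replacing every character outside the printable ASCII range with a regular space before calling grepl gives the desired result: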
> Code <- Parks[1,2]
> Code <- gsub('[^\x20-\x7E]', ' ', Code)  # replace anything outside printable ASCII with a space
> grepl(" ", Code, fixed = TRUE)
[1] TRUE
> grepl("National Historical Park", Code)
[1] TRUE
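Note that the pattern above turns every character outside printable ASCII into a space, so it would also change legitimate accented letters. A narrower sketch follows, assuming the goal is only to normalize Unicode whitespace across a whole column. The data frame `parks_demo`, its `Name` column, and the `unicode_spaces` pattern are made up for illustration and are not from the original data.

```r
# Hypothetical stand-in for the imported data; names and values are made up.
parks_demo <- data.frame(
  Name = c("Abraham\u00A0Lincoln\u00A0Birthplace",
           "Haleakal\u0101\u2002National Park"),
  stringsAsFactors = FALSE
)

# A few common non-ASCII space characters:
# U+00A0 no-break space, U+2000-U+200A fixed-width spaces,
# U+202F narrow no-break space, U+3000 ideographic space.
unicode_spaces <- "[\u00A0\u2000-\u200A\u202F\u3000]"

# Replace them with an ordinary space in every row of the column.
parks_demo$Name <- gsub(unicode_spaces, " ", parks_demo$Name, perl = TRUE)

# Every name should now contain a plain ASCII space.
grepl(" ", parks_demo$Name, fixed = TRUE)
```

With this approach the accented letter in the second name is left intact, whereas the blanket `[^\x20-\x7E]` replacement would have turned it into a space.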