Reputation: 191
I've reviewed many other Stack Overflow questions and answers about removing Unicode characters from strings, but none of them seem to work for me!
Exact problem reproduction:
library(rvest)
event = as.data.frame(read_html("https://www.bestfightodds.com/events/ufc-226-miocic-vs-cormier-1447") %>% html_table(fill = TRUE))
event$X5Dimes
As you can see, there are embedded up and down arrows. I'd like to remove them so that only the line remains. For example,
"-310<U+25BC>" would become "-310".
I've tried many gsub patterns to remove them -- some of my own and some from other Stack Overflow answers -- and nothing is working! Some example patterns are below:
event$X5Dimes = gsub("<.+>", "", event$X5Dimes)
event$X5Dimes = gsub("\\S+\\s+|-", "", event$X5Dimes)
event$X5Dimes = gsub("^\\s*<U\\+\\w+>\\s*", "", event$X5Dimes)
event$X5Dimes = gsub("\\<U[^\\>]*\\>", "", event$X5Dimes)
Can anyone help? Much appreciated -- losing my mind! Thanks!
Upvotes: 1
Views: 313
Reputation: 6088
Try matching the arrow characters themselves, simply like this:
event$X5Dimes = gsub("▼|▲", "", event$X5Dimes)
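If pasting the literal arrows into your script causes encoding trouble, here is a minimal sketch of the same idea using Unicode escapes instead. It assumes the column holds the real arrow characters (U+25B2 and U+25BC) and that `<U+25BC>` is only how R prints them on your console, which would explain why patterns targeting the text "<U+...>" match nothing. The `odds` vector is made-up sample data standing in for `event$X5Dimes`.

```r
# Hypothetical sample values standing in for event$X5Dimes
odds <- c("-310\u25BC", "+250\u25B2", "-120")

# Remove the up (U+25B2) and down (U+25BC) arrows by code point
clean <- gsub("[\u25B2\u25BC]", "", odds)

print(clean)
# [1] "-310" "+250" "-120"
```

On the real data this would be `event$X5Dimes = gsub("[\u25B2\u25BC]", "", event$X5Dimes)`.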
Upvotes: 1