Reputation: 1619
How can I remove these pesky backslashes in R? I've scoured the web and Stack Overflow for a way to get rid of backslashes, with no luck.
I've tried a lot of different approaches, but I think the only one I can get working is to remove every character that is not a number, letter, or space using regular expressions and gsub(). Here is my string:
"_kMDItemOwnerUserID = 99kMDItemAlternateNames = ( \"(500) Days of Summer (2009).m4v\")kMDItemAudioBitRate = 163kMDItemAudioChannelCount = 2kMDItemAudioEncodingApplication = \"HandBrake 0.9.4 2009112300\"kMDItemCodecs = ( \"H.264\", AAC, \"QuickTime Text\")"
As you can see it is very messy, with backslashes and quotation marks all over the place. Ultimately, what I want to do is extract the movie name: '(500) Days of Summer (2009)'.
What is a regular expression that will match everything but numbers, letters and spaces?
Thank you very much in advance for your help.
Upvotes: 1
Views: 4321
Reputation: 520968
Try replacing every match of the character class [^[:alnum:] ] with an empty string. This class matches any character that is not a letter, number, or space:
gsub("[^[:alnum:] ]", "", x)
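To see the class in isolation, here is a minimal sketch on a short made-up string (`s`, `has_backslash`, and `cleaned` are hypothetical names, not from the question). It also checks an assumption worth verifying on your own data: the backslashes in the question are probably just how R prints embedded double quotes, and are not stored in the string itself.

```r
# Hypothetical sample; \" in the source code is one escaped double quote
s <- "say \"H.264\" (ok)"

# Search for a literal backslash character; none is stored in s
has_backslash <- grepl("\\", s, fixed = TRUE)

# writeLines() prints the raw characters instead of the escaped form
writeLines(s)

# Drop everything that is not a letter, a digit, or a space
cleaned <- gsub("[^[:alnum:] ]", "", s)
cleaned
```

If `has_backslash` is FALSE for your real string too, the substitution is only removing quotes and punctuation, not backslashes.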
Full code:
x <- "_kMDItemOwnerUserID = 99kMDItemAlternateNames = ( \"(500) Days of Summer (2009).m4v\")kMDItemAudioBitRate = 163kMDItemAudioChannelCount = 2kMDItemAudioEncodingApplication = \"HandBrake 0.9.4 2009112300\"kMDItemCodecs = ( \"H.264\", AAC, \"QuickTime Text\")"
gsub("[^[:alnum:] ]", "", x)
[1] "kMDItemOwnerUserID  99kMDItemAlternateNames   500 Days of Summer 2009m4vkMDItemAudioBitRate  163kMDItemAudioChannelCount  2kMDItemAudioEncodingApplication  HandBrake 094 2009112300kMDItemCodecs   H264 AAC QuickTime Text"
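Since the stated goal is the movie name, stripping punctuation may not be the most direct route, because it also removes the parentheses in the title. As an alternative sketch (not part of the approach above), the title can be captured directly. This assumes the title always follows `kMDItemAlternateNames = ( "` and ends just before a file extension; `pattern`, `m`, and `title` are illustrative names.

```r
x <- "_kMDItemOwnerUserID = 99kMDItemAlternateNames = ( \"(500) Days of Summer (2009).m4v\")kMDItemAudioBitRate = 163kMDItemAudioChannelCount = 2kMDItemAudioEncodingApplication = \"HandBrake 0.9.4 2009112300\"kMDItemCodecs = ( \"H.264\", AAC, \"QuickTime Text\")"

# Assumed layout: field name, " = ( ", an opening quote, the title,
# then a dot, an alphanumeric extension, and the closing quote.
# (.*?) is a lazy capture group, so it stops at the first such extension.
pattern <- 'kMDItemAlternateNames = \\( "(.*?)\\.[[:alnum:]]+"'

# regexec() returns the full match followed by each capture group
m <- regmatches(x, regexec(pattern, x, perl = TRUE))

# Element 1 is the whole match; element 2 is the captured title
title <- m[[1]][2]
title
```

If the pattern does not match, `m[[1]]` is empty and `title` comes back as NA, so it is worth checking for that before using the result.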
Upvotes: 5