Reputation: 1
I tried to extract a URL, but every time I run my code it doesn't work. What did I miss? Any help would be great.
x$URL <- gsub("(.*)(http://www.bloomin.com)(.jpg)(.)",
"//2//3", x$Product.Description.)
[1] //2//3
That is what I got back. I want to get http://www.blooming.com/image/xxxxxxxx.jpg from the vector below.
<div>Colorful Floor chair Series</div><div><br /></div><div>Soft
Suede</div><div><br /></div><div>Cute bubble design</div><div><br
/></div><div><p align="center"><p align="center"><img
src="http://gdetail.image-gemkt.com/186/716088198/2010/2/e3b117e2-a7bd-4d.GIF"
/></div><div><p align="center"><p align="center"><img
src="http://www.blooming.com/image/xxxxxxxx.jpg" /></div>
Upvotes: 0
Views: 124
Reputation: 174696
Backreferences must be written with a backslash, not a forward slash: \\1 in an R string, not //1.
Use .*? (non-greedy) to match all the characters between .com and the file extension .jpg.
# (?s) is PCRE syntax (dot also matches newlines), so perl = TRUE is needed
x$URL <- gsub("(?s).*\\b(http://www\\.blooming\\.com\\b.*?\\.jpg\\b).*",
"\\1", x$Product.Description., perl = TRUE)
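One thing to watch with the gsub approach is that a string with no match comes back unchanged rather than empty. As a possible alternative, here is a minimal sketch using base R's regexpr and regmatches, which extract the match directly. The sample data frame is a shortened, made-up stand-in for the question's data, and the `\\S*?` pattern assumes the URL contains no whitespace.

```r
# Shortened stand-in for the question's data (hypothetical sample; the column
# name Product.Description. is taken from the question).
x <- data.frame(
  Product.Description. = paste0(
    '<div>Cute bubble design</div><div><img ',
    'src="http://gdetail.image-gemkt.com/186/716088198/2010/2/e3b117e2-a7bd-4d.GIF" />',
    '</div><div><img src="http://www.blooming.com/image/xxxxxxxx.jpg" /></div>'
  ),
  stringsAsFactors = FALSE
)

# Locate the first blooming.com .jpg URL in each string.
# Assumption: the URL contains no whitespace, so \\S*? is enough.
pattern <- "http://www\\.blooming\\.com\\S*?\\.jpg"
m <- regexpr(pattern, x$Product.Description., perl = TRUE)

# Extract the matched text; rows without a match stay NA.
x$URL <- NA_character_
x$URL[m != -1] <- regmatches(x$Product.Description., m)

print(x$URL)
```

With this version, rows that contain no matching URL end up as NA instead of keeping the full HTML string, which may be easier to filter afterwards.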
Upvotes: 3