MrJacket

Reputation: 391

How to remove link tag from image using Nokogiri

I'm parsing an HTML document using Nokogiri. The document contains several images like this:

 <a href="http://url_to_big_photo.jpg"><img alt="alternative-text" border="0" height="427" src="http://url_to_my_photo.jpg?" title="Image Title" width="640"></a>

I'm trying to save that image to my S3 storage, change the style and remove the link. All the images match the CSS selector ".post-body img".

So far, the closest I got is this:

@doc.css(".post-body img").each do |image|
  @new_photo = Photo.create!(
    # Params required to save and upload the photo to S3.
    ...
    ...
  )
  # The URL of the photo uploaded to S3 is @new_photo.photo.url
  image['src'] = @new_photo.photo.url
  image['class'] = "my-picture-class"
  image.parent['src'] = '#'
  puts image.parent.content
  @doc.to_html
end

This removes the link to the big photo but obviously it isn't a good solution.

I've tried to replace the parent using image.parent << image, as suggested on http://rubyforge.org/pipermail/nokogiri-talk/2009-June/000333.html, but it doesn't do anything, and image.parent = image raises "Could not reparent node (RuntimeError)".

Upvotes: 1

Views: 853

Answers (1)

David Grayson

Reputation: 87406

To convert that mailing list example over to apply to your situation, you have to remember that node is the node they are trying to get rid of, which in your case is image.parent.

So instead of image.parent['src'] = '#' you should try:

link = image.parent
link.parent << image
link.remove

Edit:

Actually, the above code would probably move all the images to the bottom of whatever element contains the link, so try this instead:

link = image.parent
link.replace(image)

Upvotes: 1
