lumos

Reputation: 223

Modifying classes in Nokogiri with regex

I'm using Nokogiri to remove certain classes like so:

doc.css("span").remove_class("classname-1")

I'm wondering if there's a way to remove classes that match a regex. For instance, if I want to remove any class that ends in a series of digits, like /classname-\d+/, how can I do that? Passing the regex directly to remove_class doesn't work, and I can't seem to find this question (or a solution!) anywhere.

On a related note, would it also be possible to search for a class using regex, and then replace those classes with a single new class name?

Thank you for any insight/help!

Upvotes: 1

Views: 390

Answers (2)

pguardiario

Reputation: 54984

doc.css('span[class^="classname-"]').each{|el| el.delete 'class'}

is probably good enough, but if not, maybe:

doc.css('span').select{|el| el[:class].to_s[/classname-\d/]}.each{|el| el.delete 'class'}
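
Note that both snippets above delete the entire class attribute, and the `^=` selector only matches when the attribute value starts with "classname-". If other classes on the element need to survive, something like the following might work. It is only a sketch: `strip_matching_classes` is a made-up helper name (not part of Nokogiri), and it assumes the element supports `el['class']`, `el['class'] = value` and `el.delete('class')`, as Nokogiri elements do.

```ruby
# Hypothetical helper, not a Nokogiri method.
# Removes only the class names that match `pattern` from a single element.
# `el` is assumed to behave like a Nokogiri::XML::Element:
#   el['class'], el['class'] = value, el.delete('class')
def strip_matching_classes(el, pattern)
  # to_s makes this safe for elements with no class attribute;
  # split with no argument ignores leading/trailing whitespace.
  classes = el['class'].to_s.split
  kept = classes.reject { |c| c.match?(pattern) }

  if kept.empty?
    el.delete('class')            # avoid leaving an empty class="" behind
  else
    el['class'] = kept.join(' ')
  end
  el
end
```

With a parsed document this could be applied as `doc.css('span').each { |el| strip_matching_classes(el, /\Aclassname-\d+\z/) }`. The `\A` and `\z` anchors keep the pattern from matching in the middle of an unrelated class name.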

Upvotes: 1

kiddorails

Reputation: 13014

Unfortunately, I don't think any existing method can do that with your current approach. But hey, where there's a will, there's a way! Have a look at the remove_class source:

  def remove_class name = nil
    each do |el|
      if name
        classes = el['class'].to_s.split(/\s+/)
        if classes.empty?
          el.delete 'class'
        else
          el['class'] = (classes - [name]).uniq.join " "
        end
      else
        el.delete "class"
      end
    end
    self
  end

Can you think of how you could make it work for your use case? The method above iterates over each element in the NodeSet returned by doc.css('span'), splits that element's class attribute into individual class names, and removes any that equal the given name.

The method can be patched like this:

module Nokogiri
  module XML
    class NodeSet
      def remove_class name = nil
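        # A String still matches one whole class name exactly; escape it so
        # characters like "." are not treated as regex metacharacters.
        name = /\A#{Regexp.escape(name)}\z/ if name.is_a?(String)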
        each do |el|
          if name
            classes = el['class'].to_s.split(/\s+/)
            if classes.empty?
              el.delete 'class'
            else
              el['class'] = classes.reject{ |c| c.match name }.uniq.join " "
            end
          else
            el.delete "class"
          end
        end
        self
      end
    end
  end
end


doc.css('span').remove_class(/\Aclassname-\d+\z/) #=> the same NodeSet, with matching classes removed
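
For the related question about replacing every matching class with a single new class name, a similar approach that avoids monkey-patching might look like this. Again, this is just a sketch: `replace_matching_classes` is a hypothetical helper name, and it only assumes the element supports `el['class']` and `el['class'] = value`.

```ruby
# Hypothetical helper, not a Nokogiri method.
# Replaces every class name matching `pattern` with `replacement`,
# leaving the other class names on the element untouched.
def replace_matching_classes(el, pattern, replacement)
  classes = el['class'].to_s.split
  # Leave the element alone if nothing matches.
  return el unless classes.any? { |c| c.match?(pattern) }

  # uniq collapses several matches into a single replacement class.
  el['class'] = classes.map { |c| c.match?(pattern) ? replacement : c }.uniq.join(' ')
  el
end
```

Usage would be along the lines of `doc.css('span').each { |el| replace_matching_classes(el, /\Aclassname-\d+\z/, 'classname') }`, where 'classname' stands in for whatever the new class should be.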

Upvotes: 4
