Alex

Reputation: 1606

Get related articles based on tags in Ruby

I’m trying to display a "related articles" section based on each article’s tags. Any other article that shares at least one tag should be displayed.

The idea is to iterate over the article’s tags and check whether any other articles have those tags.
If so, add that article to a related = [] array of articles that I can retrieve later.


For example:

Article A: tags: [chris, mark, scott]
Article B: tags: [mark, scott]
Article C: tags: [alex, mike, john]

Here Article A is related to Article B and vice versa, while Article C has no related articles.


Here’s the code:

files = Dir[ROOT + 'articles/*']

# parse file
def parse(fn)
  res = meta(fn)
  res[:body] = PandocRuby.new(body(fn), from: 'markdown').to_html
  res[:pagedescription] = res[:description]
  res[:taglist] = []
  if res[:tags]
    res[:tags] = res[:tags].map do |x|
      res[:taglist] << '<a href="/%s">%s</a>' % [x, x]
      '<a href="/%s">%s</a>' % [x, x]
    end.join(', ')
  end
  res
end

# get related articles
def related_articles(articles)
  related = []
  articles[:tags].each do |tag|
    articles.each do |item|
      if item[:tags] != nil && item[:tags].include?(tag)
        related << item unless articles.include?(item)
      end
    end
  end
  related
end

articles = files.map {|fn| parse(fn)}.sort_by {|x| x[:date]}

articles = related_articles(articles)

This throws the following error:

no implicit conversion of Symbol into Integer (TypeError)

Another thing I tried was this:

# To generate related articles
def related_articles(articles)
  related = []
  articles.each do |article|
    article[:tags].each do |tag|
      articles.each do |item|
        if item[:tags] != nil && item[:tags].include?(tag)
          related << item unless articles.include?(item)
        end
      end
    end
  end
  related
end

But now the error says:

undefined method `each' for "<a href=\"/tagname\">tagname</a>":String (NoMethodError)

Help a Ruby noob? What am I doing wrong? Thanks!


As an aside to the main question, I tried rewriting the tag section of the code, but still no luck:

  res[:taglist] = []
  if res[:tags]
    res[:tags] = res[:tags].map do |x|
      res[:taglist] << '<a href="/' + x + '">' + x + '</a>'
      '<a href="/' + x + '">' + x + '</a>'
    end.join(', ')
  end

Upvotes: 0

Views: 39

Answers (1)

rewritten

Reputation: 16435

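In your first attempt, the problem is articles[:tags]. articles is an Array, and arrays are indexed by integers, so you cannot look up a Symbol key on it. That is exactly what "no implicit conversion of Symbol into Integer" means.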
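To see the first error in isolation, here is a minimal sketch. The helper name tags_lookup_error and the sample data are made up for illustration; only the [:tags] lookup comes from your code.

```ruby
# Returns the TypeError message raised by collection[:tags],
# or nil when the lookup succeeds.
def tags_lookup_error(collection)
  collection[:tags]
  nil
rescue TypeError => e
  e.message
end

# Hypothetical sample data standing in for the parsed articles.
sample = [{ tags: %w[chris mark scott] }, { tags: %w[mark scott] }]

# Indexing the Array with a Symbol fails and yields the error message.
puts tags_lookup_error(sample).inspect
# Indexing a single Hash with the same Symbol works, so this prints nil.
puts tags_lookup_error(sample.first).inspect
```

The lookup only makes sense on one article at a time, which is why the outer loop over articles in your second attempt gets past this particular error.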

The second attempt fails because article[:tags] is a String: parse takes the original tags, turns each one into an HTML link, and then joins the links. The :taglist key, by contrast, still holds an array (of the HTML links, not the raw tag names), so you could iterate over that instead.
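A small sketch of why the second error appears. The method name tags_to_html is hypothetical; its body mirrors the map and join that your parse applies to res[:tags].

```ruby
# Mirrors what the question's parse does to res[:tags]:
# map each tag to a link, then join the links into one String.
def tags_to_html(tags)
  tags.map { |x| '<a href="/%s">%s</a>' % [x, x] }.join(', ')
end

html = tags_to_html(%w[mark scott])

puts html.class                          # the joined result is a String
puts html.respond_to?(:each)             # a String has no #each
puts %w[mark scott].respond_to?(:each)   # the original Array does
```

Once the tags have been joined, the individual tag names are gone, so any code that needs to compare tags has to run on the array form.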

Finally, the "related" list should be per article, so neither implementation can solve your problem: both return a single array for the whole set of articles.
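To make "per article" concrete, here is a direct sketch that answers the question for one article at a time. It assumes :tags still holds an Array of plain tag names (not the joined HTML); related_to and the :title key are names invented for this example. It rescans the whole list for each article, which is fine for a small site.

```ruby
# Returns the other articles that share at least one tag with +article+.
# Assumes :tags holds an Array of plain tag names.
def related_to(article, articles)
  articles.select do |other|
    # Skip the article itself, then keep anything with a tag in common.
    !other.equal?(article) && (other[:tags] & article[:tags]).any?
  end
end

# Sample data taken from the example in the question.
a = { title: 'A', tags: %w[chris mark scott] }
b = { title: 'B', tags: %w[mark scott] }
c = { title: 'C', tags: %w[alex mike john] }
all = [a, b, c]

all.each do |article|
  titles = related_to(article, all).map { |r| r[:title] }
  puts "#{article[:title]}: #{titles.join(', ')}"
end
```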

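You probably need two passes: first parse every article while keeping the raw tags, then build a tag index and read it back into each article: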

def parse(fn)
  res = meta(fn)
  res[:body] = PandocRuby.new(body(fn), from: 'markdown').to_html
  res[:pagedescription] = res[:description]
  res[:tags] ||= []  # keep the raw tags as an Array; the HTML goes into separate keys
  res[:tags_as_links] = res[:tags].map { |x| "<a href=\"/#{x}\">#{x}</a>" }
  res[:tags_as_string] = res[:tags_as_links].join(', ')
  res
end

articles = files.map { |fn| parse(fn) }

# convert each article into a hash like
# {tag1 => [article], tag2 => [article]}
# then merge them all, taking the union of the article lists when a tag repeats
taggings = articles
           .map { |a| a[:tags].product([[a]]).to_h }
           .reduce { |a, b| a.merge(b) { |_, v1, v2| v1 | v2 } }


# now read them back into the articles
articles.each do |article|
  article[:related] = article[:tags]
                      .flat_map { |tag| taggings[tag] }
                      .uniq
  # remove the article itself
  article[:related] -= [article]
end
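If you want to try the two passes without Pandoc or any files, here is a self-contained variation run against the sample articles from the question. It is a sketch, not a drop-in replacement: it builds the index with plain each loops instead of product and reduce, and it returns titles instead of storing article hashes inside each other. The names index_by_tag and related_titles and the :title key are assumptions made for the example.

```ruby
# Pass 1: build a { tag => [articles] } index.
def index_by_tag(articles)
  index = Hash.new { |hash, tag| hash[tag] = [] }
  articles.each do |article|
    article[:tags].each { |tag| index[tag] << article }
  end
  index
end

# Pass 2: for each article, collect everything filed under its tags,
# de-duplicate, and drop the article itself.
# Note: uniq and - compare hashes by content, so two articles with
# identical content would be treated as the same one.
def related_titles(articles)
  index = index_by_tag(articles)
  articles.map do |article|
    related = article[:tags].flat_map { |tag| index[tag] }.uniq - [article]
    [article[:title], related.map { |r| r[:title] }]
  end.to_h
end

# Sample data taken from the example in the question.
sample = [
  { title: 'A', tags: %w[chris mark scott] },
  { title: 'B', tags: %w[mark scott] },
  { title: 'C', tags: %w[alex mike john] }
]

p related_titles(sample)
```

With this data, A maps to B, B maps to A, and C maps to an empty list, which matches the expectation stated in the question.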

Upvotes: 1
