Using FeedJira to create RSS aggregator/reader

I am trying to create my own RSS reader app in Ruby on Rails. I want to store news stories in my database so that I can later display each story with its headline, image, summary, etc. in a nice layout. I am working with the Feedjira library and am pretty new to RoR. I know that these two commands in the Rails console fetch RSS feeds and parse them:

urls = %w[http://feedjira.com/blog/feed.xml https://github.com/feedjira/feedjira/feed.xml]
feeds = Feedjira::Feed.fetch_and_parse urls

While these two commands work on RSS feeds, I was wondering how I could configure my database/model and then save the news entries I get from Feedjira into the db. I tried watching the Railscast on this issue, but it seemed a bit out of date. Any help on this issue would be immensely appreciated! Thanks in advance!

Upvotes: 3

Answers (2)

Matt

Reputation: 3780

Here's one way:

Create a model such as this:

class Entry < ActiveRecord::Base

  # Must list every attribute that add_entries mass-assigns below.
  attr_accessible :entry_id, :feed_id, :url, :title, :summary, :description, :published_at

  # Looks up the Feed record by name, fetches its URL and stores any new entries.
  def self.update_from_feed(feed_name)
    feed = Feed.find_by_name(feed_name)
    feed_data = Feedjira::Feed.fetch_and_parse(feed.feed_url)
    add_entries(feed_data.entries, feed)
  end

  def self.add_entries(entries, feed)
    entries.each do |entry|
      # Stop at the first entry that is already stored.
      break if exists? :entry_id => entry.id

      create!(
        :entry_id     => entry.id,
        :feed_id      => feed.id,
        :url          => entry.url,
        :title        => entry.title.sanitize,
        :summary      => entry.summary.sanitize,
        :description  => entry.content.sanitize,
        :published_at => entry.published
      )
    end
  end
  # A bare `private` does not apply to `def self.` methods.
  private_class_method :add_entries
end

You can then call this from the cli / cron or whatever with, for example:

rails runner -e development 'Entry.update_from_feed("feedname")'

This runs the update_from_feed method in the context of your Rails app using a separate rails instance (a bit like rails console), but doesn't impact the running Rails instance.

In this example, there's a separate Feed model with name and feed_url columns, so the URL is looked up from the name you pass in.

This code doesn't use Feedjira's ability to check for updates, so duplicate checking is built in. (This GitHub issue says to avoid using the #update method.)

Note that the use of break assumes that new entries are always added to the top of the feed. If you don't trust the feed, replace break with next, so that known entries are skipped and the rest of the feed is still checked. The url can be used as an alternative unique id.
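
To see why that choice matters, here is a minimal plain-Ruby sketch with no Rails or Feedjira involved. FakeEntry, the seen set and import_entries are hypothetical stand-ins for the feed entries, the entries table and add_entries; they are not part of the code above.

```ruby
require "set"

# Hypothetical stand-in for a parsed feed entry.
FakeEntry = Struct.new(:id, :title)

# Returns the ids that would be inserted. `seen` plays the role of the
# entries table; `stop_at_first_dupe` switches between break and next.
def import_entries(entries, seen, stop_at_first_dupe:)
  imported = []
  entries.each do |entry|
    if seen.include?(entry.id)
      break if stop_at_first_dupe # trust the feed: everything below is old
      next                        # don't trust it: skip this one, keep scanning
    end
    seen << entry.id
    imported << entry.id
  end
  imported
end

# A feed that lists an already-stored entry ("b") above a new one ("c").
feed = [FakeEntry.new("a", "A"), FakeEntry.new("b", "B"), FakeEntry.new("c", "C")]

with_break = import_entries(feed, Set["b"], stop_at_first_dupe: true)
with_next  = import_entries(feed, Set["b"], stop_at_first_dupe: false)
```

With break, the new entry "c" is never reached because scanning stops at "b". With next, it is picked up, at the cost of one existence check per entry in the feed.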

Edit:

Here's a version of the update_from_feed method that takes advantage of Feedjira's ability to process multiple feeds:

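# Note: this name shadows ActiveRecord's built-in update_all on Entry;
# rename it if you also need the built-in one.
def self.update_all
  feed_urls = Feed.pluck :feed_url
  # With an array of URLs, the result is indexed by URL.
  feeds = Feedjira::Feed.fetch_and_parse(feed_urls)

  feed_urls.each do |feed_url|
    feed = Feed.find_by_feed_url(feed_url)
    add_entries(feeds[feed_url].entries, feed)
  end
end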

pluck returns the values of the specified column(s) (:feed_url in this case) as an array. Equally, you could change the method to accept an array of names and look up the matching URLs to pass to Feedjira.
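
One thing the loop above takes for granted is that every URL maps to a parsed feed. If a fetch fails, the value stored under that URL may not respond to entries (an assumption here; the exact failure value depends on the Feedjira version), and feeds[feed_url].entries would then raise. A hedged plain-Ruby sketch of separating the two cases, where FakeFeed, split_results and the example.com URLs are all made up for illustration:

```ruby
# Hypothetical stand-in for a parsed feed object.
FakeFeed = Struct.new(:entries)

# Splits a {url => result} hash into usable feeds and failed URLs,
# treating anything that does not respond to #entries as a failure.
def split_results(results)
  ok, failed = results.partition { |_url, result| result.respond_to?(:entries) }
  [ok.to_h, failed.map(&:first)]
end

results = {
  "http://example.com/a.xml" => FakeFeed.new(["entry 1", "entry 2"]),
  "http://example.com/b.xml" => 404 # assumed failure value, for illustration only
}

parsed, failed_urls = split_results(results)
```

In update_all you could then iterate over parsed only and log failed_urls for a later retry.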

Finally, if you wanted a self-looping method, you could include:

def self.update_all_periodically(frequency = 15.minutes)
  loop do
    update_all                # the multi-feed method defined above
    sleep frequency.to_i      # sleep takes seconds
  end
end

Then this:

rails runner -e development 'Entry.update_all_periodically'

won't return until you kill the process, and will update all feeds at the default frequency, or at the one passed as the optional argument.
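
One caveat with a bare loop is that a single exception (say, an unreachable feed) ends it for good. Below is a rough plain-Ruby sketch of a loop that logs the error and carries on. run_periodically and its max_runs parameter are hypothetical names, and max_runs exists only so the example can terminate.

```ruby
# Calls the block every `interval` seconds. Leave `max_runs` as nil to
# loop until the process is killed.
def run_periodically(interval, max_runs: nil)
  runs = 0
  loop do
    begin
      yield
    rescue StandardError => e
      warn "update failed: #{e.class}: #{e.message}" # log and carry on
    end
    runs += 1
    break if max_runs && runs >= max_runs
    sleep interval
  end
  runs
end

attempts = 0
completed = run_periodically(0, max_runs: 3) do
  attempts += 1
  raise "feed unreachable" if attempts == 2 # simulate one bad run
end
```

Inside the Rails app, the equivalent call might look like run_periodically(15.minutes.to_i) { Entry.update_all }, assuming the methods above live on Entry.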

If you wanted to run the updates asynchronously in your main Rails process, then a background worker such as Sidekiq, Resque or DelayedJob will do the... job. :)

Upvotes: 2

Julien Genestoux

Reputation: 33012

Scheduling the fetching and parsing of all these feeds can be incredibly hard and time-consuming, which means you should absolutely not do it from inside the Rails app itself. At best, you should do it using an 'offline' script.

You could also simply rely on existing APIs like Superfeedr and its rack middleware.

Upvotes: 0
