Sharath

Reputation: 2428

Batch Processing in MySql and Node.js

I have a set of 100 RSS links that I parse every 5/30/45 minutes. Most of the time the records received are the same, but occasionally new records are added, so the set varies.

Records shouldn't be repeated in the database (no duplicates). If a record already exists, check whether it has changed: if it is different, update it; otherwise reject it. If it doesn't exist, insert it.

Possible Ways:

  1. Check and insert from Node.js in a loop, which will really slow the application down since there are many records.
  2. Write a stored procedure.
  3. Batch Processing.

I don't have any idea about batch processing, so if someone could share information about batch processing in MySQL and how to upload bulk data, with some sample code, it would be very helpful.
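For reference, MySQL's `INSERT ... ON DUPLICATE KEY UPDATE` handles the insert-or-update logic described above in a single batched statement. A minimal sketch of building such a statement for a batch of feed items (the table and column names `rss_items`, `guid`, `title`, `link`, `published_at` are assumptions; `guid` would need a UNIQUE index so MySQL can detect duplicates):

```javascript
// Build one bulk upsert statement plus its parameter array for a batch of
// feed items. Column/table names here are illustrative assumptions.
function buildUpsert(items) {
  // One "(?, ?, ?, ?)" group per item.
  const placeholders = items.map(() => '(?, ?, ?, ?)').join(', ');
  // Flatten items into a single parameter list matching the placeholders.
  const values = items.flatMap(i => [i.guid, i.title, i.link, i.publishedAt]);
  const sql =
    'INSERT INTO rss_items (guid, title, link, published_at) ' +
    `VALUES ${placeholders} ` +
    'ON DUPLICATE KEY UPDATE ' +
    'title = VALUES(title), link = VALUES(link), ' +
    'published_at = VALUES(published_at)';
  return { sql, values };
}
```

With a driver such as `mysql2`, the result would then be run as `connection.execute(sql, values)`, inserting new rows and updating changed ones in a single round trip.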

Upvotes: 0

Views: 1141

Answers (1)

Wouter

Reputation: 776

If, like regular RSS feeds, your feed only adds new records and doesn't change existing ones, I think a straightforward solution would be to:

  1. Retrieve the latest stored record of this feed from the MySQL database

  2. Go through the records in the RSS feed, starting with the most recent one and moving back in time

  3. Compare each feed record with the one you retrieved from the MySQL database. When you find a match, stop going through the feed and add the records newer than that one to the database.

Depending on the update frequency, this should not keep your app busy for very long per feed.
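The steps above can be sketched as a small helper. I'm assuming here that each record carries a `guid` that identifies it; the function takes the feed's records ordered newest-first and returns only the ones newer than the latest stored record:

```javascript
// Walk the feed newest-first and collect records until we hit the latest
// record already stored (identified here by its guid, an assumption).
function newRecords(feedItems, latestStoredGuid) {
  const fresh = [];
  for (const item of feedItems) {              // newest first
    if (item.guid === latestStoredGuid) break; // reached known territory
    fresh.push(item);
  }
  return fresh; // these are the records to insert
}
```

If the stored guid isn't found at all (e.g. the feed rotated past it), the function falls back to returning the whole feed, which is the safe default.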

On the other hand, if you want to account for edits in existing records you could:

  • Use streams to parse and asynchronously process the data immediately as you're loading it.

  • If the feeds aren't very large, you could parse the records into an array and then use a queue to process them one by one. Memory may be a concern here, though, if your feeds are relatively large.
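A minimal sketch of the array-plus-queue idea: parse everything into an array first, then drain it sequentially so only one database write is in flight at a time. The `handle` callback stands in for the real per-record insert/update (an assumption):

```javascript
// Process records one at a time; awaiting each handler keeps database
// load predictable instead of firing all writes concurrently.
async function processQueue(records, handle) {
  for (const record of records) {
    await handle(record); // e.g. an insert-or-update query per record
  }
}
```

Because the loop awaits each record, the next one isn't touched until the previous write has finished, trading throughput for steady, bounded pressure on the database.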

Upvotes: 0
