user2231831

Parsing a big file with Ruby

I need to parse an extremely large XML file (about 50 GB). How can I do it with Ruby? Splitting it into chunks is not possible; I've already tried.

Upvotes: 0

Views: 70

Answers (1)

Martin Velez

Reputation: 1429

I parsed a 40GB file using Nokogiri::XML::Reader.

Structure of my XML file:

<?xml version="1.0" encoding="utf-8"?>
<posts>
   <row Id="4" />
   <row Id="5" />
</posts>

Code:

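require 'nokogiri'

fname = "Posts.xml"
File.open(fname) do |file|
  reader = Nokogiri::XML::Reader(file)
  # Reader#each yields the same cursor at every node, so a single loop is enough.
  reader.each do |node|
    # Keep only <row> start tags; this also skips whitespace nodes (type 14) and </posts>.
    next unless node.node_type == Nokogiri::XML::Reader::TYPE_ELEMENT && node.name == "row"
    # do something with the row, e.g. node.attribute("Id")
  end
end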

I think the answer depends on how you plan to use the data. In my case, I simply needed to stream the post nodes.

Upvotes: 2
