Reputation:
I need to parse an extremely big XML file (around 50GB). How can I do it with Ruby? It's not possible to split it into chunks; I've already tried.
Upvotes: 0
Views: 70
Reputation: 1429
I parsed a 40GB file using Nokogiri::XML::Reader.
Structure of my XML file:
<?xml version="1.0" encoding="utf-8"?>
<posts>
  <row Id="4" />
  <row Id="5" />
</posts>
Code:
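require 'nokogiri'

fname = "Posts.xml"

# Stream the file node by node instead of loading it all into memory.
File.open(fname) do |file|
  reader = Nokogiri::XML::Reader(file)
  reader.each do |node|
    # Keep only <row> elements; this also skips the whitespace between
    # them (node type 14, TYPE_SIGNIFICANT_WHITESPACE).
    next unless node.node_type == Nokogiri::XML::Reader::TYPE_ELEMENT
    next unless node.name == "row"
    # do something with node, e.g. node.attribute("Id")
  end
end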
I think the answer depends on how you plan to use the data. In my case, I simply needed to stream the post nodes.
Upvotes: 2