Bogdan Gusiev

Reputation: 8305

Import 5 million records into a Rails application

We need to import a large amount of data (about 5 million records) into a PostgreSQL database under a Rails application. The data will be provided in XML format, with images inside it encoded in Base64.

The estimated size of the XML file is 40 GB. What XML parser can handle such an amount of data in Ruby?

Thanks.

Upvotes: 1

Views: 878

Answers (3)

Juha Syrjälä

Reputation: 34281

You'll want to use some kind of SAX parser. SAX parsers do not load everything into memory at once.

I don't know about Ruby parsers, but quick googling turned up this blog post. You could start digging from there.

You could also try to split the XML file to smaller pieces to make it more manageable.
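To illustrate, a streaming handler in Ruby could look something like the sketch below. This uses Nokogiri's SAX API as one possible choice (not necessarily what the blog post covers), and the element names `record` and `image` are made up for illustration:

```ruby
require 'nokogiri'
require 'base64'

# Minimal SAX handler sketch: the element names ("record", "image")
# are hypothetical -- adjust them to the actual XML schema.
class RecordHandler < Nokogiri::XML::SAX::Document
  def start_element(name, attrs = [])
    @buffer = +'' if name == 'image'
  end

  def characters(string)
    @buffer << string if @buffer
  end

  def end_element(name)
    if name == 'image'
      image_data = Base64.decode64(@buffer) # only this element is in memory
      # persist the record/image here, e.g. via an ActiveRecord model
      @buffer = nil
    end
  end
end

Nokogiri::XML::SAX::Parser.new(RecordHandler.new).parse_file('import.xml')
```

Because the handler only ever holds one element's text at a time, memory use stays flat regardless of the 40 GB file size.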

Upvotes: 3

Ryan Bigg

Reputation: 107728

You could convert the data to CSV and then load it into your database using your DBMS's CSV loading capabilities: for MySQL that's LOAD DATA INFILE, and for PostgreSQL it's COPY. I would not use anything built in Ruby to load a 40GB file; it's not too good with memory. Best left to the "professionals".
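As a sketch of the PostgreSQL side, assuming a hypothetical `records` table and a pre-generated CSV file, the COPY can be driven from Ruby with the pg gem's streaming API:

```ruby
require 'pg'

# Sketch only: the table name "records", its columns, and the CSV
# path are assumptions -- adapt them to the real schema.
conn = PG.connect(dbname: 'myapp_production')

conn.copy_data("COPY records (name, image) FROM STDIN WITH (FORMAT csv)") do
  File.foreach('records.csv') do |line|
    conn.put_copy_data(line) # stream row by row, never loading the whole file
  end
end
```

The heavy lifting happens inside PostgreSQL itself; Ruby only streams lines, which keeps its memory footprint small.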

Upvotes: 1

Sebastian

Reputation: 2786

You should use an XML SAX parser, as Juha said. LibXML is the fastest XML library for Ruby, I think.
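A minimal libxml-ruby SAX skeleton might look like this (the callback bodies and the file name are placeholders to fill in for the real schema):

```ruby
require 'libxml'

# Skeleton callbacks; add per-element handling for the actual document.
class ImportCallbacks
  include LibXML::XML::SaxParser::Callbacks

  def on_start_element(element, attributes); end
  def on_characters(chars); end
  def on_end_element(element); end
end

parser = LibXML::XML::SaxParser.file('import.xml')
parser.callbacks = ImportCallbacks.new
parser.parse
```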

Upvotes: 1
