sutch

Reputation: 1295

HTML parser that is compatible with JRuby?

I'm having a difficult time locating an HTML parser that works with JRuby.

I've become fond of using Nokogiri for HTML parsing, but Nokogiri requires libxml2.dll, which isn't available on my machine, and I can't be sure it will be available on every user's machine.

I attempted to use another favorite, Scrubyt, but that relies on Mechanize, which also requires Nokogiri.

What Ruby HTML parser do you recommend for use with JRuby?

Upvotes: 1

Views: 319

Answers (2)

Mark Thomas

Reputation: 37507

The pure-Java version of Nokogiri does not depend on libxml2 or any other native binary. See http://wiki.github.com/tenderlove/nokogiri/pure-java-nokogiri-for-jruby.

Hpricot is a popular HTML parsing library that also has a pure-Java port. Its functionality is similar; in fact, Hpricot was the parser that popularized using CSS selectors for HTML parsing.

Upvotes: 1

cam

Reputation: 14222

Why not use the pure-Java version of Nokogiri?

http://github.com/tenderlove/nokogiri/tree/java

Upvotes: 0
