Reputation: 6609
The main problem I'm having is pulling data from tables, but any other general tips would be welcome too. The tables I'm dealing with have roughly 25 columns and varying numbers of rows (anywhere from 5-50).
Currently I am grabbing the table and converting it to an array:
require "watir-webdriver"
b = Watir::Browser.new :chrome
b.goto "http://someurl"
# The following operation takes way too long
table = b.table(:index, 1).to_a
# The rest is fast enough
table.each do |row|
  # Code for pulling data from about 15 of the columns goes here
  # ...
end
b.close
The operation table = b.table(:index, 5).to_a takes over a minute when the table has 20 rows. It seems like it should be very fast to put the cells of a 20 x 25 table into an array. I need to do this for over 80 tables, so it ends up taking 1-2 hours to run. Why is it taking so long, and how can I improve the speed?
I have tried iterating over the table rows without first converting to an array as well, but there was no improvement in performance:
b.table(:index, 1).rows.each do |row|
  # ...
end
Same results using Windows 7 and Ubuntu. I've also tried Firefox instead of Chrome without a noticeable difference.
Upvotes: 1
Views: 1265
Reputation: 1703
The #1 thing you can do to improve the performance of a script that uses Watir is to reduce the number of remote calls into the browser. Each time you locate or operate on a DOM element, that's a call into the browser, and each call can take 5ms or more.
In your case, you can reduce the number of remote calls by doing the work on the browser side via execute_script() and checking the result on the Ruby side.
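As a rough sketch of that idea, the snippet below collects every cell's text inside the browser and returns it in a single round trip. The names table_cells and TABLE_TO_ARRAY_JS are made up for this illustration, and it assumes that Watir's execute_script hands a Watir element passed as an argument to the script as arguments[0]; check that against your Watir version.

```ruby
# JavaScript that runs inside the browser. arguments[0] is expected to be
# the <table> element; the script returns an array of rows, each row being
# an array of trimmed cell texts.
TABLE_TO_ARRAY_JS = <<~JS
  var table = arguments[0];
  return Array.prototype.map.call(table.rows, function (row) {
    return Array.prototype.map.call(row.cells, function (cell) {
      return cell.textContent.trim();
    });
  });
JS

# Hypothetical helper: one execute_script call instead of one remote call
# per row and per cell.
def table_cells(browser, table)
  browser.execute_script(TABLE_TO_ARRAY_JS, table)
end

# Usage, assuming `b` is the Watir::Browser from the question:
#   rows = table_cells(b, b.table(:index, 1))
#   rows.each { |row| puts row.first }
```

The point of the design is that the number of remote calls no longer grows with the size of the table.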
Upvotes: 1
Reputation: 958
When attempting to improve the speed of your code, it's vital to have some means of measuring execution times (e.g. Ruby's Benchmark module). You might also like to look at ruby-prof to get a detailed breakdown of the time spent in each method.
I would start by trying to establish whether it's the to_a method, rather than locating the table, that's causing the delay on that line of code. Watir's internals (or Nokogiri, as per jarib's answer) may be quicker.
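A minimal way to take those measurements with the standard library's Benchmark module might look like this. The timed helper is a made-up name for this sketch, and the usage lines assume `b` is the browser object from the question.

```ruby
require "benchmark"

# Hypothetical helper: runs the block, prints how long it took,
# and returns the block's value so it can be used inline.
def timed(label)
  result = nil
  seconds = Benchmark.realtime { result = yield }
  puts format("%-10s %.3fs", label, seconds)
  result
end

# Usage, assuming `b` is the Watir::Browser from the question:
#   table = b.table(:index, 1)
#   timed("locate") { table.exists? }  # forces the element lookup
#   rows = timed("to_a") { table.to_a } # the conversion being questioned
```

Timing the lookup and the conversion separately shows which of the two dominates before you start optimising either one.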
Upvotes: 0
Reputation: 6058
A quick workaround would be to use Nokogiri if you're just reading data from a big page:
require 'nokogiri'
doc = Nokogiri::HTML.parse(b.table(:index, 1).html)
I'd love to see more detail though. If you can provide a code + HTML example that demonstrates the issue, please file it in the issue tracker.
Upvotes: 6