How to extract text from <script> tag by using nokogiri and mechanize?

Question

This is part of the source code of a booking website:

I want to extract booking.env.b_hotel_id, so that I get the value '25523'. How do I achieve this with Nokogiri and Mechanize?

Hope somebody can help! Thanks! :)

Jason · Accepted Answer

require 'mechanize'

agent = Mechanize.new
page = agent.get('http://www.booking.com/hotel/us/solera-by-stay-alfred.html?label=gen173nr-17CAEoggJCAlhYSDNiBW5vcmVmcgV1c19ueYgBAZgBMbgBBMgBBNgBAegBAfgBAg;sid=695d6598485cb1a8fd9e39c5de3878ba;dcid=4;checkin=2015-10-20;checkout=2015-10-21;dist=0;group_adults=2;room1=A%2CA;sb_price_type=total;srfid=cf5d76283b73d34a1d7e0d61cad6974e38a94351X1;type=total;ucfs=1&')

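# Join the text of every <script> element, then scan it for the assignment.
# The dots are escaped so they match literal dots, and [^']* stops at the
# first closing quote instead of running to the last quote on the line.
match = page.search('script').text.scan(/^booking\.env\.b_hotel_id = '[^']*'/)
puts match                    # the whole matched assignment
puts match[0].split("'")[1]   # only the value between the quotes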

Output:

booking.env.b_hotel_id = '1202411'
1202411
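The scan-then-split step can also be written with a capture group, which returns the ID directly and gives nil when the assignment is missing. This is only a minimal sketch: the script text below is a made-up stand-in for what page.search('script').text would return, and HOTEL_ID_PATTERN and extract_hotel_id are illustrative names, not part of Mechanize or Nokogiri.

```ruby
# Hypothetical script contents, standing in for page.search('script').text.
script_text = <<~JS
  var booking = { env: {} };
  booking.env.b_hotel_id = '25523';
  booking.env.b_hotel_name = 'Example Hotel';
JS

# Capture group 1 holds whatever sits between the single quotes.
# \s* tolerates differing whitespace around the equals sign.
HOTEL_ID_PATTERN = /booking\.env\.b_hotel_id\s*=\s*'([^']*)'/

# Returns the captured ID as a String, or nil when no assignment is found.
def extract_hotel_id(text)
  text[HOTEL_ID_PATTERN, 1]
end

hotel_id = extract_hotel_id(script_text)
puts hotel_id
```

Checking for nil before using the result avoids the NoMethodError that match[0].split raises when the page contains no such assignment.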

Pages that helped me figure this out:

http://robdodson.me/crawling-pages-with-mechanize-and-nokogiri/

Parsing javascript function elements with nokogiri

Regular expression - starting and ending with a character string

http://www.rubular.com
