Hodes

Reputation: 895

Ruby Nokogiri Parsing HTML table III

I am using Mechanize/Nokogiri and need to parse an HTML page that contains many tables like this one:

<table width="100%" onclick="javascript:abredown('c7a8e8041a5031f127d5d27f3f071cbb');" class="buscaDestaque" bgcolor="#F7D36A">
  <tr>
    <td rowspan="2" scope="col" style="width:5%"><img src="images/gold.gif" border="0"></td>
    <td scope="col" style="width:45%" class="mais"><b>Community - 2nd Season</b><br />Community - 2&ordf; Temporada<br/><b>Downloads: </b> 2496 <b>Comentários: </b>17<br><b>Avaliação: </b> 10/10</td>
    <td scope="col" style="width:20%">28/03/2011 - 21:07</td>
    <td scope="col" style="width:20%"><a href="javascript:abreinfousuario(1083150)">SubsOTF</a></td>
    <td scope="col" style="width:10%"><img src='images/flag_br.gif' border='0'></td>
  </tr>
  <tr>
    <td colspan="4">Release: <span class="brls">Community.S02E19.HDTV.XviD-LOL/DIMENSION</span></td>
  </tr>
</table>

I want this output for each table:

    Community.S02E19.HDTV.XviD-LOL/DIMENSION, ('c7a8e8041a5031f127d5d27f3f071cbb')

Can anyone help me?

Upvotes: 1

Views: 1068

Answers (1)

Phrogz

Reputation: 303520

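require 'nokogiri'

# html_with_many_tables is the raw HTML string (with Mechanize, e.g. page.body)
html = Nokogiri::HTML(html_with_many_tables)

results = html.css('table.buscaDestaque').map do |table|
  # Pull the quoted ID out of the onclick attribute: abredown('c7a8...')
  jsid = table['onclick'][/'(\w+)'/, 1]
  # The release name is the text of the element with class "brls"
  brls = table.at_css('.brls').text
  "#{brls}, #{jsid}"
end

p results
#=> ["Community.S02E19.HDTV.XviD-LOL/DIMENSION, c7a8e8041a5031f127d5d27f3f071cbb",
#    "AnotherBRLS, anotherJSID"]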
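
The ID extraction above relies on `String#[]` called with a regex and a capture-group index. A minimal sketch of just that step, in plain Ruby without Nokogiri, using the `onclick` value from the question. The `format_release` helper is hypothetical (it is not part of the answer) and only illustrates how to get the `('...')` wrapping shown in the question's desired output:

```ruby
# The onclick value copied from the sample table in the question.
onclick = "javascript:abredown('c7a8e8041a5031f127d5d27f3f071cbb');"

# String#[] with a regex and a group index returns that capture group,
# or nil when the pattern does not match.
jsid = onclick[/'(\w+)'/, 1]

# Hypothetical helper (an assumption, not from the answer): builds the line
# in the format the question asks for, with the ID wrapped in ('...').
def format_release(brls, jsid)
  "#{brls}, ('#{jsid}')"
end

line = format_release("Community.S02E19.HDTV.XviD-LOL/DIMENSION", jsid)
puts line
```

Because the lookup returns `nil` when nothing matches, and `table['onclick']` is itself `nil` when the attribute is absent, tables without a usable `onclick` may need a guard (for example `table['onclick'].to_s[/'(\w+)'/, 1]`) before the result is used.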

Upvotes: 6
