Pabasara Ranathunga

Reputation: 170

python selenium get all the divs

I am trying to scrape a website and get all the div elements in the HTML. The page lists job opportunities, with each job inside a separate div. I want to collect them all so that I can loop over them and extract the data for each job separately.

I haven't coded it yet.

Is there a specific method for getting them? Also, I only want to extract the divs that contain jobs.

Outer HTML

    <div class="row result clickcard" id="p_1d1829a543b1f3a7" data-jk="1d1829a543b1f3a7" data-tn-component="organicJob" data-tu="">
<h2 id="jl_1d1829a543b1f3a7" class="jobtitle">
    <a href="/rc/clk?jk=1d1829a543b1f3a7&amp;fccid=95ea1992d038e9af&amp;vjs=3" target="_blank" rel="noopener nofollow" onmousedown="return rclk(this,jobmap[1],0);" onclick="setRefineByCookie([]); return rclk(this,jobmap[1],true,0);" title="Planner II" class="turnstileLink" data-tn-element="jobTitle">Planner II</a>
    - <span class="new">new</span></h2>
<span class="company">
    Vantage Utility Services</span>

 - <span class="location">Upland, CA</span>
    <table cellspacing="0" cellpadding="0" border="0">
<tbody><tr>
<td class="snip">
<div class="">
    <span class="summary">
            Transportation or giving away of up to 28.5 grams of marijuana, other than concentrated <b>cannabis</b>, or the offering to transport or give away up to 28.5 grams of...</span>
    </div>


<div class="result-link-bar-container">
    <div class="result-link-bar"><span class="date">4 hours ago</span> <span id="tt_set_1" class="tt_set">  -  <a id="sj_1d1829a543b1f3a7" href="#" class="sl resultLink save-job-link " onclick="changeJobState('1d1829a543b1f3a7', 'save', 'linkbar', false, ''); return false;" title="Save this job to my.indeed">save job</a> - <a href="#" id="tog_1" class="sl resultLink more-link " onclick="toggleMoreLinks('1d1829a543b1f3a7'); return false;">more...</a></span><div id="editsaved2_1d1829a543b1f3a7" class="edit_note_content" style="display:none;"></div><script>if (!window['result_1d1829a543b1f3a7']) {window['result_1d1829a543b1f3a7'] = {};}window['result_1d1829a543b1f3a7']['showSource'] = false; window['result_1d1829a543b1f3a7']['source'] = "Vantage Utility Services"; window['result_1d1829a543b1f3a7']['loggedIn'] = false; window['result_1d1829a543b1f3a7']['showMyJobsLinks'] = false;window['result_1d1829a543b1f3a7']['undoAction'] = "unsave";window['result_1d1829a543b1f3a7']['relativeJobAge'] = "4 hours ago";window['result_1d1829a543b1f3a7']['jobKey'] = "1d1829a543b1f3a7"; window['result_1d1829a543b1f3a7']['myIndeedAvailable'] = true; window['result_1d1829a543b1f3a7']['showMoreActionsLink'] = window['result_1d1829a543b1f3a7']['showMoreActionsLink'] || true; window['result_1d1829a543b1f3a7']['resultNumber'] = 1; window['result_1d1829a543b1f3a7']['jobStateChangedToSaved'] = false; window['result_1d1829a543b1f3a7']['searchState'] = "q=Cannabis&amp;fromage=last"; window['result_1d1829a543b1f3a7']['basicPermaLink'] = "https://www.indeed.com"; window['result_1d1829a543b1f3a7']['saveJobFailed'] = false; window['result_1d1829a543b1f3a7']['removeJobFailed'] = false; window['result_1d1829a543b1f3a7']['requestPending'] = false; window['result_1d1829a543b1f3a7']['notesEnabled'] = true; window['result_1d1829a543b1f3a7']['currentPage'] = "serp"; window['result_1d1829a543b1f3a7']['sponsored'] = false;window['result_1d1829a543b1f3a7']['reportJobButtonEnabled'] = false; 
window['result_1d1829a543b1f3a7']['showMyJobsHired'] = false; window['result_1d1829a543b1f3a7']['showSaveForSponsored'] = false; window['result_1d1829a543b1f3a7']['showJobAge'] = true;</script></div></div>

<div class="tab-container">
    <div id="tt_display_1" class="more-links-container result-tab" style="display:none;"><a class="close-link closeLink" title="Close" href="#" onclick="toggleMoreLinks('1d1829a543b1f3a7'); return false;"></a><div id="more_1" class="more_actions"><ul><li><span class="mat">View all <a href="/q-Vantage-Utility-Services-l-Upland,-CA-jobs.html" rel="nofollow">Vantage Utility Services jobs in Upland, CA</a> - <a href="/l-Upland,-CA-jobs.html">Upland jobs</a></span></li><li><span class="mat">Salary Search: <a href="/salaries/Planner-Salaries,-Upland-CA" onmousedown="this.href = appendParamsOnce(this.href, '?campaignid=serp-more&amp;fromjk=1d1829a543b1f3a7&amp;from=serp-more-nofollow');" rel="&quot;nofollow&quot;">Planner salaries in Upland, CA</a></span></li><li><span class="mat">Learn more about working at <a href="/cmp/Vantage-Utility-Services" onmousedown="this.href = appendParamsOnce(this.href, '?fromjk=1d1829a543b1f3a7&amp;from=serp-more&amp;campaignid=serp-more&amp;jcid=424bbfe9ea0cfaab');">Vantage Utility Services</a></span></li><li><span class="mat">Related forums: <a href="/forum/loc/Upland-California.html">Upland, California</a> - <a href="/forum/job/Planner.html">Planner</a> - <a href="/forum/cmp/Vantage-Utility-Services.html">VANTAGE UTILITY SERVICES</a></span></li></ul></div></div><div class="dya-container result-tab"></div>
    <div class="tellafriend-container result-tab email_job_content"></div>
    <div class="sign-in-container result-tab"></div>
    <div class="notes-container result-tab"></div>
</div>

</td>
</tr>
</tbody></table>
</div>

Upvotes: 2

Views: 17045

Answers (3)

Andersson

Reputation: 52685

Try the code below:

div_nodes = driver.find_elements_by_css_selector("div.row.result.clickcard")

Let me know if you need a more specific selector.
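As a rough sketch of how the loop over those job divs might look, assuming the class names from the HTML in the question (`jobtitle`, `company`, `location`) are present on every card: the helper name `extract_jobs` is made up for illustration. It uses the generic `find_elements(by, value)` form, since the `find_elements_by_*` helpers were removed in Selenium 4.3. The locator strategy is passed as the plain string `"css selector"`, which is the value of `By.CSS_SELECTOR`.

```python
# Value of selenium.webdriver.common.by.By.CSS_SELECTOR, written as a literal
# so this helper does not need Selenium at import time.
CSS = "css selector"


def extract_jobs(driver):
    """Return one dict per job card on the page currently loaded in `driver`.

    `extract_jobs` is a hypothetical helper; the selectors are assumptions
    taken from the single job div posted in the question.
    """
    jobs = []
    # Only divs carrying all three classes are matched, so nested divs
    # (summary, link bar, tabs) are skipped.
    for card in driver.find_elements(CSS, "div.row.result.clickcard"):
        jobs.append({
            "id": card.get_attribute("data-jk"),
            "title": card.find_element(CSS, "h2.jobtitle a").text.strip(),
            "company": card.find_element(CSS, "span.company").text.strip(),
            "location": card.find_element(CSS, "span.location").text.strip(),
        })
    return jobs
```

With a real session you would create the driver yourself (for example `webdriver.Chrome()`), load the results page with `driver.get(...)`, and then call `extract_jobs(driver)`. If a card lacks one of the child elements, `find_element` raises `NoSuchElementException`, so you may want to handle that case.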

Upvotes: 1

Vitaliy Moskalyuk

Reputation: 2583

To get all divs you can use:

self.driver.find_elements_by_css_selector('div')

It's also possible to get the text of the entire page with:

self.driver.find_element_by_css_selector('body').text

But getting every div and looping over all of them is a really bad idea. It is much better to find a proper selector, such as a class or id, for the data you want from the page, and to read the data from just those elements.

Upvotes: 1

BcK

Reputation: 2821

You can use

find_elements_by_tag_name('div')

This will return a list of all the div elements in the HTML.
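Because that list also includes every nested div, you would still need to narrow it down to the job entries. A minimal sketch of one way to do that, assuming each job div carries `data-tn-component="organicJob"` as in the HTML from the question: the helper name `job_divs` is hypothetical. The call is written in the `find_elements(by, value)` form, because `find_elements_by_tag_name` was removed in Selenium 4.3, and `"tag name"` is the value of `By.TAG_NAME`.

```python
# Value of selenium.webdriver.common.by.By.TAG_NAME, written as a literal
# so this helper does not need Selenium at import time.
TAG = "tag name"


def job_divs(driver):
    """Return only the divs that look like job cards (hypothetical helper)."""
    all_divs = driver.find_elements(TAG, "div")
    # Keep a div only if its data-tn-component attribute marks it as a job;
    # get_attribute returns None when the attribute is absent.
    return [
        div for div in all_divs
        if div.get_attribute("data-tn-component") == "organicJob"
    ]
```

Each `get_attribute` call is a separate round trip to the browser, so on a large page this filter is slower than asking for the job divs directly with a CSS selector.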

Upvotes: 4
