Reputation: 67
I'm using Python and Selenium to scrape a website. I used the find_element_by_* methods to find all the values I need, but I've run into something more challenging. The site's HTML uses exactly the same structure for two different values, so I cannot use a simple find_element_by_class_name, because the elements have the same classes and ids. I don't want to use a fixed XPath or selector because I am iterating through many "flight-row" divs and it would make things more hardcoded.
<div class="flight-row">
  <div class="row row-eq-heights">
    <div class="col-xs-4 col-md-4 no-padding"><span class="airline-name">gol</span><span class="flight-number">AM-477</span></div>
    <div class="col-xs-4 col-md-4">
      <div class="flight-timming"><span class="flight-time">06:15</span><span class="flight-destination">IAH</span></div>
      <span class="flight-data">01/10/19</span>
    </div>
    <div class="col-xs-4 col-md-4 no-padding">
      <div class="duration"><span class="flight-duration">21:25</span><span class="flight-stops" aria-label="Paradas do voo">2 paradas</span></div>
    </div>
    <div class="col-xs-4 col-md-4">
      <div class="flight-timming"><span class="flight-destination">GIG</span><span class="flight-time">05:40</span></div>
      <span class="flight-data">02/10/19</span>
    </div>
  </div>
</div>
I want to get the values of flight-time, flight-destination and flight-data from both "col-xs-4 col-md-4" divs.
Here is part of my code:
outbound_flights = driver.find_elements_by_css_selector("div[class^='flight-item ']")
for outbound_flight in outbound_flights:
    airline = outbound_flight.find_element_by_css_selector("span[class='airline-name']")
Thank you!
Upvotes: 1
Views: 1033
Reputation: 33384
Try the following CSS selector to get flight-time, flight-destination and flight-data:
outbound_flights = driver.find_elements_by_css_selector("div.col-xs-4.col-md-4:not(.no-padding)")
for outbound_flight in outbound_flights:
    flight_time = outbound_flight.find_element_by_css_selector("div.flight-timming span.flight-time").text
    print(flight_time)
    flight_destination = outbound_flight.find_element_by_css_selector("div.flight-timming span.flight-destination").text
    print(flight_destination)
    flight_data = outbound_flight.find_element_by_css_selector("span.flight-data").text
    print(flight_data)
Output:
06:15
IAH
01/10/19
05:40
GIG
02/10/19
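Since the question iterates over many "flight-row" divs, the same selector can also be applied relative to each row so the two legs of a flight stay together. The following is only a minimal sketch: it assumes the Selenium 4 style find_element(by, value) / find_elements(by, value) calls (recent Selenium 4 releases no longer provide the find_element_by_* helpers), and the names extract_legs, CSS and LEG_SELECTOR are made up for illustration.

```python
# "css selector" is the string value behind selenium's By.CSS_SELECTOR.
# It is used directly so this sketch has no import-time dependency on selenium.
CSS = "css selector"

# Matches the two departure/arrival columns, skipping the "no-padding" ones.
LEG_SELECTOR = "div.col-xs-4.col-md-4:not(.no-padding)"


def extract_legs(row):
    """Return one dict per leg found inside a single flight-row element.

    `row` is assumed to be a Selenium WebElement (hypothetical helper,
    not part of the Selenium API).
    """
    legs = []
    for leg in row.find_elements(CSS, LEG_SELECTOR):
        legs.append({
            "time": leg.find_element(CSS, "div.flight-timming span.flight-time").text,
            "destination": leg.find_element(CSS, "div.flight-timming span.flight-destination").text,
            "date": leg.find_element(CSS, "span.flight-data").text,
        })
    return legs
```

With a live driver this might be called as: for row in driver.find_elements(CSS, "div.flight-row"): departure, arrival = extract_legs(row). That unpacking assumes every row holds exactly two such columns, as in the HTML from the question.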
EDITED Answer:
outbound_flights = driver.find_elements_by_css_selector("div.col-xs-4.col-md-4:not(.no-padding)")
flighttime = []
for outbound_flight in outbound_flights:
    flight_time = outbound_flight.find_element_by_css_selector("div.flight-timming span.flight-time").text
    print(flight_time)
    flighttime.append(flight_time)
    flight_destination = outbound_flight.find_element_by_css_selector("div.flight-timming span.flight-destination").text
    print(flight_destination)
    flight_data = outbound_flight.find_element_by_css_selector("span.flight-data").text
    print(flight_data)
departure_time = flighttime[0]
arrival_time = flighttime[1]
print("Departure time :" + departure_time)
print("Arrival time :" + arrival_time)
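Reading flighttime[0] and flighttime[1] only covers the first flight when the page contains several rows. One possible way to handle a page-wide list is to group it two items at a time. This is a rough sketch that assumes every flight contributes exactly one departure followed by one arrival; pair_up is a hypothetical helper, and the second flight in the sample data is invented for the example.

```python
def pair_up(values):
    """Group a flat list into (departure, arrival) tuples, two items at a time."""
    if len(values) % 2 != 0:
        raise ValueError("expected one departure and one arrival per flight")
    return [(values[i], values[i + 1]) for i in range(0, len(values), 2)]


# The first two times come from the question's HTML; the last two are made up.
flight_times = ["06:15", "05:40", "09:00", "13:30"]
for departure_time, arrival_time in pair_up(flight_times):
    print("Departure time: " + departure_time)
    print("Arrival time: " + arrival_time)
```

The same helper could be applied to the destination and date lists if they are collected the same way.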
Upvotes: 1
Reputation: 2545
You can get the values by index with XPath: (//*[@class='flight-time'])[1] and (//*[@class='flight-time'])[2].
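When looping over many rows, an XPath that starts with // searches the whole document even if it is called on a row element, so a leading dot is needed to keep the lookup inside the current row. A small sketch of building such an expression follows; indexed_xpath is a hypothetical helper, not a Selenium function.

```python
def indexed_xpath(class_name, position):
    """Build an XPath for the position-th element (1-based) whose class
    attribute equals class_name, relative to the context element."""
    if position < 1:
        raise ValueError("XPath positions start at 1")
    # The leading "." scopes the search to the element the XPath is run on.
    return "(.//*[@class='{}'])[{}]".format(class_name, position)


departure_xpath = indexed_xpath("flight-time", 1)
arrival_xpath = indexed_xpath("flight-time", 2)
print(departure_xpath)
print(arrival_xpath)
```

Assuming Selenium 4, where "xpath" is the string value of By.XPATH, a row element could then be queried with row.find_element("xpath", departure_xpath).text.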
Upvotes: 1