Reputation: 325
I'm working on an assignment for class. I need to write something that will return the first row of the table on this webpage (the Barr v. Lee row): https://www.supremecourt.gov/opinions/slipopinion/19
I've seen other questions that some might consider similar, but they don't seem to answer my question. In most of them, the asker already has the table in hand rather than needing to pull it down from a website.
Or, maybe I just can't see the resemblance. I've been scraping for about a week now.
Right now, I'm trying to build a loop that goes through all the div elements with an incrementing counter, so that the counter tells me which div holds that row and I can drill into it.
This is what I have so far:
for divs in soup_doc:
    div_counter = 0
    soup_doc.find_all('div')[div_counter]
    div_counter = div_counter + 1
    print(div_counter)
But right now it's only returning 1, which I know isn't right. What should I do to fix this? Or is there a better way to go about getting this information?
My output should be:
63
7/14/20
20A8
Barr v. Lee
PC
591/2
Upvotes: 2
Views: 1243
Reputation: 20018
To get the first row, you can use the CSS selector .in tr:nth-of-type(2) td:
import requests
from bs4 import BeautifulSoup
URL = "https://www.supremecourt.gov/opinions/slipopinion/19"
soup = BeautifulSoup(requests.get(URL).content, "html.parser")
for tag in soup.select('.in tr:nth-of-type(2) td'):
    print(tag.text)
Output:
63
7/14/20
20A8
Barr v. Lee
PC
591/2
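The selector can also be tried offline. The sketch below is only an illustration: SAMPLE_HTML is hand-written, simplified markup (an assumption, not the real page source), the second data row is made-up placeholder data, and first_data_row is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified stand-in for the page; the real markup is assumed
# to have a header row followed by data rows inside an element with class "in".
SAMPLE_HTML = """
<div class="in">
  <table>
    <tr><th>R-</th><th>Date</th><th>Docket</th><th>Name</th><th>J.</th><th>Pt.</th></tr>
    <tr><td>63</td><td>7/14/20</td><td>20A8</td><td>Barr v. Lee</td><td>PC</td><td>591/2</td></tr>
    <tr><td>0</td><td>1/1/00</td><td>00-000</td><td>Placeholder v. Example</td><td>X</td><td>000/0</td></tr>
  </table>
</div>
"""


def first_data_row(html):
    """Return the cell texts of the second <tr> (the first row after the header)."""
    soup = BeautifulSoup(html, "html.parser")
    # tr:nth-of-type(2) matches a <tr> that is the second <tr> among its siblings.
    cells = soup.select(".in tr:nth-of-type(2) td")
    return [cell.get_text(strip=True) for cell in cells]


first_row = first_data_row(SAMPLE_HTML)
for value in first_row:
    print(value)
```

Collecting the cells into a list rather than printing them directly makes it easier to reuse the row later, for example to write it to a file.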
Upvotes: 1
Reputation: 499
In your example, the line div_counter = 0 has to go before your loop, like this:
div_counter = 0
for divs in soup_doc:
    soup_doc.find_all('div')[div_counter]
    div_counter = div_counter + 1
    print(div_counter)
You always get 1 because you set div_counter to 0 inside your for loop at the beginning of each iteration and then add 1.
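As a possible alternative to a hand-maintained counter, enumerate can supply the index while looping directly over the result of find_all('div'). Note that iterating over soup_doc itself walks its top-level children, not the div elements. This is a minimal sketch: SAMPLE_HTML and its id values are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the downloaded page.
SAMPLE_HTML = "<div id='a'><div id='b'></div></div><div id='c'></div>"

soup_doc = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find_all('div') returns every div, nested ones included, in document order;
# enumerate pairs each one with its position in that list.
indexed_divs = [
    (index, div.get("id"))
    for index, div in enumerate(soup_doc.find_all("div"))
]

for index, div_id in indexed_divs:
    print(index, div_id)
```

The index printed for a div is the same number you would pass to soup_doc.find_all('div')[index] to get that div back.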
Upvotes: 1