CSharpdocsz

Reputation: 81

BeautifulSoup printing the same results twice

import re
import time

import requests
from bs4 import BeautifulSoup

URL = "https://bitcointalk.org/index.php?board=1.0"
page = requests.get(URL)
soup = BeautifulSoup(page.content, 'html.parser')
numberOfPages = 0
currentPage = 0
counter = 1

for blabla in soup.find_all("a" , attrs={"class" : "navPages"})[-2]:  
    numberOfPages = int(blabla.string)
    print("Pages count: " + str(numberOfPages))
  

for i in range(0,numberOfPages):
    URLX = "https://bitcointalk.org/index.php?board=1."+ str(currentPage)
    print(URLX)
    print("------------------------------------------------- Page count is: " + str(counter))
    counter += 1
    currentPage += 20
    page1 = requests.get(URLX)
    soup1 = BeautifulSoup(page1.content, 'html.parser')   
    time.sleep(1.0)
    for random in soup1.find_all("span", attrs={"id": re.compile("^msg")}):
        for b in random.find_all('a', href=True):
            print (b.string)

I'm trying to go through all the pages of the "Bitcoin Discussion" board and print the topic names from each page. It works, but for some reason it prints each page's topic names twice while going through different pages. For example:

URL (firstpage): https://bitcointalk.org/index.php?board=1.0

would print its actual content:

ABC123

anotherTopic

Then... even when the URL changes to the second page, it would still print the same topics.

And then the same thing happens for all the other pages. Each page gets printed twice (even though the URL is changing).

Any thoughts? This is my first experience with Python and BeautifulSoup.

Upvotes: 0

Views: 124

Answers (1)

Krishna Chaurasia

Reputation: 9572

The links for the different pages are as follows, i.e. the number after "board=1." increases in steps of 40:

https://bitcointalk.org/index.php?board=1.0
https://bitcointalk.org/index.php?board=1.40
https://bitcointalk.org/index.php?board=1.80
https://bitcointalk.org/index.php?board=1.120

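So, it should be currentPage += 40 instead of the current currentPage += 20. With a step of 20, every second request most likely lands inside a page that was already fetched, which would explain why each page is printed twice.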
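
As a minimal sketch of the corrected paging, the offsets can be computed from the page index instead of being tracked in a separate counter. The helper name page_urls and its topics_per_page parameter are hypothetical and only used for illustration; the step of 40 is taken from the links listed above.

```python
def page_urls(number_of_pages, topics_per_page=40,
              base="https://bitcointalk.org/index.php?board=1."):
    """Build the URL of every board page from its topic offset.

    page_urls and topics_per_page are illustrative names, not part of
    the original script; 40 is the step observed in the board's links.
    """
    # The offset is the page index times the page size: 0, 40, 80, ...
    return [base + str(page * topics_per_page)
            for page in range(number_of_pages)]


# Print the first four page URLs; no network request is made here.
for url in page_urls(4):
    print(url)
```

Each URL returned by the helper could then be passed to requests.get in the existing loop, which removes the need to update currentPage by hand.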

Upvotes: 1
