Steel Hard

Reputation: 37

Scraping returning only one value

I wanted to scrape something as my first program, just to learn the basics, but I'm having trouble showing more than one result.

The idea is to go to a forum (http://blackhatworld.com), scrape all thread titles, and compare each one with a string. If a title contains the word "free", it is printed; otherwise it isn't.

Here's the current code:

import requests
from bs4 import BeautifulSoup


page = requests.get('https://www.blackhatworld.com/')
content = BeautifulSoup(page.content, 'html.parser')
threadtitles = content.find_all('a', class_='PreviewTooltip')


n=0
for x in range(len(threadtitles)):
    test = list(threadtitles)[n]
    test2 = list(test)[0]
    if test2.find('free') == -1:
        n=n+1
    else:
        print(test2)
        n=n+1

This is the result of running the program: https://i.gyazo.com/6cf1e135b16b04f0807963ce21b2b9be.png

As you can see, it checks for the word "free" and it works, but it only shows the first result even though there are several more on the page.

Upvotes: 0

Views: 72

Answers (2)

petezurich

Reputation: 10184

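To solve your problem and simplify your code, try this. It iterates over the tags directly and lowercases each title before checking, so "Free" and "FREE" match too: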

import requests
from bs4 import BeautifulSoup

page = requests.get('https://www.blackhatworld.com/')
content = BeautifulSoup(page.content, 'html.parser')
threadtitles = content.find_all('a', class_='PreviewTooltip')

# Counts the titles that do NOT contain "free"
count = 0

for title in threadtitles:
    # Lowercase the text so the check ignores case
    if "free" in title.get_text().lower():
        print(title.get_text())
    else:
        count += 1

print(count)

Bonus: print the value of each link's href:

for title in threadtitles:
    print(title["href"])

See also this.
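
If you want to try the matching logic without hitting the site, here is a minimal sketch that works on plain strings. The helper name `titles_containing` and the sample titles are made up for illustration; they are not from the forum page.

```python
def titles_containing(titles, word):
    """Return the titles that contain `word`, ignoring case."""
    needle = word.lower()
    return [t for t in titles if needle in t.lower()]


# Hypothetical sample titles standing in for the scraped text
sample_titles = [
    "FREE proxies list",
    "How to rank a new site",
    "Free SEO tools giveaway",
    "Looking for a freelancer",
]

matches = titles_containing(sample_titles, "free")
for title in matches:
    print(title)

# Number of titles without a match
print(len(sample_titles) - len(matches))
```

With the scraped tags you would pass in something like `[t.get_text() for t in threadtitles]`. Note that this is a substring check, so a title containing "freelancer" also matches.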

Upvotes: 1

João Eduardo

Reputation: 472

By default, string comparison is case-sensitive ("FREE" != "free"). To solve your problem, first convert test2 to lowercase:

test2 = list(test)[0].lower()
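
A quick way to see the difference is to run `find` on a single string. The title below is a made-up example, not one taken from the page:

```python
title = "FREE proxies list"  # hypothetical thread title

# find() is case-sensitive, so the uppercase title is not matched
case_sensitive = title.find("free")
print(case_sensitive)  # -1

# Lowercasing first makes the search match at index 0
case_insensitive = title.lower().find("free")
print(case_insensitive)  # 0
```

Since `find` returns -1 when nothing is found, the original `== -1` check silently skips every title written as "Free" or "FREE".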

Upvotes: 1
