Vaibhav Sinha

Reputation: 69

Print just first output line

I have written code that extracts certain text from a specified URL, but it prints the same output two or three times (depending on the web page), each on its own line. I only need the first one. How should I do that? This is my code:

import requests, re
from bs4 import BeautifulSoup

url = "http://www.barneys.com/raf-simons-%22boys%22-poplin-shirt-504182589.html#start=2"
r = requests.get(url)
# Name the parser explicitly so the result does not depend on what is installed
soup = BeautifulSoup(r.content, "html.parser")
links = soup.find_all("a")
g_d4 = soup.find_all("ol", {"class": "breadcrumb"})
for item in g_d4:
    links_2 = soup.find_all('a', href=re.compile('^http://www.barneys.com/barneys-new-york/men/'))
    # Raw string so that \w reaches the regex engine unchanged
    pattern_2 = re.compile(r"clothing/(\w+)")
    for link in links_2:
        match_1 = pattern_2.search(link["href"])
        if match_1:
            print(match_1.group(1))

My output is:

         shirts
         shirts
         shirts

I want my output to be just:

         shirts

What should I do?

Upvotes: 0

Views: 42

Answers (1)

miki725

Reputation: 27861

Not sure which of the two you need, so I'll answer both.

unique results

If you want unique results from across the page, you can use sets to do something like:

for item in g_d4:
    links_2=soup.find_all('a', href=re.compile('^http://www.barneys.com/barneys-new-york/men/'))
    pattern_2=re.compile(r"clothing/(\w+)")
    matches = set()
    for link in links_2:
        match_1=pattern_2.search(link["href"])
        if match_1:
            matches.add(match_1.group(1))
    print(matches)
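
A set does not remember the order in which items were added. If the order of first appearance matters, one possible variation is to deduplicate with `dict.fromkeys` instead. The sketch below is only an illustration: the `hrefs` list is made-up sample data standing in for the `link["href"]` values, and `unique_categories` is a hypothetical helper name, not part of the original code.

```python
import re

# Hypothetical sample data standing in for the link["href"] values on the page.
hrefs = [
    "http://www.barneys.com/barneys-new-york/men/clothing/shirts",
    "http://www.barneys.com/barneys-new-york/men/clothing/shirts/dress",
    "http://www.barneys.com/barneys-new-york/men/shoes",
    "http://www.barneys.com/barneys-new-york/men/clothing/pants",
    "http://www.barneys.com/barneys-new-york/men/clothing/shirts",
]

pattern = re.compile(r"clothing/(\w+)")

def unique_categories(hrefs):
    """Return each matched category once, in order of first appearance."""
    found = (m.group(1) for m in map(pattern.search, hrefs) if m)
    # dict keys keep insertion order (Python 3.7+), unlike a set
    return list(dict.fromkeys(found))

print(unique_categories(hrefs))  # ['shirts', 'pants']
```

With the real page you would pass `(link["href"] for link in links_2)` instead of the sample list.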

single result

If you want just the first result in each iteration, you can break out of the inner loop as soon as a match is printed:

for item in g_d4:
    links_2=soup.find_all('a', href=re.compile('^http://www.barneys.com/barneys-new-york/men/'))
    pattern_2=re.compile(r"clothing/(\w+)")
    for link in links_2:
        match_1=pattern_2.search(link["href"])
        if match_1:
            print(match_1.group(1))
            break
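
The same "stop at the first match" idea can also be written without an explicit `break`, by asking a generator for its first item with `next()`. This is a minimal sketch under assumptions: `first_category` and the `sample_hrefs` list are hypothetical names and data used for illustration, standing in for the `link["href"]` values from the page.

```python
import re

pattern = re.compile(r"clothing/(\w+)")

def first_category(hrefs, default=None):
    """Return the first matched category, or `default` if nothing matches."""
    matches = (m.group(1) for m in map(pattern.search, hrefs) if m)
    # next() pulls only the first item; the remaining hrefs are never searched
    return next(matches, default)

# Hypothetical sample data standing in for the link["href"] values on the page.
sample_hrefs = [
    "http://www.barneys.com/barneys-new-york/men/shoes",
    "http://www.barneys.com/barneys-new-york/men/clothing/shirts",
    "http://www.barneys.com/barneys-new-york/men/clothing/shirts",
]

print(first_category(sample_hrefs))  # shirts
```

Passing a default avoids a `StopIteration` error when no link matches, which the `break` version handles implicitly by printing nothing.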

Upvotes: 1
