Python BeautifulSoup Getting First 50 Values

Question

I am trying to get some checkbox values from a list of links. My goal is to get 50 values and join them with commas before appending them to a text file line by line. I've tried a few solutions with my beginner's mind, with zero success.

Example HTML checkbox code from the target (mykeyworder.com/keywords?tags=vector&exclude=&language=en):

My complete code:

import requests
import os.path
import io

from bs4 import BeautifulSoup

# Strip each line of links.txt and use it as the URL to scrape,
# then save the result in a text file

with open("links.txt", "r") as a_file:
  for line in a_file:
    stripped_line = line.strip()
    data = BeautifulSoup(requests.get(stripped_line).text, "html.parser")
    with open("tags.txt", "a", encoding="utf-8") as f_out:
        for inp in data.select('div.col-md-2 > input[checked=""]'):
            lst=data[0].text.strip().split()
            text=",".join(lst)
            print(text, file=f_out)

Bhavya Parikh · Accepted Answer

import requests
from bs4 import BeautifulSoup
res=requests.get("https://mykeyworder.com/keywords?tags=vector&exclude=&language=en")
soup=BeautifulSoup(res.text,"html.parser")

You can use a CSS selector that finds every input element with checked="" inside div.col-md-2 and returns the matching elements from the soup:

data=soup.select('div.col-md-2 > input[checked=""]')
lst=data[0].text.strip().split()
text=",".join(lst)
print(text)

Output:

illustration,design,vector,background,graphic,modern,set,abstract,art,white,card,template,isolated,icon,shape,line,sign,decoration,collection,blue,pattern,vintage,style,banner,symbol,web,summer,poster,element,creative,black,wallpaper,flat,geometric,floral,circle,fashion,digital,travel,trendy,nature,green,cover,invitation,leaf,fun,decor,color,sketch,texture
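To connect this back to the loop over links.txt in the question, here is a minimal sketch that applies the same selector to every link and appends one comma-separated line per page to tags.txt. It rests on assumptions: the helper names first_values and scrape_links are made up for illustration, and it reads each checkbox's value attribute, which the page's markup may or may not use (the answer above reads .text instead). Adjust that line if the keywords live somewhere else.

```python
import requests
from bs4 import BeautifulSoup

# Selector taken from the answer above.
SELECTOR = 'div.col-md-2 > input[checked=""]'


def first_values(html, limit=50):
    """Return up to `limit` checked-checkbox values from `html`, comma-separated."""
    soup = BeautifulSoup(html, "html.parser")
    values = []
    for inp in soup.select(SELECTOR):
        # Assumption: the keyword is stored in the `value` attribute.
        value = inp.get("value", "").strip()
        if value:
            values.append(value)
    return ",".join(values[:limit])


def scrape_links(links_path="links.txt", out_path="tags.txt"):
    """Fetch every URL listed in `links_path` and append one line per URL to `out_path`."""
    with open(links_path, "r") as links, open(out_path, "a", encoding="utf-8") as out:
        for line in links:
            url = line.strip()
            if not url:
                continue  # skip blank lines
            response = requests.get(url, timeout=30)
            response.raise_for_status()
            print(first_values(response.text), file=out)
```

Calling scrape_links() with the default arguments would read links.txt and append to tags.txt, as in the question. The limit is applied by slicing the list, so pages with fewer than 50 checked boxes simply yield a shorter line.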
