BrickleRex

Reputation: 84

Removal of items from list isn't working

I'm working on a pretty cool project, but I need help. I'm collecting proxies from sslproxies.org, but sorting the proxies collected from the table into a list without the extra info is pretty hard. So far my code isn't working. What I want to do is keep two items from the list, delete the next six, and repeat that for the whole list. Hope you guys can help.

import requests
from bs4 import BeautifulSoup

f = open("proxies.txt", 'w+')
def getProxy():
    url = "https://www.sslproxies.org"
    source_code = requests.get(url)
    plain_text = source_code.text
    soup = BeautifulSoup(plain_text, "html.parser")
    global tlist
    tlist = []
    for tr in soup.find_all('tr'):
        for td in tr.find_all('td'):
            tlist.append(td)
    clist = tlist
    count = 0
    for word in clist:
        count += 1
        if count > 2:
            clist.remove(word)
            count += 1
            if count >= 6:
                count = 0
        else:
            continue
f.write(str(clist))

Upvotes: 0

Views: 89

Answers (2)

Patrick Haugh

Reputation: 61063

Here is a generator that yields two items, then skips six, then yields two more, and so on:

def skip_six(l):
    for i, x in enumerate(l):
        if i%8 <= 1:
            yield x

You can use this to build the filtered list like this:

clist = list(skip_six(tlist))
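
As a quick offline check, here is a minimal sketch of the generator running on made-up data. The `cells` list and its values are hypothetical stand-ins for the scraped `td` cells, assuming the table has eight columns per row with the IP address and port first. It also avoids the problem in the question's loop, which removes items from the same list it is iterating over, so the iterator skips elements.

```python
def skip_six(items):
    """Yield two items, skip the next six, and repeat."""
    for i, x in enumerate(items):
        if i % 8 <= 1:
            yield x

# Hypothetical sample data: two table rows of eight cells each.
cells = [
    "1.2.3.4", "8080", "US", "United States", "anonymous", "no", "yes", "1 minute ago",
    "5.6.7.8", "3128", "DE", "Germany", "elite proxy", "no", "yes", "2 minutes ago",
]

# Build a new list instead of mutating the one being iterated.
proxies = list(skip_six(cells))
print(proxies)  # ['1.2.3.4', '8080', '5.6.7.8', '3128']

# Optionally pair each address with its port.
pairs = list(zip(proxies[0::2], proxies[1::2]))
print(pairs)  # [('1.2.3.4', '8080'), ('5.6.7.8', '3128')]
```

With real scraped data you would probably want `td.text` rather than the `td` tag objects, so that the list holds plain strings.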

Upvotes: 2

Shijo

Reputation: 9731

I believe you want to select the first two columns. In that case, you may want to try something like this with pandas read_html. Note that I cannot access the website you mentioned, so I haven't tested this code.

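import pandas as pd
# read_html returns a list of DataFrames, one per <table> found on the page
tables = pd.read_html(io='https://www.sslproxies.org')
df = tables[0]  # assumption: the proxy table is the first table on the page
print(df)
print(df[['IP Address', 'Port']])  # select the columns that you are interested in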
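
If pandas is not an option, the same idea (keep only the first two columns of each row) can be sketched with the standard library's `html.parser`. This is a different technique from `read_html`, shown only as an illustration: the `FirstTwoCells` class and the `sample_html` table are hypothetical, and the real page's markup may differ.

```python
from html.parser import HTMLParser

class FirstTwoCells(HTMLParser):
    """Collect the text of the first two <td> cells in every <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []       # one [ip, port] list per data row
        self._row = None     # cells of the row currently being parsed
        self._in_td = False  # True while inside a <td> element

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_td = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_td = False
        elif tag == "tr" and self._row is not None:
            # Header rows use <th>, so they contribute no <td> cells.
            if len(self._row) >= 2:
                self.rows.append(self._row[:2])
            self._row = None

    def handle_data(self, data):
        if self._in_td:
            self._row[-1] += data.strip()

# Hypothetical sample markup standing in for the real page.
sample_html = """
<table>
  <tr><th>IP Address</th><th>Port</th><th>Code</th></tr>
  <tr><td>1.2.3.4</td><td>8080</td><td>US</td></tr>
  <tr><td>5.6.7.8</td><td>3128</td><td>DE</td></tr>
</table>
"""

parser = FirstTwoCells()
parser.feed(sample_html)
print(parser.rows)  # [['1.2.3.4', '8080'], ['5.6.7.8', '3128']]
```

Working row by row means there is no need to count cells across the whole table, so the result does not depend on every row having exactly eight columns.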

Upvotes: 0
