Allexj

Reputation: 1487

How to make Python treat each link in a list as a single item

I have this script:

import urllib.request
from bs4 import BeautifulSoup

url= 'https://www.inforge.net/xi/forums/liste-proxy.1118/'
soup = BeautifulSoup(urllib.request.urlopen(url), "lxml")

base = ("https://www.inforge.net/xi/")

for tag in soup.find_all('a', {'class':'PreviewTooltip'}):
    links = (tag.get('href'))
    final = base + links

print (final[0])

which collects the link of every topic on this page.

The problem is that when I print(final[0]) the output is:

h

instead of the entire link. Can someone help me with this?

Upvotes: 2

Views: 60

Answers (1)

Dimitris Fasarakis Hilliard

Reputation: 160617

final is a str, so indexing it at position 0 returns the first character of the URL, namely h.

Either print all of final, if you're using it as a str:

print(final)

or, if you must have a list, make final a list in the for loop by enclosing it in square brackets []:

final = [base + links]

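then print(final[0]) will print the first element of the list as you'd expect. Note that final is reassigned on every iteration of the loop, so after the loop it holds only the last link that was processed.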
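
If the goal is to keep every link rather than only one, a possible approach is to create the list once before the loop and append to it. This is a minimal sketch, not the original script: the hrefs list is a hypothetical stand-in for the values that tag.get('href') would return, so it runs without a network request.

```python
base = "https://www.inforge.net/xi/"

# Hypothetical href values standing in for tag.get('href') results
hrefs = ["threads/example-one.1/", "threads/example-two.2/"]

final = []                      # create the list once, before the loop
for href in hrefs:
    final.append(base + href)   # each full URL becomes one list item

print(final[0])  # https://www.inforge.net/xi/threads/example-one.1/
```

With this shape, final[0] is a whole URL and len(final) is the number of links collected.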


As @Bryan pointed out and I just noticed, you may be confused about what () does in Python. Around a single expression with no comma, parentheses only group; they do not change the type of the value. It is the comma that creates a tuple (not a list; lists use square brackets []).

So:

base = ("https://www.inforge.net/xi/")

results in base referring to a value of str type while:

base = ("https://www.inforge.net/xi/", )
# which can also be written as:
base =  "https://www.inforge.net/xi/",

results in base referring to a value of tuple type with a single element.

The same applies for the name links:

links = (tag.get('href'))   # 'str'
links = (tag.get('href'), ) # 'tuple'
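
A quick way to check this yourself is to inspect the types directly. The variable names below (base_str, base_tuple, base_bare) are made up for this illustration:

```python
base_str = ("https://www.inforge.net/xi/")     # parentheses only: still a str
base_tuple = ("https://www.inforge.net/xi/",)  # trailing comma: a 1-element tuple
base_bare = "https://www.inforge.net/xi/",     # the comma alone is enough

print(type(base_str).__name__)    # str
print(type(base_tuple).__name__)  # tuple
print(base_tuple == base_bare)    # True
```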

If you change links and base to be tuples, then final = base + links concatenates them, and final ends up as a two-element tuple. In that case you should join the elements of the tuple in your print call:

print ("".join(final))  # takes all elements in final and joins them together 
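
Put together, the tuple variant might look like the sketch below. The href value is a hypothetical placeholder for what tag.get('href') would return, and the url name is introduced only for this example:

```python
base = ("https://www.inforge.net/xi/",)
links = ("threads/example.1/",)   # hypothetical href value

final = base + links              # tuple concatenation, not string concatenation
print(len(final))                 # 2

url = "".join(final)              # glue the two pieces into one string
print(url)                        # https://www.inforge.net/xi/threads/example.1/
```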

Upvotes: 4
