Reputation: 1511
I am trying to add a string in the middle of a URL. Somehow my output looks like this:
http://www.Holiday.com/('Woman',)/Beach
http://www.Holiday.com/('Men',)/Beach
It should look like this:
http://www.Holiday.com/Woman/Beach
http://www.Holiday.com/Men/Beach
The code which I am using looks like the following:
list = {'Woman','Men'}
url_test = 'http://www.Holiday.com/{}/Beach'
for i in zip(list):
    url = url_test.format(str(i))
    print(url)
Upvotes: 4
Views: 11115
Reputation: 3
import re
from urllib.request import urlopen
from bs4 import BeautifulSoup as BS
url = "https://www.imdb.com/chart/top?ref_=nv_mv_250"
html = urlopen(url)
url_list = BS(html, 'lxml')
type(url_list)
all_links = url_list.find_all('a', href=re.compile("/title/tt"))
for link in all_links:
    print(link.get("href"))
    all_urls = link.get("href")
url_test = 'http://www.imdb.com/{}/'
for i in all_urls:
    urls = url_test.format(i)
    print(urls)
This is the code to scrape the URLs of all 250 movies from the main URL,
but it gives the following result:
http://www.imdb.com///
http://www.imdb.com/t/
http://www.imdb.com/i/
http://www.imdb.com/t/
http://www.imdb.com/l/
http://www.imdb.com/e/
http://www.imdb.com///
and so on ...
How can I split 'all_urls' using a comma, or how can I make a list of URLs in
'all_urls'?
Upvotes: 0
Reputation: 16081
You can also try this. And please don't use list as a variable name.
lst = {'Woman','Men'}
url_test = 'http://www.Holiday.com/%s/Beach'
for i in lst:
    url = url_test % i
    print(url)
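To illustrate why the name matters, here is a minimal sketch, assuming the name list is reused later in the same scope. The function name shadowing_breaks_builtin is made up for this example. Once list is bound to a set, the built-in list() is no longer reachable under that name:

```python
def shadowing_breaks_builtin():
    """Return True if rebinding the name `list` hides the built-in."""
    # Hypothetical illustration: `list` now refers to a set, not the built-in type.
    list = {'Woman', 'Men'}
    try:
        # This would normally build ['a', 'b', 'c'], but a set is not callable.
        list('abc')
    except TypeError:
        return True
    return False


print(shadowing_breaks_builtin())
```

Using a different name such as lst or items avoids the problem entirely.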
Upvotes: 2
Reputation: 37299
Almost there. There is just no need for zip:
items = {'Woman','Men'} # notice that this is a `set` and not a list
url_test = 'http://www.Holiday.com/{}/Beach'
for i in items:
    url = url_test.format(i)
    print(url)
The purpose of the zip function is to join several collections by the index of the item. When zip joins the values from each collection, it places them in a tuple, whose __str__ representation is exactly what you got. Because you passed only one collection, each tuple holds a single item, such as ('Woman',).
Here you just want to iterate over the items in the collection.
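A small sketch of the difference, assuming a list instead of the original set so that the order is predictable. The names zipped, bad_urls and good_urls are only for illustration:

```python
items = ['Woman', 'Men']  # assumption: a list, to keep the order fixed
url_test = 'http://www.Holiday.com/{}/Beach'

# zip with a single collection wraps every item in a one-element tuple.
zipped = list(zip(items))

# Formatting the tuple's string form reproduces the unwanted output.
bad_urls = [url_test.format(str(t)) for t in zipped]

# Iterating the collection directly inserts the plain strings.
good_urls = [url_test.format(name) for name in items]

print(zipped)
print(bad_urls)
print(good_urls)
```

The first URL list contains the tuple text, parentheses and trailing comma included, while the second contains the URLs you were after.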
Upvotes: 6