cgulliver

Reputation: 59

Python URL and List of targets

I am trying to create a Python script that scrapes a series of subpages on a site and then outputs the data to a file. I'm not sure how to get the variable into the URL and then loop through the list. Here is what I have so far:

import httplib2
h = httplib2.Http('.cache')
s = ['one', 'two', 'three']


def getinfo():
    response, content = h.request('https://www.example.com/<list items>/info', headers={'Connection':'keep-alive'})
    print(content)
    print(response)

for q in range(len(s)):
    getinfo()

Upvotes: 1

Views: 94

Answers (4)

Szymon Zmilczak

Reputation: 383

Another option is % formatting:

def getinfo(subpage):
    # subpage replaces %s in the URL
    response, content = h.request('https://www.example.com/%s/info' % subpage, headers={'Connection': 'keep-alive'})
    print(content)
    print(response)

for subpage in s:
    getinfo(subpage)
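
To make the comparison with the other answers concrete, here is a small sketch that builds the same URL three ways. The helper names (`url_percent`, `url_format`, `url_fstring`) are made up for illustration, the f-string variant is an addition that needs Python 3.6+, and `https://www.example.com` is assumed to be the host the question means.

```python
def url_percent(subpage):
    # Wrapping the value in a one-element tuple keeps % formatting safe
    # even if subpage itself happens to be a tuple.
    return 'https://www.example.com/%s/info' % (subpage,)


def url_format(subpage):
    # str.format fills the {} placeholder
    return 'https://www.example.com/{}/info'.format(subpage)


def url_fstring(subpage):
    # f-strings (Python 3.6+) interpolate the variable directly
    return f'https://www.example.com/{subpage}/info'


for subpage in ['one', 'two', 'three']:
    print(url_percent(subpage), url_format(subpage), url_fstring(subpage))
```

All three produce the same string, so the choice is mostly a matter of style and Python version.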

Upvotes: 0

Matt

Reputation: 744

Use str.format

import httplib2
h = httplib2.Http('.cache')
s = ['one', 'two', 'three']


def getinfo(subpage):
    response, content = h.request(
        'https://www.example.com/{}/info'.format(subpage),
        headers={'Connection': 'keep-alive'}
    )
    print(content)
    print(response)

for subpage in s:
    getinfo(subpage)
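
The question also asks about writing the output to a file, which the snippet above only prints. A minimal sketch of one way to do that, assuming the page bodies are bytes and can simply be written one after another. The names `build_url`, `scrape_to_file`, `fetch`, and `out_path` are hypothetical, and the fetching step is passed in as a function so the loop itself does not depend on any particular HTTP library.

```python
BASE_URL = 'https://www.example.com/{}/info'  # assumed host, taken from the question


def build_url(subpage):
    # Substitute the subpage name into the URL template
    return BASE_URL.format(subpage)


def scrape_to_file(subpages, fetch, out_path):
    """Call fetch(url) for each subpage and write the returned bytes to out_path."""
    with open(out_path, 'wb') as out:
        for subpage in subpages:
            content = fetch(build_url(subpage))
            out.write(content)
            out.write(b'\n')  # separate pages with a newline
```

With the `h` object from above, `fetch` could be `lambda url: h.request(url, headers={'Connection': 'keep-alive'})[1]`, since `h.request` returns a `(response, content)` pair; the call would then look like `scrape_to_file(s, fetch, 'output.txt')`.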

Upvotes: 2

kvorobiev

Reputation: 5070

You probably need something like this:

import httplib2
h = httplib2.Http('.cache')
s = ['one', 'two', 'three']

def getinfo():
    # Request the info page for every item in the list
    for elem in s:
        response, content = h.request('https://www.example.com/' + elem + '/info', headers={'Connection': 'keep-alive'})
        print(content)
        print(response)

getinfo()

Upvotes: 0

itzMEonTV

Reputation: 20369

Try this:

def getinfo(item):
    response, content = h.request('https://www.example.com/' + str(item) + '/info', headers={'Connection': 'keep-alive'})
    print(content)
    print(response)

for q in s:
    getinfo(q)
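
The `str(item)` call matters when the list is not made up purely of strings. A short sketch of that case, leaving out the HTTP request; `build_url` is a hypothetical helper, the mixed list is made up for illustration, and `https://www.example.com` is assumed to be the intended host.

```python
def build_url(item):
    # str() converts non-string items (for example integers) before concatenation;
    # without it, 'https://www.example.com/' + 2 would raise TypeError
    return 'https://www.example.com/' + str(item) + '/info'


urls = [build_url(item) for item in ['one', 2, 'three']]
for url in urls:
    print(url)
```

For a list that only contains strings, as in the question, the `str()` call is harmless and plain concatenation works just as well.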

Upvotes: 0
