Reputation: 85
This one has had me stumped for a couple of days now, and I believe I've finally narrowed it down to this block of code. If anyone can tell me how to fix this, and why it is happening, it would be awesome.
import urllib2

GetLink = 'http://somesite.com/search?q=datadata#page'
holder = range(1,3)
for LinkIncrement in holder:
    h = GetLink + str(LinkIncrement)
    ReadLink = urllib2.urlopen(h)
    f = open('test.txt', 'w')
    for line in ReadLink:
        f.write(line)
    f.close()
    main() #calls function main that does stuff with the file
    continue
The problem is that it only ever writes the data from 'http://somesite.com/search?q=datadata#page'.
If I do the below, the URLs print correctly:
for LinkIncrement in holder:
    h = GetLink + str(LinkIncrement)
    print h
The link I am copying does indeed increment in this manner, and I am able to open the URLs by copying and pasting them into a browser. Additionally, I have tried this with a while loop, but I always get the same results.
The code below opens 3 tabs with the incremented URLs /search?q=datadata#page1, /search?q=datadata#page2, and /search?q=datadata#page3. I just can't make it work in my code.
import webbrowser
import urllib2

h = ''

def tab(passed):
    url = passed
    webbrowser.open_new_tab(url + '/')

def test():
    g = 'http://somesite.com/search?q=datadata#page'
    f = urllib2.urlopen(g)
    NewVar = 1
    PageCount = 1
    while PageCount < 4:
        h = g + str(NewVar)
        PageCount += 1
        NewVar += 1
        tab(h)

test()
Thanks to Falsetru for helping me figure this out. The website was using JSON for any pages after the first page.
Upvotes: 2
Views: 167
Reputation: 369274
In the URL, the part after # (the fragment identifier) is not sent to the web server. The server responds with the same content every time because the part before the fragment identifier is the same.
#something is handled by the browser (JavaScript). You need to look at what the page's JavaScript does with it.
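To see this without touching the network, here is a minimal sketch that strips the fragment from each incremented URL, leaving only the part an HTTP client transmits. The helper name request_url is made up for illustration, and the import fallback is only there so the snippet runs on both Python 3 and the Python 2 used in the question.

```python
try:
    from urllib.parse import urldefrag  # Python 3
except ImportError:
    from urlparse import urldefrag      # Python 2, as used with urllib2 in the question

base = 'http://somesite.com/search?q=datadata#page'

def request_url(url):
    """Hypothetical helper: return the URL without its #fragment,
    i.e. the part that is actually sent to the server."""
    return urldefrag(url)[0]

# Build the three "different" page URLs and drop their fragments.
sent = [request_url(base + str(n)) for n in range(1, 4)]
for url in sent:
    print(url)
```

All three lines printed are identical, which is why every request returns the first page. To get the later pages, you would need to find the request the page's JavaScript makes (for example in the browser's network inspector) and fetch that URL instead.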
Upvotes: 2