KIM

Reputation: 1254

My scraper fails on Google Compute Engine with HTTP Error 403: Forbidden

I wrote a web scraper in Python 3.6, and it works well on my own server.

When I run it (with exactly the same URL) on my Google Compute Engine instance, it fails with HTTP Error 403: Forbidden.

My code and the result on my server

Again, it works well.

>>> import urllib.request
>>> from bs4 import BeautifulSoup
>>> response = urllib.request.urlopen("http://www.kumkangho.co.kr/bk.popup.info.php?date=20190413&pa_uid=1")
>>> print(response.readline())
b'<!-- <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> -->\r\n'

Result on Google Compute Engine

[Screenshot of the HTTP Error 403: Forbidden output; image not available]

I suspect the request is being blocked by GCE rather than by the server the URL points to.
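One way to test that suspicion is to look at the 403 response itself, since its headers and body often hint at which machine answered. The following is only a minimal sketch; `summarize_http_error` and `probe` are hypothetical helper names, and what the headers reveal depends entirely on the remote side.

```python
import urllib.error
import urllib.request


def summarize_http_error(err):
    """Collect the parts of an HTTPError that hint at who sent the response."""
    body = err.read(500)  # first bytes of the error page, if any
    return {
        "status": err.code,
        "reason": err.reason,
        "server": err.headers.get("Server"),  # may be None if not sent
        "body_start": body.decode("utf-8", errors="replace"),
    }


def probe(url):
    """Fetch url; return None on success or a summary dict on an HTTP error."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            response.readline()
            return None
    except urllib.error.HTTPError as err:
        return summarize_http_error(err)
```

Running `probe(...)` with the same URL on both machines and comparing the results can show whether the 403 page looks like it was generated by the target site.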

Upvotes: 2

Views: 379

Answers (1)

KIM

Reputation: 1254

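After a number of tests, I found that the remote server blocks access from countries it does not want traffic from, so the 403 was coming from the site itself, not from GCE.

I set http_proxy so that requests go through a proxy, and it's working now.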
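For reference, here is a minimal sketch of routing the same `urllib` request through a proxy in code. The proxy address is a placeholder, not the one actually used, and `build_proxy_opener` is a hypothetical helper name.

```python
import urllib.request


def build_proxy_opener(proxy_url):
    """Return an opener that sends http:// requests through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)


# Hypothetical proxy address (assumption); replace it with a proxy
# located in a country the remote server accepts.
PROXY_URL = "http://proxy.example.com:3128"
opener = build_proxy_opener(PROXY_URL)

# With a real proxy in place, the original request would become:
# response = opener.open("http://www.kumkangho.co.kr/bk.popup.info.php?date=20190413&pa_uid=1")
# print(response.readline())
```

Exporting `http_proxy` in the environment before starting Python should have a similar effect, since `urllib.request` reads proxy settings from the environment by default.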

@Supratik Majumdar thanks for your help.

Upvotes: 1
