Charlie

Reputation: 129

PHP file_get_contents now gets forbidden error

I'm working on a PHP application that cycles through a list of URLs and checks whether each link is valid. I open each URL with the PHP function file_get_contents and then search the page source for a certain string to decide whether the link is good or bad. While testing the application, towards the end of the day, every attempt to check a URL on this website started returning this message:

failed to open stream: HTTP request failed! HTTP/1.1 403 Forbidden in...

The full message is a bit longer and includes the location of my code, but this is the part that stood out to me. Based on what I have been Googling, I'm thinking that maybe the company's router/firewall thinks I'm trying to spam/attack them. I'm wondering whether I might be on some permanent "blacklist" or something like that, and how I would find out. I wasn't trying to do anything bad. Actually, what I'm doing will help this company, as it will help to generate sales. Total accident :-) I'm going to call the company later and ask them about it.

Upvotes: 0

Views: 2554

Answers (2)

Kaivosukeltaja

Reputation: 15735

Many sites block access from user agents that fail to identify themselves. Introduce yourself properly and you're likely to get better service.

ini_set('user_agent', "CharlesUserAgent1.0"); // Usually anything will do, as long as it's not blank

EDIT: You may also want to check out cURL; it does a much better job of making HTTP requests than PHP's built-in URL fopen wrappers.
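For illustration, a minimal sketch of the cURL approach might look like the following. The helper name fetch_with_curl, the 10-second timeout, and the returned array layout are assumptions made for this example; only the user agent string comes from the line above. It assumes the PHP cURL extension is installed.

```php
<?php
// Hypothetical helper (not part of PHP): fetch a URL with cURL while
// sending an explicit User-Agent header, and report the HTTP status code.
function fetch_with_curl(string $url, string $userAgent = 'CharlesUserAgent1.0'): array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,       // return the body instead of printing it
        CURLOPT_USERAGENT      => $userAgent, // identify the client to the server
        CURLOPT_FOLLOWLOCATION => true,       // follow redirects like a browser would
        CURLOPT_TIMEOUT        => 10,         // assumed timeout in seconds
    ]);

    $body   = curl_exec($ch);                          // false on transport failure
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);   // 0 if no response was received
    $error  = curl_error($ch);                         // empty string if no error

    return [
        'status' => $status,
        'body'   => $body === false ? null : $body,
        'error'  => $error,
    ];
}

// Usage (left commented out so that loading this file makes no network call):
// $result = fetch_with_curl('https://example.com/');
// if ($result['status'] === 403) { echo "Still forbidden\n"; }
```

Unlike file_get_contents, this returns the status code directly, so a 403 can be told apart from a timeout or a DNS failure.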

Upvotes: 4

Oliver M Grech

Reputation: 3171

  1. The website could be checking the User-Agent header and blocking your request.
  2. Some URLs may include a query string, and file_get_contents might not perform the request the way a normal browser would, so the page you end up requesting could be one that is actually forbidden :/

Browse the URLs manually and see whether you get the same error.
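To compare what the script receives with what the browser shows, one option is to make file_get_contents return the response even on an error status and then read the status line. This is only a sketch: the helper names status_from_line and check_url, the user agent string, and the timeout are assumptions for the example. The stream context options and the $http_response_header variable are standard PHP.

```php
<?php
// Hypothetical helper: extract the numeric status code from an HTTP status
// line such as "HTTP/1.1 403 Forbidden". Returns null for any other header.
function status_from_line(string $line): ?int
{
    return preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $m) ? (int) $m[1] : null;
}

// Hypothetical helper: fetch $url with an explicit User-Agent and report the
// HTTP status plus whether $needle appears in the page source.
function check_url(string $url, string $needle, string $userAgent = 'CharlesUserAgent1.0'): array
{
    $context = stream_context_create([
        'http' => [
            'user_agent'    => $userAgent,
            'ignore_errors' => true, // return the body on 4xx/5xx instead of failing
            'timeout'       => 10,   // assumed timeout in seconds
        ],
    ]);

    $body = @file_get_contents($url, false, $context);

    // The HTTP wrapper fills $http_response_header in the calling scope.
    // With redirects there are several status lines; keep the last one.
    $status = null;
    foreach ($http_response_header ?? [] as $line) {
        $code = status_from_line($line);
        if ($code !== null) {
            $status = $code;
        }
    }

    return [
        'status' => $status,
        'found'  => is_string($body) && $needle !== '' && strpos($body, $needle) !== false,
    ];
}

// Usage (left commented out so that loading this file makes no network call):
// print_r(check_url('https://example.com/', 'Example Domain'));
```

If the status is 403 here but the same URL loads in a browser, the block is more likely tied to how the request looks (headers, rate) than to the URL itself.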

Upvotes: 0
