Reputation: 80
I am trying to check if a Facebook user with the username "abc.def" exists.
echo file_get_contents("https://www.facebook.com/abc.def");
My plan is to use the PHP DOM library to read the text on the page.
Instead of returning the user profile, the request returns a page that says "select your browser".
Please advise.
Upvotes: 1
Views: 2278
Reputation: 718
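Requirements: a Programmable Search Engine ID and a Google Custom Search API key. The links in the code comments below show where to get each.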
<?php
$mySearchEngine = '----';//https://programmablesearchengine.google.com/cse/all
$myGoogleApiKey = '----';//https://developers.google.com/custom-search/v1/overview
//$queryExactTerm = 'If you want to use exact name';
//$queryExactUrl = urlencode($queryExactTerm);
$query = 'Name Of the person you want to search';
$queryUrl = urlencode($query);
$pg = 1;//Index of the first result. If you do not find the name you want, loop over the next pages by changing this to 11, 21, 31 ... 91 (the API returns at most 100 results)
$url = 'https://www.googleapis.com/customsearch/v1?';
$url .= 'key='.$myGoogleApiKey;
$url .= '&cx='.$mySearchEngine;
$url .= '&q='.$queryUrl;
$url .= '&start='.$pg;
//$url .= '&exactTerms='.$queryExactUrl;
$responseJson = curl_no_proxy($url);
$responseArr = json_decode($responseJson, TRUE);
if(isset($responseArr['items'])){
foreach($responseArr['items'] as $key => $value){
echo '<hr>';
$title = $value['htmlTitle'];
$link = $value['link'];
echo $title.'<br>';
echo $link.'<br>';
//print_r($value);
//You can get all information you want from here
}
}
//If you want to see all the results
//foreach($responseArr as $key => $value){
//echo '<hr>';
//echo $key.'<br>';
//print_r($value);
//}
function curl_no_proxy($url){
$ch = CURL_INIT();
CURL_SETOPT($ch, CURLOPT_URL, $url );
CURL_SETOPT($ch, CURLOPT_POST, 0);
CURL_SETOPT($ch, CURLOPT_RETURNTRANSFER,True);
CURL_SETOPT($ch, CURLOPT_FOLLOWLOCATION,True);
CURL_SETOPT($ch, CURLOPT_ENCODING, 'gzip, deflate');
CURL_SETOPT($ch, CURLOPT_CONNECTTIMEOUT,30);
CURL_SETOPT($ch, CURLOPT_TIMEOUT,30);
$result = CURL_EXEC($ch);
CURL_CLOSE($ch);
return $result;
}
?>
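The comment on `$pg` mentions looping over further result pages. A minimal sketch of that loop, assuming the same `key`, `cx`, `q`, and `start` parameters as the script above. `buildSearchUrl` is a hypothetical helper name, the key and engine ID are placeholders, and `http_build_query` handles the URL encoding:

```php
<?php
// Hypothetical helper: builds one Custom Search request URL.
function buildSearchUrl(string $apiKey, string $engineId, string $query, int $start = 1): string
{
    // http_build_query() URL-encodes each value, so no manual urlencode() is needed.
    return 'https://www.googleapis.com/customsearch/v1?' . http_build_query([
        'key'   => $apiKey,
        'cx'    => $engineId,
        'q'     => $query,
        'start' => $start,
    ]);
}

// With 10 results per page, the start index advances 1, 11, 21, ...
$urls = [];
for ($start = 1; $start <= 21; $start += 10) {
    $urls[] = buildSearchUrl('MY_API_KEY', 'MY_ENGINE_ID', 'John Smith', $start);
}
print_r($urls);
```

Each URL can then be passed to `curl_no_proxy()` and decoded exactly as in the script above.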
Upvotes: 0
Reputation: 718
The code below works for me. However, after many requests Facebook will probably block your IP, so I suggest using proxies to keep your system running without interruption. I also recommend testing the proxies yourself, because that makes it easier to find one that works.
$url = 'https://www.facebook.com/profileName/';
$ch = CURL_INIT();
CURL_SETOPT($ch, CURLOPT_URL, $url);
CURL_SETOPT($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:42.0) Gecko/20100101 Firefox/42.0');
CURL_SETOPT($ch, CURLOPT_POST, false);
CURL_SETOPT($ch, CURLOPT_PROXYTYPE,CURLPROXY_SOCKS5);
CURL_SETOPT($ch, CURLOPT_RETURNTRANSFER,True);
CURL_SETOPT($ch, CURLOPT_FOLLOWLOCATION,True);
CURL_SETOPT($ch, CURLOPT_ENCODING, 'gzip, deflate');
CURL_SETOPT($ch, CURLOPT_CONNECTTIMEOUT,30);
CURL_SETOPT($ch, CURLOPT_TIMEOUT,30);
echo $result = CURL_EXEC($ch);
If you need to use a proxy, you can do it like this:
$ip = '185.18.212.227';
$port = '3128';
$proxy = $ip.':'.$port;
/**
* If your proxy requires a user name and password
*/
$user = '';
$pwd ='';
$credential = $user.':'.$pwd;
$url = 'https://www.facebook.com/profileName/';
$ch = CURL_INIT();
CURL_SETOPT($ch, CURLOPT_URL, $url);
CURL_SETOPT($ch, CURLOPT_PROXY, $ip);
CURL_SETOPT($ch, CURLOPT_PROXYPORT, $port);
//CURL_SETOPT($ch, CURLOPT_PROXY, $proxy);//Can be used instead of the two lines above
//CURL_SETOPT($ch, CURLOPT_PROXYUSERPWD, $credential);//Needed only if the proxy requires a user name and password
CURL_SETOPT($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:42.0) Gecko/20100101 Firefox/42.0');
CURL_SETOPT($ch, CURLOPT_POST, false);
CURL_SETOPT($ch, CURLOPT_PROXYTYPE,CURLPROXY_SOCKS5);//Must match the proxy; use CURLPROXY_HTTP for an HTTP proxy
CURL_SETOPT($ch, CURLOPT_RETURNTRANSFER,True);
CURL_SETOPT($ch, CURLOPT_FOLLOWLOCATION,True);
CURL_SETOPT($ch, CURLOPT_ENCODING, 'gzip, deflate');
CURL_SETOPT($ch, CURLOPT_CONNECTTIMEOUT,30);
CURL_SETOPT($ch, CURLOPT_TIMEOUT,30);
echo $result = CURL_EXEC($ch);
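Both snippets above repeat the same option list. As a sketch only, the options could be gathered in one function and applied with `curl_setopt_array`. `proxyCurlOptions` and its parameters are hypothetical names. The proxy type is a parameter because it has to match what the proxy actually speaks, and the default of `CURLPROXY_HTTP` is an assumption:

```php
<?php
// Hypothetical helper: returns the cURL options for a request sent through a proxy.
function proxyCurlOptions(
    string $url,
    string $ip,
    int $port,
    int $proxyType = CURLPROXY_HTTP, // assumption: pass CURLPROXY_SOCKS5 for a SOCKS5 proxy
    string $credential = ''          // "user:password", left empty when no login is needed
): array {
    $options = [
        CURLOPT_URL            => $url,
        CURLOPT_PROXY          => $ip,
        CURLOPT_PROXYPORT      => $port,
        CURLOPT_PROXYTYPE      => $proxyType,
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:42.0) Gecko/20100101 Firefox/42.0',
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_ENCODING       => 'gzip, deflate',
        CURLOPT_CONNECTTIMEOUT => 30,
        CURLOPT_TIMEOUT        => 30,
    ];
    // Only send credentials when some were supplied.
    if ($credential !== '') {
        $options[CURLOPT_PROXYUSERPWD] = $credential;
    }
    return $options;
}

$options = proxyCurlOptions('https://www.facebook.com/profileName/', '185.18.212.227', 3128);
$ch = curl_init();
$applied = curl_setopt_array($ch, $options); // true when every option was accepted
// $result = curl_exec($ch); // uncomment to actually send the request
```

Keeping the options in one place means a change such as a different timeout or user agent only has to be made once.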
The next part is divided into 5 steps:
1- Get all the free proxies on the first page of https://free-proxy-list.net/
2- Save those proxies in an array with the proxy IPs as keys and the proxy ports as values.
3- Check whether those proxies really work by fetching google.com through each one. A working proxy returns some content; a dead one returns nothing.
4- Save the working proxies so they can be tested against Facebook, which may already have blocked them as it probably did your IP.
5- Use the working proxies to curl Facebook.
Keep in mind that this takes some time, because I go through 30 to 40 proxies to open one page. I therefore suggest having a cron job collect free proxies from multiple websites and test whether they work, so that the scraping script only has to use them. I believe that would be the best approach. The next script also needs simple_html_dom, which you can get here: https://simplehtmldom.sourceforge.io I use it only to collect the proxies.
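One way to split the work between a cron job and the scraper is to have the cron job write the working proxies to a file that the scraper reads later. A minimal sketch, assuming a JSON file is acceptable storage. `saveProxies`, `loadProxies`, and the file path are hypothetical:

```php
<?php
// Hypothetical helper for the cron job: stores ip => port pairs as JSON.
function saveProxies(string $file, array $proxies): void
{
    // LOCK_EX avoids a half-written file if the scraper reads while the cron job writes.
    file_put_contents($file, json_encode($proxies, JSON_PRETTY_PRINT), LOCK_EX);
}

// Hypothetical helper for the scraper: returns ip => port pairs,
// or an empty array if the file is missing or does not contain valid JSON.
function loadProxies(string $file): array
{
    if (!is_readable($file)) {
        return [];
    }
    $data = json_decode(file_get_contents($file), true);
    return is_array($data) ? $data : [];
}

$file = sys_get_temp_dir() . '/good_proxies.json';
saveProxies($file, ['185.18.212.227' => '3128']);
$proxies = loadProxies($file);
print_r($proxies);
```

The scraper can then loop over `loadProxies($file)` in the same way the script below loops over its array of working proxies.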
ini_set('memory_limit', '-1');
ini_set('max_execution_time', '0');//ini directive names are case-sensitive, so this must be lower case
include 'simple_html_dom.php';
$goodProxy = array();
//1- Get all the free proxies on the first page of https://free-proxy-list.net/
$url = 'https://free-proxy-list.net/';
$ch = CURL_INIT();
CURL_SETOPT($ch, CURLOPT_URL, $url);
CURL_SETOPT($ch, CURLOPT_FOLLOWLOCATION,True);
CURL_SETOPT($ch, CURLOPT_RETURNTRANSFER, TRUE);
CURL_SETOPT($ch, CURLOPT_CONNECTTIMEOUT,30);
CURL_SETOPT($ch, CURLOPT_TIMEOUT,30);
$result = CURL_EXEC($ch);
$html = new simple_html_dom();
$html ->load($result);
$table = $html->find('table',0);
$line = $table->find('tbody',0)->find('tr');
$ipProxyPorts = array();
foreach($line as $keys => $value){
$ip = trim($value->find('td',0)->plaintext);//trim() removes stray whitespace around the cell text
$port = trim($value->find('td',1)->plaintext);
$ipProxyPorts[$ip] = $port;
}
print_r($ipProxyPorts);
echo '<br><br><br><br><br><br>++++++++++++++++++++++++++++++++++++<br><br><br><br><br><br>';
//2- All the IPs on the first page are now in the array $ipProxyPorts, with the IPs as keys and the ports as values
//3- Next, check whether each proxy can be used by fetching the Google home page through it
//This part can take some time, which is why the memory limit and max execution time are set to unlimited at the top of the script
foreach($ipProxyPorts as $ip => $port){
$url = 'https://www.google.com';
$ch = CURL_INIT();
CURL_SETOPT($ch, CURLOPT_URL, $url);
CURL_SETOPT($ch, CURLOPT_PROXY, $ip);
CURL_SETOPT($ch, CURLOPT_PROXYPORT, $port);
CURL_SETOPT($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:42.0) Gecko/20100101 Firefox/42.0');
CURL_SETOPT($ch, CURLOPT_POST, false);
CURL_SETOPT($ch, CURLOPT_PROXYTYPE,CURLPROXY_SOCKS5);
CURL_SETOPT($ch, CURLOPT_RETURNTRANSFER,True);
CURL_SETOPT($ch, CURLOPT_FOLLOWLOCATION,True);
CURL_SETOPT($ch, CURLOPT_ENCODING, 'gzip, deflate');
CURL_SETOPT($ch, CURLOPT_CONNECTTIMEOUT,30);
CURL_SETOPT($ch, CURLOPT_TIMEOUT,30);
echo $result = CURL_EXEC($ch);
$textString = strip_tags($result);
if(strlen($textString) > 20){
$goodProxy[$ip] = $port;
}
curl_close($ch);
echo '<hr>';
}
//4- These are the proxies that are working right now. Facebook may already have blocked some of them, but I believe at least one should be usable
echo ' "Proxy:port" pairs from https://free-proxy-list.net/ that are working right now';
print_r($goodProxy);
//5- Now use those proxies to see which ones are not blocked by Facebook
foreach($goodProxy as $ip => $port){
echo 'Proxy Used:'.$ip.':'.$port;
$url = 'https://www.facebook.com/hygison/';
$ch = CURL_INIT();
CURL_SETOPT($ch, CURLOPT_URL, $url);
CURL_SETOPT($ch, CURLOPT_PROXY, $ip);
CURL_SETOPT($ch, CURLOPT_PROXYPORT, $port);
CURL_SETOPT($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:42.0) Gecko/20100101 Firefox/42.0');
CURL_SETOPT($ch, CURLOPT_POST, false);
CURL_SETOPT($ch, CURLOPT_PROXYTYPE,CURLPROXY_SOCKS5);
CURL_SETOPT($ch, CURLOPT_RETURNTRANSFER,True);
CURL_SETOPT($ch, CURLOPT_FOLLOWLOCATION,True);
CURL_SETOPT($ch, CURLOPT_ENCODING, 'gzip, deflate');
CURL_SETOPT($ch, CURLOPT_CONNECTTIMEOUT,30);
CURL_SETOPT($ch, CURLOPT_TIMEOUT,30);
echo $result = CURL_EXEC($ch);
curl_close($ch);
echo '<hr>';
}
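Step 1 relies on simple_html_dom, but the same table could probably be read with PHP's built-in DOM extension instead. The sketch below assumes the layout the script above expects (IP address in the first cell and port in the second cell of each body row) and runs on an inline sample rather than the live page. `parseProxyTable` is a hypothetical name:

```php
<?php
// Hypothetical helper: extracts ip => port pairs from the first table in an HTML string.
function parseProxyTable(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally, since real-world HTML is rarely perfectly valid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $proxies = [];
    $xpath = new DOMXPath($doc);
    // First table in the document, every row inside its tbody.
    foreach ($xpath->query('(//table)[1]/tbody/tr') as $row) {
        $cells = $xpath->query('td', $row);
        if ($cells->length >= 2) {
            $ip = trim($cells->item(0)->textContent);
            $port = trim($cells->item(1)->textContent);
            $proxies[$ip] = $port;
        }
    }
    return $proxies;
}

$sample = '<table><thead><tr><th>IP</th><th>Port</th></tr></thead>'
        . '<tbody><tr><td>185.18.212.227</td><td>3128</td></tr>'
        . '<tr><td>10.0.0.1</td><td> 8080 </td></tr></tbody></table>';
$parsed = parseProxyTable($sample);
print_r($parsed);
```

The returned array has the same shape as `$ipProxyPorts`, so it could be fed to the proxy check in step 3 unchanged.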
Upvotes: 1