Reputation: 59
I need some help with one thing I have to do. I have two arrays of URLs, for example:
$urls = ['https://test.com/', 'http://example.com/', 'https://google.com/'];
$urlsFromOtherSource = ['https://test.com/', 'https://example.com/', 'https://facebook.com/'];
I need to create three arrays of URLs from them. The first will contain the URLs common to both arrays. The other two will be the original arrays without the common URLs, except that if both initial arrays contain the same URL and the only difference is http vs https, I need to assign that URL to only one array.
So from my two example arrays I need to get the following arrays:
$commonUrls = ['https://test.com/']; // because this is the only URL present in both arrays
$urls = ['http://example.com/', 'https://google.com/']; // 'http://example.com/' stays in this array and is removed from the second one, because the second array has the same URL and the only difference is https
$urlsFromOtherSource = ['https://facebook.com/']; // 'https://example.com/' is removed from this array because the first array has the same URL and the only difference is http
I tried to work out how to compare these arrays and catch the http/https difference, but it is not easy for me. My code looks like this:
$urls = ['https://test.com/', 'http://example.com/', 'https://google.com/'];
$urlsFromOtherSource = ['https://test.com/', 'https://example.com/', 'https://facebook.com/'];
$commonUrls = array_intersect($urls, $urlsFromOtherSource); // common URLs from both arrays
$urls = array_diff($urls, $commonUrls); // remove the URLs that are already in the common array
$urlsFromOtherSource = array_diff($urlsFromOtherSource, $commonUrls); // remove the URLs that are already in the common array
$landingPageArray = [];
foreach ($urlsFromOtherSource as $url) {
    $landingPageArray[] = preg_replace(["#^http(s)?://#", "#^www\.#"], ["", ""], $url);
}
$httpDifference = [];
foreach ($urls as $url) {
    $landingPage = preg_replace(["#^http(s)?://#", "#^www\.#"], ["", ""], $url);
    if (in_array($landingPage, $landingPageArray)) {
        $httpDifference[] = $url;
    }
}
// I have no idea how to remove from $urlsFromOtherSource the URLs that are also in $urls and differ only in http vs https
$urlsFromOtherSource = array_diff($urlsFromOtherSource, $httpDifference);
So all I need is to compare the arrays and remove from the second array the URLs that are also in the first array when the only difference between them is http vs https. Maybe someone can help me find an algorithm for that.
UPDATE: I also need to remove a URL from urlsFromOtherSource if I already have it in commonUrls:
commonUrls: array(1) {
[0]=>
string(20) "http://www.test.com/"
}
urlsFromOtherSource: array(1) {
[2]=>
string(16) "http://test.com/"
}
So I need to remove this URL from urlsFromOtherSource. The code should automatically compare only the landing page, regardless of whether the URL starts with http://www., www., or just http://; I don't want those prefixes to be taken into account when comparing my arrays.
Upvotes: 0
Views: 217
Reputation: 4124
You can write your own comparison function using the u-methods, like array_udiff and array_uintersect. Use preg_replace when comparing the URLs to ignore the http/https difference.
$commonUrls = array_intersect($urls, $urlsFromOtherSource); // common URLs from both arrays
$urls = array_diff($urls, $commonUrls);
$urlsFromOtherSource = array_udiff(array_diff($urlsFromOtherSource, $commonUrls), $urls, function ($a, $b) {
    return strcmp(preg_replace('|^https?://(www\\.)?|', '', $a), preg_replace('|^https?://(www\\.)?|', '', $b));
});
This yields:
commonUrls: array(1) {
[0]=>
string(17) "https://test.com/"
}
urls: array(2) {
[1]=>
string(19) "http://example.com/"
[2]=>
string(19) "https://google.com/"
}
urlsFromOtherSource: array(1) {
[2]=>
string(21) "https://facebook.com/"
}
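To also cover the case from the UPDATE, one option is to pass both $urls and $commonUrls to array_udiff, which accepts several arrays before the callback. Below is a minimal sketch, assuming that stripping the scheme and an optional leading www. is all the normalization you need. normalizeUrl and partitionUrls are hypothetical names made up for illustration, and the sample input is invented to mirror the UPDATE.

```php
<?php
// Hypothetical helper (not from the question): drops "http://" or "https://"
// and an optional leading "www." so only the landing page is compared.
function normalizeUrl(string $url): string
{
    return preg_replace('~^https?://(www\.)?~i', '', $url);
}

// Hypothetical helper: returns [common, onlyInFirst, onlyInSecond].
function partitionUrls(array $urls, array $urlsFromOtherSource): array
{
    $compare = function (string $a, string $b): int {
        return strcmp(normalizeUrl($a), normalizeUrl($b));
    };

    // Exact matches only, as in the question.
    $commonUrls = array_intersect($urls, $urlsFromOtherSource);
    $urls = array_diff($urls, $commonUrls);

    // Drop anything from the second array whose normalized form already
    // appears in $urls or in $commonUrls.
    $urlsFromOtherSource = array_udiff($urlsFromOtherSource, $urls, $commonUrls, $compare);

    // array_values() reindexes; the functions above preserve the original keys.
    return [array_values($commonUrls), array_values($urls), array_values($urlsFromOtherSource)];
}

[$commonUrls, $urls, $urlsFromOtherSource] = partitionUrls(
    ['http://www.test.com/', 'http://example.com/', 'https://google.com/'],
    ['http://www.test.com/', 'http://test.com/', 'https://example.com/', 'https://facebook.com/']
);
var_dump($commonUrls, $urls, $urlsFromOtherSource);
```

With this sample input, http://test.com/ and https://example.com/ are both dropped from $urlsFromOtherSource, so only https://facebook.com/ is left there. If you would rather treat http/https variants as common URLs as well, array_uintersect with the same callback would be the place to start, but that changes what ends up in $commonUrls.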
Upvotes: 2