Reputation: 3206
So, the situation I'm currently in is a wee bit complicated (for me, that is), but I'm gonna give it a try.
I would like to run through a snippet of HTML and extract all the links referring to my own domain. Next, I want to append a predefined string of GET vars to these URLs. For example, I want to append '?var1=2&var2=4' to 'http://www.domain.com/page/', thus creating 'http://www.domain.com/page/?var1=2&var2=4'.
The method I'm currently applying is a simple preg_replace function (PHP), but here is where it gets interesting. How do I create valid appended URLs when they already have some GET vars at the end? For example, it could create a URL like this: 'http://www.domain.com/page/?already=here&another=one?var1=2&var2=4', thus breaking the GET data.
So to conclude, what I'm looking for is a regular expression that can cope with these scenarios, create my extended URL and write it back to the HTML snippet.
This is what I have so far:
$sHTML = preg_replace("'href=\"($domainURL.*?[\/$])\"'", 'href="\1' . $appendedTags . '"', $sHTML);
Thanks in advance
Upvotes: 2
Views: 776
Reputation: 83
Also, parse_str() won't return any values as shown in the answer; rather, it takes an array as its second parameter and fills it:
$array = array();
parse_str($broken['query'], $array);
// $array will contain the query variables, e.g. ["already"], ["another"]
just a side note ;)
-- G
Upvotes: 0
Reputation: 83622
In addition to what Elazar Leibovich suggested, I'd parse the query string with parse_str(), modify the resulting array to my needs, and then use http_build_query() to rebuild the query string. This way you won't have any duplicates within your query string, and you don't have to bother with URL-encoding the query parts yourself.
The complete example would then look like this (augmenting Elazar Leibovich's code):
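$broken = parse_url($url);
// parse_str() returns nothing; it fills the array passed as its second argument.
$query = array();
parse_str(isset($broken['query']) ? $broken['query'] : '', $query);
$query['var1'] = 1;
$query['var2'] = 2;
$broken['query'] = http_build_query($query);
// Only add '#fragment' when the original URL had a fragment.
return $broken['scheme'] . '://' . $broken['host'] . $broken['path'] .
'?' . $broken['query'] . (isset($broken['fragment']) ? '#' . $broken['fragment'] : '');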
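As a rough sketch, the same idea can be wrapped in a reusable function. The name append_query_vars and its signature are made up for illustration; it assumes an absolute URL and ignores any user:password part.

```php
<?php
// Hypothetical helper: merges $vars into the query string of $url.
// Existing variables with the same name are overwritten, so nothing is duplicated.
function append_query_vars(string $url, array $vars): string
{
    $parts = parse_url($url);
    if ($parts === false) {
        return $url; // leave unparseable URLs untouched
    }

    // parse_str() fills $query by reference; it does not return the array.
    $query = [];
    parse_str($parts['query'] ?? '', $query);
    $query = array_replace($query, $vars);

    // Rebuild the URL from the pieces that are actually present.
    $result = '';
    if (isset($parts['scheme'])) {
        $result .= $parts['scheme'] . '://';
    }
    $result .= $parts['host'] ?? '';
    if (isset($parts['port'])) {
        $result .= ':' . $parts['port'];
    }
    $result .= $parts['path'] ?? '';
    if ($query) {
        $result .= '?' . http_build_query($query);
    }
    if (isset($parts['fragment'])) {
        $result .= '#' . $parts['fragment'];
    }
    return $result;
}

echo append_query_vars(
    'http://www.domain.com/page/?already=here&another=one',
    ['var1' => 2, 'var2' => 4]
), "\n";
// prints http://www.domain.com/page/?already=here&another=one&var1=2&var2=4
```

Because the variables are merged as an array, calling it twice with the same values leaves the URL unchanged the second time.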
Upvotes: 4
Reputation: 33593
Regular expressions are not the solution here. As somebody said:
Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.
But never mind that. What I would use is parse_url, and then append my var1=1&var2=2 to the resulting query string. Something along the lines of:
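$broken = parse_url($url);
// Append to the existing query string, if there is one.
$broken['query'] = (isset($broken['query']) ? $broken['query'] : '') . '&var1=1&var2=2';
// Drop the leading '&' when there was no query string to begin with.
if (strpos($broken['query'], '&') === 0) $broken['query'] = substr($broken['query'], 1);
// Only add '#fragment' when the original URL had a fragment.
return $broken['scheme'].'://'.$broken['host'].$broken['path'].
'?'.$broken['query'].(isset($broken['fragment']) ? '#'.$broken['fragment'] : '');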
If you don't want your variables to appear twice, also use parse_str to break apart the query string.
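To write the result back into the HTML snippet, one possible sketch is to use a regex only to locate the href attributes and leave the URL handling to parse_url, parse_str and http_build_query. The function name append_vars_to_links is hypothetical, and the sketch assumes double-quoted href attributes and a simple prefix match on the domain; a proper HTML parser would be more robust.

```php
<?php
// Hypothetical helper: appends $vars to every double-quoted href in $html
// whose URL starts with $domainUrl. Other links are left untouched.
function append_vars_to_links(string $html, string $domainUrl, array $vars): string
{
    $pattern = '/href="(' . preg_quote($domainUrl, '/') . '[^"]*)"/i';

    return preg_replace_callback($pattern, function (array $m) use ($vars) {
        // Attribute values are HTML-encoded (&amp;), so decode before parsing.
        $url = html_entity_decode($m[1], ENT_QUOTES);
        $parts = parse_url($url);
        if ($parts === false) {
            return $m[0]; // leave unparseable URLs untouched
        }

        // Merge the new variables into the existing ones (no duplicates).
        $query = [];
        parse_str($parts['query'] ?? '', $query);
        $query = array_replace($query, $vars);

        // Strip the old query and fragment, then re-attach the rebuilt ones.
        $new = preg_replace('/[?#].*$/s', '', $url) . '?' . http_build_query($query);
        if (isset($parts['fragment'])) {
            $new .= '#' . $parts['fragment'];
        }
        return 'href="' . htmlspecialchars($new, ENT_QUOTES) . '"';
    }, $html);
}

$sHTML = '<a href="http://www.domain.com/page/?already=here">a</a> '
       . '<a href="http://other.example/x">b</a>';
echo append_vars_to_links($sHTML, 'http://www.domain.com', ['var1' => 2, 'var2' => 4]), "\n";
```

Only the first link is rewritten here; its href comes back with the ampersands encoded as &amp;amp;, which is what belongs inside an HTML attribute.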
Upvotes: 3