Reputation: 1605
I'm using this:
$t = "#hashtag #goodhash_tag united states #l33t this";
$queryVariable = "";
if(preg_match_all('/(^|\s)(#\w+)/', $t, $arrHashTags) > 0){
    array_filter($arrHashTags);
    array_unique($arrHashTags);
    $count = count($arrHashTags[2]);
    if($count > 1){
        $counter = 1;
        foreach ($arrHashTags[2] as $strHashTag) {
            if (preg_match('/#\d*[a-z_]+/i', $strHashTag)) {
                if($counter == $count){
                    $queryVariable .= $strHashTag;
                } else{
                    $queryVariable .= $strHashTag." and ";
                }
                $newTest = str_replace($arrHashTags[2],"", $t);
            }
            $counter = $counter + 1;
        }
    }
}
echo $queryVariable."<br>"; // this is list of tags
echo $newTest; // this is the remaining text
The output for the $t above is:
#hashtag and #goodhash_tag and #l33t
united states this
First problem:
If $t = '#hashtag#goodhash_tag united states #l33t this';
i.e. without a space between the first two tags, the output becomes:
#hashtag and #l33t
#goodhash_tag united states this
Second problem:
If $t = '#hashtag #goodhash_tag united states #l33t this #123';
i.e. with an invalid tag #123 at the end, the list of tags collected in $queryVariable is disturbed and the output becomes:
#hashtag and #goodhash_tag and #l33t and // note the extra 'and'
united states this
Can anyone help with these two problems?
Upvotes: 3
Views: 769
Reputation: 80649
Instead of using so many comparisons around your regex, you can simply use the following:
$t = "#hashtag #goodhash_tag united states #l33t this #123#tte#anothertag sth";
$queryVariable = "";
preg_match_all('/(#[A-Za-z_]\w*)/', $t, $arrHashTags); // [A-z] would also match [ \ ] ^ and `
print_r( $arrHashTags[1] );
To get them as a single string joined with " and ", you can use implode. Pass the separator first; the reversed argument order is no longer accepted as of PHP 8.
$queryVariable = implode( " and ", $arrHashTags[1] );
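The match-and-join steps can be sketched as one small helper. This is only an illustration: the function name `extractHashTags` is hypothetical, and the pattern `#[A-Za-z_]\w*` is an assumption about what counts as a valid tag (it must start with a letter or underscore, so `#123` is skipped). Adjust the pattern if your rules differ.

```php
<?php
// Hypothetical helper: returns every hashtag that starts with a letter or underscore.
function extractHashTags(string $text): array
{
    // No whitespace is required before '#', so "#a#b" yields both tags.
    preg_match_all('/#[A-Za-z_]\w*/', $text, $matches);
    // Drop duplicate tags and reindex the array from 0.
    return array_values(array_unique($matches[0]));
}

$t = "#hashtag#goodhash_tag united states #l33t this #123";
$tags = extractHashTags($t);

// implode() returns an empty string for an empty array,
// so there is no trailing " and " to worry about.
echo implode(" and ", $tags), "\n"; // #hashtag and #goodhash_tag and #l33t
```

Because the join happens after filtering, an invalid tag at the end of the input cannot leave a dangling "and".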
For the remaining text, you can use preg_replace or str_replace (whichever you are comfortable with).
Here is the codepad link.
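For the remaining text, a minimal sketch with preg_replace might look like the following. The pattern `#[A-Za-z_]\w*` is an assumption about what a valid tag is, and the second call, which collapses leftover whitespace, is an extra step that is not part of the original suggestion.

```php
<?php
$t = "#hashtag #goodhash_tag united states #l33t this #123";

// Remove every valid tag (assumed pattern: '#' followed by a letter or underscore).
$rest = preg_replace('/#[A-Za-z_]\w*/', '', $t);

// Collapse the whitespace runs left behind by the removed tags, then trim the ends.
$rest = trim(preg_replace('/\s+/', ' ', $rest));

echo $rest, "\n"; // united states this #123
```

Note that the invalid tag `#123` stays in the remaining text here; if it should be dropped as well, a pattern such as `/#\w+/` could be used for the removal step instead.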
Upvotes: 5