Reputation: 269
I am rewriting the code for the debate forum on my website and am going to change the way quotes are stored in the database. I now need to come up with a regex to convert the posts that have already been submitted.
The following is an example of how my debate posts are currently stored in the database (with quotes inside quotes). Note: I have indented it for the sake of illustration:
Just citing a post
[quote]Text of quote #3
    [quote]Text of quote #2
        [quote]Text of quote #1
        [name]User 1[/name]
        [/quote]
    [name]User 2[/name]
    [/quote]
[name]User 3[/name]
[/quote]
What I would like is for the above to be rearranged to look like this:
Just citing a post
[quote:User 3]
Text of quote #3
    [quote:User 2]
    Text of quote #2
        [quote:User 1]
        Text of quote #1
        [/quote]
    [/quote]
[/quote]
Can any of you point me in the direction of how this can be done with regex? I am using PHP.
Thanks in advance, I appreciate all your help :)
Fischer
Upvotes: 3
Views: 329
Reputation: 30715
Don't use a regex for this. What you're talking about is essentially a mutation of XML, and regex is not the right tool for parsing XML. What you need to do is write a parser.
However, what I would suggest is using actual XML instead. It already exists, it's standardized, the syntax is almost exactly the same, and there are already a ton of parsers for it. I'd start here:
Edit: Just to clarify how easily this could become valid XML:
<quote src="User 3">
  Text of quote #3
  <quote src="User 2">
    Text of quote #2
    <quote src="User 1">
      Text of quote #1
    </quote>
  </quote>
</quote>
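For the hand-written parser route, here is a minimal sketch of what it could look like for the existing bracket format. It is an illustration under assumptions, not a tested migration tool: the function name convert_quotes is made up, and it assumes each [name]...[/name] tag sits directly inside the quote it belongs to. It splits the post into tags and text, then keeps a stack of the quotes that are currently open.

```php
<?php
// Hypothetical helper (name and behaviour are assumptions, not from the thread):
// converts [quote]...[name]User[/name][/quote] into [quote:User]...[/quote]
// by tokenizing on the tags and tracking open quotes on a stack.
function convert_quotes(string $text): string
{
    // Split into tags and text runs, keeping the tags as separate tokens.
    $tokens = preg_split(
        '#(\[quote\]|\[/quote\]|\[name\].*?\[/name\])#s',
        $text,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );
    if ($tokens === false) {
        return $text; // regex failure: leave the post untouched
    }

    // Frame 0 holds the text outside any quote; every open [quote]
    // pushes a new frame that collects its body and author.
    $stack = [['body' => '', 'name' => null]];

    foreach ($tokens as $token) {
        $top = count($stack) - 1;
        if ($token === '[quote]') {
            $stack[] = ['body' => '', 'name' => null];
        } elseif ($token === '[/quote]' && $top > 0) {
            $frame = array_pop($stack);
            if ($frame['name'] === null) {
                // No author found: keep the quote exactly as it was.
                $converted = '[quote]' . $frame['body'] . '[/quote]';
            } else {
                $converted = '[quote:' . $frame['name'] . ']'
                    . rtrim($frame['body']) . "\n[/quote]";
            }
            $stack[$top - 1]['body'] .= $converted;
        } elseif ($top > 0 && preg_match('#^\[name\](.*)\[/name\]$#s', $token, $m)) {
            // The author tag belongs to the innermost open quote.
            $stack[$top]['name'] = trim($m[1]);
        } else {
            // Plain text, or a stray tag outside any quote.
            $stack[$top]['body'] .= $token;
        }
    }

    // Unclosed quotes mean the post is malformed: return it unchanged.
    if (count($stack) > 1) {
        return $text;
    }
    return $stack[0]['body'];
}
```

Because the stack keeps each author with the quote that was open when the [name] tag was read, nesting depth does not matter, and malformed posts are left as they are rather than half-converted.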
Upvotes: 0
Reputation: 11169
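This function will do the job. It repeatedly converts the innermost quotation still in the old format, working outwards until none are left:
function reformat($str) {
    // The negative lookahead stops the first group from running across a
    // nested [quote], so each pass matches the innermost unconverted quote.
    // Without it, the outermost [quote] is paired with the first [name].
    $pattern = '#\[quote\]((?:(?!\[quote\]).)+)\[name\](.+)\[/name\]\s*\[/quote\]#Us';
    while (preg_match($pattern, $str, $matches)) {
        $str = str_replace($matches[0],
                           '[quote:'.$matches[2].']'.$matches[1].'[/quote]',
                           $str);
    }
    return $str;
}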
In action:
$before = "Just citing a post
[quote]Text of quote #3
[quote]Text of quote #2
[quote]Text of quote #1
[name]User 1[/name]
[/quote]
[name]User 2[/name]
[/quote]
[name]User 3[/name]
[/quote]";
echo reformat($before);
Outputs:
Just citing a post
[quote:User 3]Text of quote #3
[quote:User 2]Text of quote #2
[quote:User 1]Text of quote #1
[/quote]
[/quote]
[/quote]
Upvotes: 1
Reputation: 2834
This will do it:
$input = "Just citing a post
[quote]Text of quote #3
[quote]Text of quote #2
[quote]Text of quote #1
[name]User 1[/name]
[/quote]
[name]User 2[/name]
[/quote]
[name]User 3[/name]
[/quote]";
function fix_quotes($string) {
    $regexp = '`(\s*)\[quote\]((?:[^\[]|\[(?!quote\]))*?)\[name\](.*?)\[\/name\]\s*\[\/quote\]`';
    while (preg_match($regexp, $string)) {
        $string = preg_replace_callback($regexp, function($match) {
            return $match[1] . '[quote:' . $match[3] . ']' . trim(fix_quotes($match[2])) . $match[1] . '[/quote]';
        }, $string);
    }
    return $string;
}
echo fix_quotes($input);
Results in:
Just citing a post
[quote:User 3]Text of quote #3
[quote:User 2]Text of quote #2
[quote:User 1]Text of quote #1
[/quote]
[/quote]
[/quote]
Edit: I hadn't seen that joelhardi had already posted a similar solution. His looks a bit cleaner, so I'd stick with his :)
Upvotes: 1
Reputation: 17129
Because of the complexity involved here (you're going to need conditionals, as well as "match/replace all" functionality), I would recommend not doing this with regex alone. Use a programming language with good regex support, and combine regex with that language to do what you want. I recommend PHP.
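To illustrate the idea, here is a rough sketch of how the two could be combined. The function name migrate_post and the pass limit are assumptions made for this example. The regex does the "replace all" part on the innermost old-style quotes, and plain PHP supplies the loop and the conditional check afterwards.

```php
<?php
// Hypothetical helper (not from the thread): converts one post and reports
// whether any old-style tags are left over afterwards.
function migrate_post(string $post): array
{
    // The lookahead keeps group 1 from crossing a nested [quote], so only
    // the innermost old-style quotes match on each pass.
    $pattern = '#\[quote\]((?:(?!\[quote\]).)*?)\[name\](.*?)\[/name\]\s*\[/quote\]#s';
    $maxPasses = 50; // assumed upper bound on nesting depth, purely a safety net

    for ($pass = 0; $pass < $maxPasses; $pass++) {
        // "Replace all": every innermost match is rewritten in one call.
        $result = preg_replace($pattern, '[quote:${2}]${1}[/quote]', $post, -1, $count);
        if ($result === null || $count === 0) {
            break; // regex error, or nothing left to convert
        }
        $post = $result;
    }

    // The conditional part: flag posts that still contain old-style tags
    // (for example a quote without a [name]) for manual review.
    $isClean = preg_match('#\[quote\]|\[/?name\]#', $post) === 0;

    return [$post, $isClean];
}
```

A post for which the second return value is false still contains [quote] or [name] tags in the old format. That includes cases a pattern like this can get wrong, such as a quote with no author nested inside one that has an author, so those posts are better checked by hand than trusted blindly.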
Upvotes: 0