Reputation: 4506
What is a prudent approach to performing multiple String.Replace calls without replacing text that has already been replaced? For example, say I have this string:
str = "Stacks be [img]http://example.com/overflowing.png[/img] :/";
A Regex I wrote will match the [img]url[/img] block and let me replace it with the proper HTML <img> formatting:
str = "Stacks be <img src=\"http://example.com/overflowing.png\"/> :/";
Afterwards I perform String.Replace to replace emoticon codes (:/, :(, :P, etc.) with <img> tags. However, there are unintended results:
Intended Result
str = "Stacks be <img src=\"http://example.com/overflowing.png\"/> " +
"<img src=\"emote-sigh.png\"/>";
Actual (and obvious) Result
str = "Stacks be <img src=\"http<img src=\"emote-sigh.png\"/>" +
"/example.com/overflowing.png\"/> " +
"<img src=\"emote-sigh.png\"/>";
Unfortunately, with the number of replacements I plan to make, it seems impractical to try to do it all in a single Regex expression (though I'd imagine that would be the most performant solution). What is a (slower but) more maintainable way to do this?
Upvotes: 0
Views: 225
Reputation: 1069
If you do not want to use a complex Regex, then you can, for example, split the text into some kind of container.
You should split based on tokens found in the text. In your case a token is the text between [img] and [/img] (including those tags), that is, [img]http://example.com/overflowing.png[/img].
Then you can apply the [img] replace method to these tokens and the emoticon replace method to the rest of the elements in the container. Finally, output a string that concatenates all the container elements.
Below you will find example contents of such a container after the split procedure:
1. "Stacks be "
2. "[img]http://example.com/overflowing.png[/img]"
3. " :/"
To elements 1 and 3 you apply the emoticon replace, and to token element 2 you apply the [img] replace.
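The split-and-dispatch idea above could be sketched roughly as follows. This is only a minimal illustration, not the answerer's code: the class name SplitRewriter and its helper methods are hypothetical, and only the :/ emoticon is handled. It relies on the fact that Regex.Split includes captured text in its output when the pattern contains a capturing group, so with one group the segments alternate between plain text (even indices) and [img] tokens (odd indices).

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper illustrating the "split into a container" approach.
public static class SplitRewriter
{
    // The capturing parentheses make Regex.Split keep the [img]...[/img] tokens.
    private static readonly Regex ImgToken = new Regex(@"(\[img\].*?\[/img\])");

    public static string Rewrite(string input)
    {
        // Even indices hold plain text, odd indices hold the captured tokens.
        var parts = ImgToken.Split(input).Select((part, index) =>
            index % 2 == 1 ? ReplaceImg(part) : ReplaceEmoticons(part));
        return string.Concat(parts);
    }

    private static string ReplaceImg(string token)
    {
        return token.Replace("[img]", "<img src=\"").Replace("[/img]", "\"/>");
    }

    private static string ReplaceEmoticons(string text)
    {
        // Only one emoticon is shown; more String.Replace calls could follow.
        return text.Replace(":/", "<img src=\"emote-sigh.png\"/>");
    }
}
```

Because the emoticon replacement never sees the token segments, the :/ inside http:// is left alone.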
Upvotes: 1
Reputation: 6999
Here is the code which did the replace in my case. And the output is exactly what you want.
str = "Stacks be <img src=\"http://example.com/overflowing.png\"/> :/";
// check if the htmltemplate hold any template then set it or else hide the div data.
if (!String.IsNullOrEmpty(str))
{
divStaticAsset.InnerHtml = str.Replace("[img]", "<img src=\'").
Replace("[/img]", "\'/>") + "<img src=\'emote-sigh.png'/>";
}
Upvotes: 0
Reputation: 5649
Another alternative is to use a sort of modified lexer to isolate each discrete region of your text where a certain replacement is warranted, and to mark that block as used so that replacements aren't run on it again.
Here's an example of how you'd do that.
First, we'll create a class that indicates whether a particular string has been used or not:
public class UsageIndicator
{
public string Value { get; private set; }
public bool IsUsed { get; private set; }
public UsageIndicator(string value, bool isUsed)
{
Value = value;
IsUsed = isUsed;
}
public override string ToString()
{
return Value;
}
}
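Then we'll define a class that represents both how to locate a "token" in your text and what to do when it's been found (the code needs using directives for System, System.Collections.Generic, and System.Text.RegularExpressions):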
public class TokenOperation
{
public Regex Pattern { get; private set; }
public Func<string, string> Mutator { get; private set; }
public TokenOperation(string pattern, Func<string, string> mutator)
{
Pattern = new Regex(pattern);
Mutator = mutator;
}
private List<UsageIndicator> ExtractRegions(string source, int index, int length, out int matchedIndex)
{
var result = new List<UsageIndicator>();
var head = source.Substring(0, index);
matchedIndex = 0;
if (head.Length > 0)
{
result.Add(new UsageIndicator(head, false));
matchedIndex = 1;
}
var body = source.Substring(index, length);
body = Mutator(body);
result.Add(new UsageIndicator(body, true));
var tail = source.Substring(index + length);
if (tail.Length > 0)
{
result.Add(new UsageIndicator(tail, false));
}
return result;
}
public void Match(List<UsageIndicator> source)
{
for (var i = 0; i < source.Count; ++i)
{
if (source[i].IsUsed)
{
continue;
}
var value = source[i];
var match = Pattern.Match(value.Value);
if (match.Success)
{
int modifyIBy;
source.RemoveAt(i);
var regions = ExtractRegions(value.Value, match.Index, match.Length, out modifyIBy);
for (var j = 0; j < regions.Count; ++j)
{
source.Insert(i + j, regions[j]);
}
i += modifyIBy;
}
}
}
}
After taking care of those things, putting something together to do the replacement is pretty simple:
public class Rewriter
{
private readonly List<TokenOperation> _definitions = new List<TokenOperation>();
public void AddPattern(string pattern, Func<string, string> mutator)
{
_definitions.Add(new TokenOperation(pattern, mutator));
}
public void AddLiteral(string pattern, string replacement)
{
AddPattern(Regex.Escape(pattern), x => replacement);
}
public string Rewrite(string value)
{
var workingValue = new List<UsageIndicator> { new UsageIndicator(value, false) };
foreach (var definition in _definitions)
{
definition.Match(workingValue);
}
return string.Join("", workingValue);
}
}
In the demo code (below), keep in mind that the order in which pattern or literal expressions are added is important. The things that are added first get tokenized first. So, to prevent the :// in the URL from getting picked off as an emoticon plus a slash, we process the image block first: it contains the URL between the tags and is marked as used before the emoticon rule can try to get it.
class Program
{
static void Main(string[] args)
{
var rewriter = new Rewriter();
rewriter.AddPattern(@"\[img\].*?\[/img\]", x => x.Replace("[img]", "<img src=\"").Replace("[/img]", "\"/>"));
rewriter.AddLiteral(":/", "<img src=\"emote-sigh.png\"/>");
rewriter.AddLiteral(":(", "<img src=\"emote-frown.png\"/>");
rewriter.AddLiteral(":P", "<img src=\"emote-tongue.png\"/>");
const string str = "Stacks be [img]http://example.com/overflowing.png[/img] :/";
Console.WriteLine(rewriter.Rewrite(str));
}
}
The sample prints:
Stacks be <img src="http://example.com/overflowing.png"/> <img src="emote-sigh.png"/>
Upvotes: 2
Reputation: 10612
string[] emots = { ":/", ":(", ":)" };
// File names without extension; replaceEmots appends ".png".
string[] emotFiles = { "emote-sigh", "emot-sad", "emot-happy" };
string replaceEmots(string val)
{
string res = val;
for (int i = 0; i < emots.Length; i++)
res = res.Replace(emots[i], "<img src=\"" + emotFiles[i] + ".png\"/>");
return res;
}
void button1_click()
{
string str = "Stacks be <img src=\"http://example.com/overflowing.png\"/> :/";
str = replaceEmots(str);
}
Upvotes: 0
Reputation: 331
Here is a code snippet from my old project:
private string Emoticonize(string originalStr)
{
StringBuilder RegExString = new StringBuilder(@"(?<=^|\s)(?:");
foreach (KeyValuePair<string, string> e in Emoticons)
{
RegExString.Append(Regex.Escape(e.Key) + "|");
}
RegExString.Replace("|", ")", RegExString.Length - 1, 1);
RegExString.Append(@"(?=$|\s)");
MatchCollection EmoticonsMatches = Regex.Matches(originalStr, RegExString.ToString());
RegExString.Clear();
RegExString.Append(originalStr);
for (int i = EmoticonsMatches.Count - 1; i >= 0; i--)
{
RegExString.Replace(EmoticonsMatches[i].Value, Emoticons[EmoticonsMatches[i].Value], EmoticonsMatches[i].Index, EmoticonsMatches[i].Length);
}
return RegExString.ToString();
}
Emoticons is a Dictionary<string, string> in which the emoticon codes are the keys and the corresponding images are the values.
Upvotes: 0
Reputation: 52185
The most obvious approach would be to use a regular expression to replace whatever text you need. In short, you could use a regex such as :/[^/] to match :/ but not ://.
You could also use groups to know which pattern you have matched thus allowing you to know what to put.
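As a hedged variation on that pattern: :/[^/] also consumes the character after the emoticon and cannot match a :/ at the very end of the string, which is where it sits in the question's example. A negative lookahead avoids both problems. The sketch below is an illustration only; the class name SighEmoticon is hypothetical, and it handles just the one emoticon.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: replaces ":/" unless it is the start of "://".
public static class SighEmoticon
{
    // (?!/) asserts that no "/" follows, without consuming any character.
    private static readonly Regex Sigh = new Regex(@":/(?!/)");

    public static string Replace(string input)
    {
        return Sigh.Replace(input, "<img src=\"emote-sigh.png\"/>");
    }
}
```

This keeps URLs intact, but each additional emoticon that can collide with markup would need its own guard, which is the maintainability concern raised in the question.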
Upvotes: 3
Reputation: 198324
Unfortunately, with the number of replacements I plan to make, it seems impractical to try to do it all in a single Regex expression (though I'd imagine that would be the most performant solution). What is a (slower but) more maintainable way to do this?
Might seem so, but isn't. Take a look at this article.
tl;dr: Replace accepts a delegate as its second argument. So match on a pattern that is a disjunction of all the different things you want to simultaneously replace, and in the delegate use a Dictionary or a switch or a similar strategy to select the correct replacement for the current element.
The strategy in the article depends on the keys being static strings; if there are regexp operators in the keys, the concept fails. There is a better way: by wrapping each key in capture parentheses, you can test for the presence of the appropriate capture group to see which alternative matched.
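A minimal sketch of that single-pass idea, under stated assumptions: the class name SinglePassRewriter, the named group url, and the emoticon file names are illustrative choices, not taken from the linked article. The pattern is one alternation; the delegate checks whether the url group took part in the match and otherwise looks the matched text up in a Dictionary.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical single-pass rewriter using Regex.Replace with a delegate.
public static class SinglePassRewriter
{
    private static readonly Dictionary<string, string> Emoticons =
        new Dictionary<string, string>
        {
            { ":/", "emote-sigh.png" },
            { ":(", "emote-frown.png" },
            { ":P", "emote-tongue.png" }
        };

    // First alternative: an [img] block with the URL in the group "url".
    // Remaining alternatives: the emoticon keys, escaped so "(" is literal.
    private static readonly Regex Pattern = new Regex(
        @"\[img\](?<url>.*?)\[/img\]|" +
        string.Join("|", Emoticons.Keys.Select(key => Regex.Escape(key))));

    public static string Rewrite(string input)
    {
        return Pattern.Replace(input, match =>
        {
            Group url = match.Groups["url"];
            // The group succeeds only when the [img] alternative matched.
            string source = url.Success ? url.Value : Emoticons[match.Value];
            return "<img src=\"" + source + "\"/>";
        });
    }
}
```

Since the input is scanned once and each match is consumed whole, the :/ inside an [img] block is never offered to the emoticon alternatives, and replacement output is never rescanned.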
Upvotes: 3
Reputation: 3856
You can chain the replacements like this:
str = str.Replace("[img]", "<img src=\"").Replace("[/img]", "\"/>");
It should work.
Upvotes: 0