Reputation: 135
I am trying to do a token replacement in an HTML string.
My untokenised string has multiple <input></input>
tags. I want to replace each one with a token built from its name attribute, <<VS_USER_NAME>>
for example. But my regex replaces all the <input>
tags at once, regardless of name. Below is a standalone example.
This is the desired output:
<div>username <<VS_USER_NAME>></div><div> </div><div>full name <<VS_USER_FULL_NAME>></div><div> </div><div>password <<VS_USER_PASSWORD>></div><div> </div><div>thanks</div>
Code:
static void Main(string[] args)
{
string text = "<div>username <input class=\"VSField\" contenteditable=\"false\" name=\"VS_USER_NAME\" style=\"background-color: rgb(220,220,200);\">[User Name]</input></div><div> </div><div>full name <input class=\"VSField\" contenteditable=\"false\" name=\"VS_USER_FULL_NAME\" style=\"background-color: rgb(220,220,200);\">[Full Name]</input></div><div> </div><div>password <input class=\"VSField\" contenteditable=\"false\" name=\"VS_USER_PASSWORD\" style=\"background-color: rgb(220,220,200);\">[Password]</input></div><div> </div><div>thanks</div>";
string textTokenised = GetTokenisedText(text, "VS_USER_NAME", "VS_USER_FULL_NAME", "VS_USER_PASSWORD");
}
private static string GetTokenisedText(string untokenised, params string[] tokenKeys)
{
foreach (string tokenKey in tokenKeys)
{
string string2 = GetToken(tokenKey);
string string1 = GetRegex(tokenKey);
untokenised = Regex.Replace(untokenised, string1, string2);
}
return untokenised;
}
private static string GetToken(string tokenKey)
{
return string.Format("<<{0}>>", tokenKey);
}
private static string GetRegex(string tokenKey)
{
return string.Format("()<input([^>]*e*)name=\"{0}\"([^>]*e*)>(.*)</input>", tokenKey);
}
Upvotes: 0
Views: 211
Reputation: 626845
Here is an example of how you can do the same with HtmlAgilityPack:
private static string GetTokenisedText(string untokenised, params string[] tokenKeys)
{
var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(untokenised);
var query = doc.DocumentNode.Descendants("input");
foreach (var item in query.ToList())
{
var value = item.GetAttributeValue("name", string.Empty);
if (!string.IsNullOrEmpty(value))
{
var token = tokenKeys.Where(p => p == value).FirstOrDefault();
if (!string.IsNullOrEmpty(token))
{
// The placeholder text ends up as the next sibling of <input>; guard against a missing sibling.
item.NextSibling?.Remove();
var newNode = HtmlAgilityPack.HtmlTextNode.CreateNode(string.Format("{{{{{0}}}}}", token.ToUpper()));
item.ParentNode.ReplaceChild(newNode, item);
}
}
}
return doc.DocumentNode.OuterHtml;
}
Output:
<div>username {{VS_USER_NAME}}</div><div> </div><div>full name {{VS_USER_FULL_NAME}}</div><div> </div><div>password {{VS_USER_PASSWORD}}</div><div> </div><div>thanks</div>
{{ and }} are preferable markers to << and >> in an (X)HTML document, because < and > are reserved as markup delimiters.
You can install HtmlAgilityPack using the Manage NuGet Packages for Solution menu item when right-clicking your solution.
Upvotes: 1
Reputation: 13640
The .* in your regex is greedy by default, so it runs on to the last </input> in the string. You have to make it non-greedy by adding ? after it. Use the following:
return string.Format("()<input([^>]*e*)name=\"{0}\"([^>]*e*)>(.*?)</input>", tokenKey);
(The only change is the ? added after .* in the last group.)
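To see the difference side by side, here is a minimal sketch rather than a drop-in replacement. It assumes a simplified pattern (the empty () group and the e* parts are dropped, since they do not appear to affect the match), shortened input HTML, and a hypothetical helper named Tokenise with a lazy switch that is not in the original code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyVsLazyDemo
{
    // Hypothetical helper: replaces the <input> whose name equals tokenKey.
    // The pattern is a simplified version of the one in the question (assumption).
    public static string Tokenise(string html, string tokenKey, bool lazy)
    {
        // ".*" grabs as much as possible; ".*?" stops at the first "</input>".
        string body = lazy ? ".*?" : ".*";
        string pattern = "<input[^>]*name=\"" + Regex.Escape(tokenKey) + "\"[^>]*>"
                         + body + "</input>";
        return Regex.Replace(html, pattern, "<<" + tokenKey + ">>");
    }

    public static void Main()
    {
        string html =
            "<div>a <input name=\"VS_USER_NAME\">[User Name]</input></div>" +
            "<div>b <input name=\"VS_USER_PASSWORD\">[Password]</input></div>";

        // Greedy: the match runs to the LAST </input>, swallowing the second field.
        Console.WriteLine(Tokenise(html, "VS_USER_NAME", lazy: false));
        // prints: <div>a <<VS_USER_NAME>></div>

        // Lazy: the match ends at the FIRST </input>, so the second field survives.
        Console.WriteLine(Tokenise(html, "VS_USER_NAME", lazy: true));
        // prints: <div>a <<VS_USER_NAME>></div><div>b <input name="VS_USER_PASSWORD">[Password]</input></div>
    }
}
```

Regex.Escape is used here only as a precaution in case a token key ever contains regex metacharacters; the keys in the question do not.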
Upvotes: 1