Reputation: 774
I built an extension method to convert HTML-formatted text into something better suited to a list view. It removes all HTML tags, except that it replaces <h> and <p> tags with <br /> to keep the text readable in the list view. It also shortens the text for longer posts. I render the result in my Razor view with Html.Raw(model.text).
public static string FixHTML(string input, int? strLen)
{
    string s = input.Trim();
    s = Regex.Replace(s, "</p.*?>", "<br />");
    s = Regex.Replace(s, "</h.*?>", "<br />");
    s = s.Replace("<br />", "*ret$990^&");
    s = Regex.Replace(s, "<.*?>", String.Empty);
    s = Regex.Replace(s, "</.*", String.Empty);
    s = s.Replace("*ret$990^&", "<br />");
    int i = (strLen ?? s.Length);
    s = s.Substring(0,(i > s.Length ? s.Length : i));
    return(s);
}
PROBLEM: if the text gets cut off in the middle of a <br />, it messes up the displayed text. For example, if it gets cut off at blah blah blah <br then the display isn't nice. How can I use a regex (or even a string replace) to find only the last occurrence of <b... and only if it doesn't have a closing >?
I was thinking of something like:
s = string.Format(s.Substring(0, s.Length-6) + Regex.Replace(s.Substring(s.Length - 6), "<.*", string.Empty));
That will probably work, but my whole converter seems to use a ton of code for something that should be relatively simple.
How can I do this?
Upvotes: 1
Views: 603
Reputation: 4240
Try this:
s = Regex.Replace(s, "(<|<b|<br|<br/)$", "", RegexOptions.None);
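Note that the alternatives in that pattern cover <, <b, <br and <br/, but the question's tag is <br /> with a space, so a cut at "<br " or "<br /" would slip through. A more general variant, offered only as a sketch, is to strip any < at the end of the string that is never followed by a closing >. The class and method names below (HtmlTruncation, TrimDanglingTag) are made up for illustration and are not part of the original code.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not from the original post.
public static class HtmlTruncation
{
    // "<" followed by any run of non-">" characters up to the end of the
    // string, i.e. a tag that was opened but never closed.
    private static readonly Regex DanglingTag = new Regex("<[^>]*$");

    public static string TrimDanglingTag(string s)
    {
        // Complete tags such as "<br />" are left alone because the
        // match cannot cross a ">".
        return DanglingTag.Replace(s, string.Empty);
    }
}
```

It would be called right after the Substring line, e.g. s = HtmlTruncation.TrimDanglingTag(s);. One caveat: a literal < that survives in the text with no > after it would also be removed along with everything following it, so this assumes the only angle brackets left at that point belong to <br /> tags.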
Upvotes: 2