Reputation: 191
I have a text area that accepts HTML markup, so any HTML code can be entered into it.
Now I want to convert that HTML to plain text without using a third-party tool. How can this be done?
Currently I am doing it like this:
var desc = Convert.ToString(Html.Raw(Convert.ToString(drJob["Description"])));
drJob is the DataRow from which I fetch the description (drJob["Description"]), and I want to convert that description to plain text.
Upvotes: 1
Views: 4902
Reputation: 607
using System.Text.RegularExpressions;

private void button1_Click(object sender, EventArgs e)
{
    string sauce = htm.Text; // htm = your HTML box
    // Match each run of text between a '>' (or the start) and a '<' (or the end)
    Regex myRegex = new Regex(@"(?<=^|>)[^><]+?(?=<|$)", RegexOptions.Compiled);
    foreach (Match iMatch in myRegex.Matches(sauce))
    {
        txt.AppendText(Environment.NewLine + iMatch.Value); // txt = your destination box
    }
}
Let me know if you need more clarification.
[EDIT:] Be aware that this does not produce clean output, so add a step to strip empty matches and stray line breaks. Extracting the text between tags should work fine, though. If you want to save space, use the regex and see if it works for you. The person who posted that regex is not a clean way to do this is right, and there may be other ways; regex usually works best when you are pulling a single type of tag out of the HTML. (I use it in Rainmeter to parse things and have never had any issues.)
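The cleanup step mentioned in the edit could look roughly like the sketch below. It reuses the same pattern but trims each match and skips the whitespace-only ones. The class and method names (`TextBetweenTags`, `Extract`) are made up for illustration, and it returns a string instead of writing to a text box.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class TextBetweenTags
{
    // Same pattern as in the answer: runs of text that sit between tags.
    private static readonly Regex Pattern =
        new Regex(@"(?<=^|>)[^><]+?(?=<|$)", RegexOptions.Compiled);

    // Hypothetical helper: returns the text runs, one per line,
    // dropping matches that contain only whitespace or line breaks.
    public static string Extract(string html)
    {
        var parts = new List<string>();
        foreach (Match m in Pattern.Matches(html))
        {
            string piece = m.Value.Trim();
            if (piece.Length > 0)
            {
                parts.Add(piece);
            }
        }
        return string.Join(Environment.NewLine, parts);
    }
}
```

For example, `TextBetweenTags.Extract("<p>Hello</p>\n<p> World </p>")` yields the two lines `Hello` and `World`, without the blank line that the newline between the paragraphs would otherwise produce.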
Upvotes: 0
Reputation: 359
You can replace HTML tags with an empty string using System.Text.RegularExpressions.Regex:
String desc = Regex.Replace(drJob["Description"].ToString(), @"<[^>]*>", String.Empty);
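Note that stripping tags alone leaves entities such as `&amp;` in the result and can fuse words that were separated only by a tag. A minimal sketch of one way to handle both, assuming .NET 4.0 or later so that `System.Net.WebUtility.HtmlDecode` is available; the `HtmlToText` class and `Strip` method are illustrative names, and replacing tags with a space (rather than `String.Empty`) is a deliberate change from the one-liner above:

```csharp
using System.Net;
using System.Text.RegularExpressions;

static class HtmlToText
{
    // Hypothetical helper: strips tags, decodes entities, collapses whitespace.
    public static string Strip(string html)
    {
        // Replace each tag with a space so "</p><p>" does not glue two words together.
        string noTags = Regex.Replace(html, @"<[^>]*>", " ");

        // Decode entities only after the tags are gone, so an encoded "&lt;b&gt;"
        // stays in the output as literal text instead of being stripped as a tag.
        string decoded = WebUtility.HtmlDecode(noTags);

        // Collapse runs of whitespace and trim the ends.
        return Regex.Replace(decoded, @"\s+", " ").Trim();
    }
}
```

With this, `HtmlToText.Strip("<p>Tom &amp; Jerry</p><b>rule</b>")` returns `Tom & Jerry rule`. It still shares the limits of any tag-stripping regex, so treat it as a convenience for simple markup rather than a parser.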
Upvotes: 1
Reputation: 15130
There is no built-in way in .NET to do this directly. You either need to resort to a third-party tool such as HtmlAgilityPack, or do it in JavaScript:
document.getElementById('myTextContainer').innerText = document.getElementById('myMarkupContainer').innerText;
For your safety, don't use a regex. ( http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html )
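To make the warning concrete, here is a small sketch of two inputs on which a simple tag-stripping pattern such as `<[^>]*>` gives the wrong result. The `RegexPitfall` class and `StripTags` method are illustrative names, and the inputs are made up for the demonstration.

```csharp
using System.Text.RegularExpressions;

static class RegexPitfall
{
    // Naive tag stripper: removes everything from a '<' up to the next '>'.
    public static string StripTags(string html)
    {
        return Regex.Replace(html, @"<[^>]*>", string.Empty);
    }
}
```

`RegexPitfall.StripTags("<a title=\"a > b\">link</a>")` returns ` b">link`, because the `>` inside the quoted attribute ends the match early. `RegexPitfall.StripTags("<script>var x = 1;</script>Hi")` returns `var x = 1;Hi`, because the script body is not a tag and so survives as if it were text.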
Upvotes: 2