Reputation: 265
I've saved the following to a text file, which I then read into my C# program as a list and joined into a single string. Now I want to decode the HTML in that string, but I can't get it to work. Does anyone know how? Here is the text to format:
<p> <span style="font-size: 18px;"><strong>Varifrån kommer den svarta märren i Småland?</strong></span></p>
<p> <span style="font-size: 14px;"><input checked="checked" name="ruta1" type="checkbox" value="Svar 1" /> Från Tyskland</span></p>
<p> <input type="checkbox" />Från Belgien</p>
<p> </p>
<p> <input type="checkbox" /> Från Turkiet</p>
<p> </p>
<p> </p>
<p> </p>
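using System.Collections.Generic;
using System.IO;
using System.Windows.Forms;

public partial class Form1 : Form
{
    string temp = "TextKod.txt";
    string line = "";
    List<string> texten = new List<string>();
    string vetEj;
    string hoppSan;

    public Form1()
    {
        InitializeComponent();
        StreamReader sr = new StreamReader(temp);
        // ReadLine() already strips the line terminator, so Split('\r') yields a single element
        while ((line = sr.ReadLine()) != null)
        {
            string[] myarray = line.Split('\r');
            vetEj = myarray[0];
            texten.Add(vetEj);
        }
        // Join the lines back into one string containing the raw HTML
        hoppSan = string.Join("\r", texten);
    }
}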
Upvotes: 0
Views: 80
Reputation: 67898
I think what you really want is to encode the string. Either way, add a reference to System.Web and use the HttpUtility class. To decode:
HttpUtility.HtmlDecode(htmlString);
and to encode:
HttpUtility.HtmlEncode(htmlString);
To get rid of all HTML elements, do this:
var cleanHtml = Regex.Replace(htmlString, "<.*?>", "");
You could change the Regex to <.*?>|&.*?; to also get rid of entities such as &nbsp;, but that also matches the &aring; in Fr&aring;n Tyskland (the encoded å), so that's up to you.
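To make the encode/decode distinction concrete, here is a minimal sketch. The class and method names (EncodeDecodeDemo, Decode, Encode) are made up for illustration, and it assumes System.Web is referenced as described above:

```csharp
using System;
using System.Web; // HttpUtility; on .NET Framework this requires a reference to System.Web

public static class EncodeDecodeDemo
{
    // Decoding turns entities such as &aring; or &lt; back into characters.
    public static string Decode(string s)
    {
        return HttpUtility.HtmlDecode(s);
    }

    // Encoding escapes markup characters so the text is shown literally in a page.
    public static string Encode(string s)
    {
        return HttpUtility.HtmlEncode(s);
    }

    public static void Main()
    {
        Console.WriteLine(Decode("Fr&aring;n Tyskland &lt;b&gt;")); // Från Tyskland <b>
        Console.WriteLine(Encode("<p>Hej</p>"));                    // &lt;p&gt;Hej&lt;/p&gt;
    }
}
```

Note that decoding alone leaves the tags in place; it only converts entities. Removing the tags is a separate step.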
Upvotes: 1
Reputation: 6490
If you are using .NET 4.0+ you can also use WebUtility.HtmlDecode which does not require an extra assembly reference as it is available in the System.Net namespace.
This could also help, if you need to encode instead:
myEncodedString = HttpUtility.HtmlEncode(myString);
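Putting the suggestions in this thread together, one possible approach is to strip the tags with the regex from the earlier answer and then decode the remaining entities with WebUtility.HtmlDecode. This is only a sketch: HtmlText and ToPlainText are hypothetical names, and a regex is a rough tool that will not cope with every kind of HTML (comments, scripts, a ">" inside an attribute value).

```csharp
using System;
using System.Net;                     // WebUtility (.NET 4.0+), no System.Web reference needed
using System.Text.RegularExpressions;

public static class HtmlText
{
    // Hypothetical helper: remove tags first, then decode entities.
    // Doing it in this order keeps encoded text such as "&lt;b&gt;" from being
    // mistaken for a real tag and stripped.
    public static string ToPlainText(string html)
    {
        // Singleline lets '.' match newlines, so tags that span lines are removed too.
        string withoutTags = Regex.Replace(html, "<.*?>", "", RegexOptions.Singleline);
        return WebUtility.HtmlDecode(withoutTags).Trim();
    }
}

public static class Program
{
    public static void Main()
    {
        string html = "<p> <input type=\"checkbox\" /> Fr&aring;n Turkiet</p>";
        Console.WriteLine(HtmlText.ToPlainText(html));
    }
}
```

In the question's code, the same call would be applied to the joined string, for example HtmlText.ToPlainText(hoppSan).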
Upvotes: 0