Reputation: 15345
I have overridden a SharePoint page's Render method to cut a script tag out of the HTML sent to the client browser, like this:
protected override void Render(HtmlTextWriter originalWriter)
{
    string content = string.Empty;
    using (StringWriter stringWriter = new StringWriter())
    {
        using (HtmlTextWriter htmlWriter = new HtmlTextWriter(stringWriter))
        {
            //render the page to my temp writer
            base.Render(htmlWriter);
            htmlWriter.Close();
            //get page content that would normally be sent to client
            content = stringWriter.ToString();
            stringWriter.Close();
        }
    }
    //replace the script tag
    Regex regex = new Regex(@"<script>.*RTE_ConvertTextAreaToRichEdit.*<"+"/script>");
    content = regex.Replace(content, string.Empty);
    //write modified html to the original writer
    originalWriter.Write(content);
}
After this change something strange happened: the part of the page that is usually in the upper-right corner and says "Welcome XXX" is not displayed properly. When I view the source of the page, this text is written BEFORE the HTML tag, before any HTML starts. I have been trying to figure out what is going on for the last two days.
Do you have any ideas? Has anyone had a similar problem?
Upvotes: 1
Views: 1428
Reputation: 71
This is actually a SharePoint issue; it happens in 2010 and 2013 as well.
If you manipulate Render the way you have in your sample, the output gets mangled.
You can't write back to the original writer directly. Use this instead:
// render the page into an in-memory buffer instead of the original writer
StringBuilder sb = new StringBuilder();
StringWriter str = new StringWriter(sb);
HtmlTextWriter wrt = new HtmlTextWriter(str);
base.Render(wrt);
wrt.Close();
string html = sb.ToString();
// SomeFunctionManipulatingYourHTML stands in for your own HTML clean-up
html = SomeFunctionManipulatingYourHTML(html).Trim();
if (html.Length > 0)
{
    // replace the buffered response with the modified HTML
    Response.Buffer = true;
    Response.Clear();
    Response.ContentType = "text/html";
    Response.Write(html);
    Response.Flush();
    Response.End();
}
Worked for me... You will still have to populate the "Welcome" message in the top-right corner yourself, as that is populated after Render, but at least you can now manipulate the output and have clean HTML. I just populated that part of the page using RegEx after the fact.
Upvotes: 0
Reputation:
HTMLAgilityPack is full of bugs, don't use it! If you need a simple solution, you can write your own method. Otherwise it will be much better to use https://github.com/google/gumbo-parser, which has a .NET wrapper called gumbo.bindings.
Upvotes: 0
Reputation: 6963
You may have some luck using the HTML Agility Pack. HTML parsers are better at... parsing... HTML than regexes are.
http://www.codeplex.com/htmlagilitypack
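As a rough sketch of what the parser-based approach could look like, assuming the HtmlAgilityPack NuGet package is referenced. The class name ScriptRemover and the method RemoveScriptsContaining are hypothetical names chosen for illustration, not part of the library. Note that the parser may re-serialize the surrounding markup slightly differently from the input.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is installed

public static class ScriptRemover
{
    // Hypothetical helper: removes every <script> element whose body contains `marker`.
    public static string RemoveScriptsContaining(string html, string marker)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var scripts = doc.DocumentNode.SelectNodes("//script");
        if (scripts == null)
        {
            return html;
        }

        // Copy to a list first so that removing nodes cannot disturb the enumeration.
        foreach (var script in scripts.ToList())
        {
            if (script.InnerHtml.Contains(marker))
            {
                script.Remove();
            }
        }

        return doc.DocumentNode.OuterHtml;
    }
}
```

Inside the overridden Render method this could be called as content = ScriptRemover.RemoveScriptsContaining(content, "RTE_ConvertTextAreaToRichEdit"); in place of the Regex replace.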
Upvotes: 2
Reputation: 5195
Have you checked your Regex? Regex quantifiers are greedy by default, which means they return the longest possible match.
So if your HTML looks something like this:
<html>
...
<!-- first script element -->
<script>...RTE_ConvertTextAreaToRichEdit...</script>
<!-- first script element ends -->
<!-- second script element -->
<script>...</script>
<!-- second script element ends -->
...
</html>
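Assuming the markup is on a single line (or RegexOptions.Singleline is set, since . does not match newlines by default), the Regex matches everything between the start of the first script element and the end of the second script element. After the replace, your output would be: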
<html>
...
<!-- first script element -->
<!-- second script element ends -->
...
</html>
You can turn your Regex into a non-greedy (lazy) one, which finds the smallest possible match. Add a ? after each * and that should do it:
Regex regex = new Regex(@"<script>.*?RTE_ConvertTextAreaToRichEdit.*?</script>");
This might solve the problem. Look here for more info.
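To make the difference concrete, here is a small stand-alone sketch that runs both patterns over the same single-line input. The class name GreedyVsLazyDemo, the method names, and the sample markup are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyVsLazyDemo
{
    // Greedy: .* grabs as much as it can, so the match runs to the LAST </script>.
    public static string StripGreedy(string html) =>
        Regex.Replace(html, @"<script>.*RTE_ConvertTextAreaToRichEdit.*</script>", string.Empty);

    // Lazy: .*? grabs as little as it can, so the match stops at the FIRST </script>
    // after the marker.
    public static string StripLazy(string html) =>
        Regex.Replace(html, @"<script>.*?RTE_ConvertTextAreaToRichEdit.*?</script>", string.Empty);

    public static void Main()
    {
        // Sample input on one line, because '.' does not match newlines by default.
        string html = "<p>a</p><script>RTE_ConvertTextAreaToRichEdit();</script>"
                    + "<p>b</p><script>other();</script><p>c</p>";

        Console.WriteLine(StripGreedy(html)); // <p>a</p><p>c</p>
        Console.WriteLine(StripLazy(html));   // <p>a</p><p>b</p><script>other();</script><p>c</p>
    }
}
```

One caveat: the lazy pattern still begins at the first <script> that precedes the marker, so an unrelated script element placed before the target one would be removed along with it.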
Upvotes: 2