Reputation: 764
AntiXssEncoder.HtmlEncode is supported only on .NET Framework. Can I use WebUtility.HtmlEncode for anti-XSS encoding instead, since our application targets .NET Core 2.1?
Upvotes: 4
Views: 4558
Reputation: 155085
> AntiXssEncoder.HtmlEncode is supported only on .NET Framework. Can I use WebUtility.HtmlEncode for anti-XSS encoding instead, since our application targets .NET Core 2.1?
Correct.

But I want to stress that there is no such thing as an "anti-XSS HTML-encoder", because all correctly-implemented HTML-encoders will protect your website from XSS attacks when used correctly.

(I don't know why Microsoft named it `AntiXssEncoder`, but the fact that the main `HtmlEncode` implementation at the time was actually buggy and insecure probably had something to do with it; that's ancient history now.)

In .NET Core 2.1, you only need to use `System.Net.WebUtility.HtmlEncode`.
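A minimal sketch of the .NET Core 2.1 approach, assuming a plain console program. The `HtmlEncodeDemo` class and its `Encode` wrapper are illustrative names; only `System.Net.WebUtility.HtmlEncode` is the framework API:

```csharp
using System;
using System.Net;

public static class HtmlEncodeDemo
{
    // Thin illustrative wrapper; WebUtility.HtmlEncode does the actual work.
    public static string Encode(string untrusted) => WebUtility.HtmlEncode(untrusted);

    public static void Main()
    {
        // Contains all five characters that are significant in HTML text
        // and in quoted attribute values: < > & " '
        Console.WriteLine(Encode("<b title=\"x\" data-y='z'>Tom & Jerry</b>"));
    }
}
```

The printed string should have `<`, `>`, `&`, `"` and `'` all replaced by entities, so it can be placed in element content or inside a quoted attribute value.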
In other .NET releases (especially historical versions), things are complicated; read on if you dare...

Why `AntiXssEncoder` (aka `AntiXss` and `AntiXss.Encoder`) exists - and why it's obsolete in 2021:

The `AntiXssEncoder` class from the `AntiXss` NuGet package (aka `Microsoft.Security.Application.AntiXss`) is obsolete, and has been since 2014, when it was moved to `System.Web.Security.AntiXss`.

(`AntiXssEncoder`, `Encoder`, and `AntiXss` are just alternative APIs for the same underlying implementation in `Encoder`, btw.)

The `AntiXssEncoder` in `System.Web.Security.AntiXss` is not available in .NET Core 2.1. However, this is not a significant problem:
The original `Microsoft.Security.Application.AntiXss` was created because `HttpUtility.HtmlEncode` was considered insecure: it did not encode single-apostrophe characters, so XSS attacks were possible against ASP.NET 1.x and ASP.NET 2.x WebForms (`.aspx`) pages that used single-apostrophes to delimit HTML attributes that contained user-specified values.
For example:
String userProvidedValue = "bad.gif' onerror='alert()";
<img src='<%= this.Server.HtmlEncode( userProvidedValue ) %>' />
...which will be rendered as:
<img src='bad.gif' onerror='alert()' />
However, this issue was fixed in ASP.NET 4.0 when `HttpUtility.HtmlEncode` was corrected to also HTML-encode those apostrophes. So the exact same code above will now be rendered as below, which won't show an `alert()`:
<img src='bad.gif&#39; onerror=&#39;alert()' />
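The fixed behaviour can be checked with a short sketch. It uses `System.Net.WebUtility.HtmlEncode` as a stand-in, on the assumption that it treats apostrophes the same way as the corrected `HttpUtility.HtmlEncode`; `ImgTagDemo` and `RenderImg` are illustrative names:

```csharp
using System;
using System.Net;

public static class ImgTagDemo
{
    // Builds the <img> tag from the example above; the src attribute is
    // delimited by apostrophes, which is the case the old encoder got wrong.
    public static string RenderImg(string userProvidedValue)
    {
        return "<img src='" + WebUtility.HtmlEncode(userProvidedValue) + "' />";
    }

    public static void Main()
    {
        Console.WriteLine(RenderImg("bad.gif' onerror='alert()"));
    }
}
```

Both apostrophes in the user-provided value should come out as `&#39;`, so the value can no longer close the `src` attribute and inject an `onerror` handler.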
`AntiXssEncoder` also supported specifying a list of excluded Unicode code-points or `Char` values. This was added because `AntiXssEncoder` defaulted to hex-encoding all `Char` values (not code-points!) above 0xFFFF, which unfortunately meant that even completely safe text in Arabic, Hebrew, Kanji, etc. would be escaped, making the raw HTML almost unreadable and ballooning the output HTML length.
For example, the (gibberish) string "لك أن كلا" would be rendered as "&#1604;&#1603; &#1571;&#1606; &#1603;&#1604;&#1575;" - which isn't good.
Fortunately, `AntiXssEncoder.MarkAsSafe` can be used to exclude character ranges at the programmer's discretion.

By the time .NET Core 2.1 came out, the `System.Net.WebUtility` class (not to be confused with `System.Web.HttpUtility`, of course) had been improved so that it does not unnecessarily hex-encode high `Char` values and does HTML-encode apostrophes, which is why `AntiXssEncoder` was no longer needed.
In .NET Core 3.1 (and later, including .NET 5 and .NET 6) things improved further, but also got a bit confusing...

`System.Text.Encodings.Web.HtmlEncoder` was added. This is a separate implementation (instead of simply wrapping `WebUtility`) which brings back `AntiXssEncoder`'s ability to exclude ranges of characters from encoding, just in case you need that functionality. But it's a bit of an edge-case, imo.
To do that, call `HtmlEncoder.Create(TextEncoderSettings)` with a `TextEncoderSettings` object that has the required char ranges excluded.

In .NET Core 3.1, for the sake of back-compat, Microsoft brought back `System.Web.HttpUtility`; however, this is just another wrapper over `WebUtility.HtmlEncode`.
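Going back to `HtmlEncoder` for a moment: a minimal sketch of excluding a character range from encoding, assuming .NET Core 3.1 or later. The class and method names are illustrative, and the choice of the Arabic block is only an example:

```csharp
using System;
using System.Text.Encodings.Web;
using System.Text.Unicode;

public static class HtmlEncoderRangesDemo
{
    // Custom encoder that leaves Basic Latin and the Arabic block unescaped.
    private static readonly HtmlEncoder ArabicSafe = HtmlEncoder.Create(
        new TextEncoderSettings(UnicodeRanges.BasicLatin, UnicodeRanges.Arabic));

    // The default encoder only passes Basic Latin through unescaped.
    public static string EncodeDefault(string s) => HtmlEncoder.Default.Encode(s);

    public static string EncodeArabicSafe(string s) => ArabicSafe.Encode(s);

    public static void Main()
    {
        const string text = "\u0644\u0643 <b>"; // two Arabic letters, then markup
        Console.WriteLine(EncodeDefault(text));    // Arabic letters are escaped
        Console.WriteLine(EncodeArabicSafe(text)); // Arabic letters are kept as-is
    }
}
```

In both cases `<` and `>` are still encoded; the allowed ranges only control which otherwise harmless characters are left readable in the raw HTML.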
`HttpUtility` also has `HtmlAttributeEncode`, which does not encode single-apostrophes. There is no good reason to use this method, imo. I'm surprised Microsoft hasn't annotated it with `[Obsolete]`, actually.

However, in .NET Core (and .NET 5 and later) there isn't any way to HTML-encode text such that named entities are used instead of hex-encoded entities (other than `&lt;`, `&gt;`, and `&amp;`).
The `AntiXssEncoder.HtmlEncode` method (in both `Microsoft.Security` and `System.Web.Security`) had a `useNamedEntities` parameter, which involved a large hard-coded table of known entity names, e.g. `£` becomes `&pound;` instead of `&#163;`.
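To illustrate what such a lookup table amounts to, here is a minimal sketch that post-processes `WebUtility.HtmlEncode` output. It is not how `AntiXssEncoder` is implemented; the class name and the three-entry table are hypothetical, and it assumes that `WebUtility.HtmlEncode` emits decimal references such as `&#163;` for characters in the U+00A0 to U+00FF range:

```csharp
using System;
using System.Collections.Generic;
using System.Net;

public static class NamedEntityDemo
{
    // Tiny illustrative table; a complete one would list every HTML named entity.
    private static readonly Dictionary<string, string> NumericToNamed =
        new Dictionary<string, string>
        {
            { "&#160;", "&nbsp;" },
            { "&#163;", "&pound;" },
            { "&#169;", "&copy;" },
        };

    public static string Encode(string untrusted)
    {
        // Encode first: any '&' typed by the user becomes "&amp;", so it can
        // never be mistaken for one of the numeric references replaced below.
        string encoded = WebUtility.HtmlEncode(untrusted);
        foreach (KeyValuePair<string, string> pair in NumericToNamed)
        {
            encoded = encoded.Replace(pair.Key, pair.Value);
        }
        return encoded;
    }

    public static void Main()
    {
        Console.WriteLine(Encode("\u00A35 & <tax>"));
    }
}
```

The only reason to do this is readability of the raw HTML; browsers treat the named and numeric forms identically.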
HTML5 defines the `&#nnnn;`-syntax as a means of encoding Unicode code-points specifically, as opposed to a character-value in some other encoding scheme, whereas previously the HTML4 spec referred to ISO 10646 (aka UCS) characters, which is not Unicode as we know it today. (I suspect that browsers may have tried to map characters based on the document's encoding/code-page if the page wasn't encoded using Unicode (e.g. Shift-JIS), but I might be wrong.)

Finally, here's a table comparing the output from all of the different `HtmlEncode` methods found in .NET as of 2021:
`HtmlEncode` methods available in .NET Framework 4.8

The following `HtmlEncode` methods are excluded because they're just wrappers over other implementations:

- `System.Web.HttpServerUtility` (aka `Server.HtmlEncode`) just forwards to `HttpUtility.HtmlEncode`.
- `System.Web.UI.HtmlTextWriter.WriteEncodedText` also forwards to `HttpUtility.HtmlEncode`.
- `System.Web.HttpUtility.HtmlEncode`:
  - The `HttpUtility.HtmlEncode` method just forwards to `System.Web.Util.HttpEncoder.Current.HtmlEncode(s)`.
  - `System.Web.Util.HttpEncoder.Current` can be replaced at runtime, which is how an update to ASP.NET 4.x (I forget which) was able to make almost everyone use (the then far-better) `AntiXssEncoder` without people needing to change their existing application code. Neat.
  - `System.Web.Util.HttpEncoder.Current` can point to any compatible implementation, while `System.Web.Util.HttpEncoder.Default` is _always_ just a wrapper over `WebUtility.HtmlEncode`.
- `System.Web.Util.HttpEncoder.Default` - as mentioned above, this is just another `System.Net.WebUtility` wrapper.

| # | Input | Code-point(s) | UTF-8 bytes | UTF-16 bytes | System.Net.WebUtility.HtmlEncode | System.Text.Encodings.Web.HtmlEncoder | System.Web.Security.AntiXss.AntiXssEncoder.HtmlEncode(false) | System.Web.Security.AntiXss.AntiXssEncoder.HtmlEncode(true) |
|---|---|---|---|---|---|---|---|---|
| 0 | abc | U+0061 U+0062 U+0063 | 61 62 63 | 61 00 62 00 63 00 | abc | abc | abc | abc |
| 1 | < | U+003C | 3C | 3C 00 | < | < | < | < |
| 2 | > | U+003E | 3E | 3E 00 | > | > | > | > |
| 3 | & | U+0026 | 26 | 26 00 | & | & | & | & |
| 4 | " | U+0022 | 22 | 22 00 | " | " | " | " |
| 5 | ' | U+0027 | 27 | 27 00 | ' | ' | ' | ' |
| 6 | Ÿ | U+009F | C2 9F | 9F 00 | Ÿ | Ÿ | Ÿ | Ÿ |
| 7 |   | U+00A0 | C2 A0 | A0 00 |   |   |   | |
| 8 | ÿ | U+00FF | C3 BF | FF 00 | ÿ | ÿ | ÿ | ÿ |
| 9 | ā | U+0101 | C4 81 | 01 01 | ā | ā | ā | ā |
| 10 | ~ | U+007E | 7E | 7E 00 | ~ | ~ | ~ | ~ |
| 11 | | U+007F | 7F | 7F 00 | | | | |
| 12 | £ | U+00A3 | C2 A3 | A3 00 | £ | £ | £ | £ |
| 13 | ÿ | U+00FF | C3 BF | FF 00 | ÿ | ÿ | ÿ | ÿ |
| 14 | Ḃ | U+1E02 | E1 B8 82 | 02 1E | Ḃ | Ḃ | Ḃ | Ḃ |
| 15 | 💩 | U+1F4A9 | F0 9F 92 A9 | 3D D8 A9 DC | 💩 | 💩 | 💩 | 💩 |
| 16 | 𣎴 | U+233B4 | F0 A3 8E B4 | 4C D8 B4 DF | 𣎴 | 𣎴 | 𣎴 | 𣎴 |
| 17 | 𣎴 | U+233B4 | F0 A3 8E B4 | 4C D8 B4 DF | 𣎴 | 𣎴 | 𣎴 | 𣎴 |
| 18 | لك أن كلا | U+0644 U+0643 U+0020 U+0623 U+0646 U+0020 U+0643 U+0644 U+0627 | D9 84 D9 83 20 D8 A3 D9 86 20 D9 83 D9 84 D8 A7 | 44 06 43 06 20 00 23 06 46 06 20 00 43 06 44 06 27 06 | لك أن كلا | لك أن كلا | لك أن كلا | لك أن كلا |
Legacy `HtmlEncode` methods:

This table is included only for computer-archeological reasons. **It does not apply to .NET Framework 4.8, nor to any version of ASP.NET Core or ASP.NET-for-.NET 5 or later.**
| # | Input | Code-point(s) | UTF-8 bytes | UTF-16 bytes | System.Web.HttpUtility.HtmlEncode (ASP.NET 1.1 and 2.0) | Microsoft.Security.Application.Encoder.HtmlEncode(false) | Microsoft.Security.Application.Encoder.HtmlEncode(true) |
|---|---|---|---|---|---|---|---|
| 0 | abc | U+0061 U+0062 U+0063 | 61 62 63 | 61 00 62 00 63 00 | abc | abc | abc |
| 1 | < | U+003C | 3C | 3C 00 | < | < | < |
| 2 | > | U+003E | 3E | 3E 00 | > | > | > |
| 3 | & | U+0026 | 26 | 26 00 | & | & | & |
| 4 | " | U+0022 | 22 | 22 00 | " | " | " |
| 5 | ' | U+0027 | 27 | 27 00 | ' | ' | ' |
| 6 | Ÿ | U+009F | C2 9F | 9F 00 | Ÿ | Ÿ | Ÿ |
| 7 |   | U+00A0 | C2 A0 | A0 00 |   |   | |
| 8 | ÿ | U+00FF | C3 BF | FF 00 | ÿ | ÿ | ÿ |
| 9 | ā | U+0101 | C4 81 | 01 01 | ā | ā | ā |
| 10 | ~ | U+007E | 7E | 7E 00 | ~ | ~ | ~ |
| 11 | | U+007F | 7F | 7F 00 | | | |
| 12 | £ | U+00A3 | C2 A3 | A3 00 | £ | £ | £ |
| 13 | ÿ | U+00FF | C3 BF | FF 00 | ÿ | ÿ | ÿ |
| 14 | Ḃ | U+1E02 | E1 B8 82 | 02 1E | Ḃ | Ḃ | Ḃ |
| 15 | 💩 | U+1F4A9 | F0 9F 92 A9 | 3D D8 A9 DC | 💩 | 💩 | 💩 |
| 16 | 𣎴 | U+233B4 | F0 A3 8E B4 | 4C D8 B4 DF | 𣎴 | 𣎴 | 𣎴 |
| 17 | 𣎴 | U+233B4 | F0 A3 8E B4 | 4C D8 B4 DF | 𣎴 | 𣎴 | 𣎴 |
| 18 | لك أن كلا | U+0644 U+0643 U+0020 U+0623 U+0646 U+0020 U+0643 U+0644 U+0627 | D9 84 D9 83 20 D8 A3 D9 86 20 D9 83 D9 84 D8 A7 | 44 06 43 06 20 00 23 06 46 06 20 00 43 06 44 06 27 06 | لك أن كلا | لك أن كلا | لك أن كلا |
`HtmlEncode` methods in .NET 5

| # | Input | Code-point / Runes | UTF-8 bytes | UTF-16 bytes | System.Net.WebUtility.HtmlEncode | System.Web.HttpUtility.HtmlEncode (.NET 5) | System.Text.Encodings.Web.HtmlEncoder |
|---|---|---|---|---|---|---|---|
| 0 | abc | 97 98 99 | 61 62 63 | 61 00 62 00 63 00 | abc | abc | abc |
| 1 | < | 60 | 3C | 3C 00 | < | < | < |
| 2 | > | 62 | 3E | 3E 00 | > | > | > |
| 3 | & | 38 | 26 | 26 00 | & | & | & |
| 4 | " | 34 | 22 | 22 00 | " | " | " |
| 5 | ' | 39 | 27 | 27 00 | ' | ' | ' |
| 6 | Ÿ | 159 | C2 9F | 9F 00 | Ÿ | Ÿ | Ÿ |
| 7 |   | 160 | C2 A0 | A0 00 |   |   |   |
| 8 | ÿ | 255 | C3 BF | FF 00 | ÿ | ÿ | ÿ |
| 9 | ā | 257 | C4 81 | 01 01 | ā | ā | ā |
| 10 | ~ | 126 | 7E | 7E 00 | ~ | ~ | ~ |
| 11 | | 127 | 7F | 7F 00 | | | |
| 12 | £ | 163 | C2 A3 | A3 00 | £ | £ | £ |
| 13 | ÿ | 255 | C3 BF | FF 00 | ÿ | ÿ | ÿ |
| 14 | Ḃ | 7682 | E1 B8 82 | 02 1E | Ḃ | Ḃ | Ḃ |
| 15 | 💩 | 128169 | F0 9F 92 A9 | 3D D8 A9 DC | 💩 | 💩 | 💩 |
| 16 | 𣎴 | 144308 | F0 A3 8E B4 | 4C D8 B4 DF | 𣎴 | 𣎴 | 𣎴 |
| 17 | 𣎴 | 144308 | F0 A3 8E B4 | 4C D8 B4 DF | 𣎴 | 𣎴 | 𣎴 |
| 18 | لك أن كلا | 1604 1603 32 1571 1606 32 1603 1604 1575 | D9 84 D9 83 20 D8 A3 D9 86 20 D9 83 D9 84 D8 A7 | 44 06 43 06 20 00 23 06 46 06 20 00 43 06 44 06 27 06 | لك أن كلا | لك أن كلا | لك أن كلا |
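If you want to regenerate rows like these on your own runtime, something along these lines should do it. This is a sketch assuming .NET Core 3.0 or later (for `string.EnumerateRunes`); `EncodingTableRow` and its helper names are illustrative, and only two of the encoder columns are included:

```csharp
using System;
using System.Linq;
using System.Net;
using System.Text;
using System.Text.Encodings.Web;

public static class EncodingTableRow
{
    // Space-separated uppercase hex for a byte array, e.g. "F0 9F 92 A9".
    public static string Hex(byte[] bytes) =>
        string.Join(" ", bytes.Select(b => b.ToString("X2")));

    // Decimal code points, one per Rune (so a surrogate pair counts once).
    public static string CodePoints(string s) =>
        string.Join(" ", s.EnumerateRunes().Select(r => r.Value));

    // One table row: input, code points, UTF-8 bytes, UTF-16 (little-endian)
    // bytes, then the output of two of the encoders compared above.
    public static string Row(string input) => string.Join(" | ", new[]
    {
        input,
        CodePoints(input),
        Hex(Encoding.UTF8.GetBytes(input)),
        Hex(Encoding.Unicode.GetBytes(input)),
        WebUtility.HtmlEncode(input),
        HtmlEncoder.Default.Encode(input),
    });

    public static void Main()
    {
        foreach (string s in new[] { "abc", "'", "\u00A3", "\U0001F4A9" })
        {
            Console.WriteLine(Row(s));
        }
    }
}
```

Print the rows to a file or terminal rather than an HTML page, otherwise the entities in the encoder columns get decoded again before you can read them.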
Upvotes: 10