Reputation: 55
string str = "our guests will experience \u001favor in an area";
bool exists = str.IndexOf("\u001", StringComparison.CurrentCultureIgnoreCase) > -1;
I want to find and replace the character \u001 in a string. I have tried hard to resolve this but am still stuck.
Please help me resolve this issue. Thanks in advance for your help.
Upvotes: 0
Views: 2734
Reputation: 505
Use a regular expression:
var unicodeRegexp = new Regex(@"\x1f");
var testWord = "our guests will experience \u001favor in an area";
var newWord = unicodeRegexp.Replace(testWord, "text for replacement");
\x1f matches the same character as \u001f. In a .NET regular expression, \x takes exactly two hex digits, so the leading zeros are dropped. See https://www.regular-expressions.info/unicode.html#codepoint
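If a regular expression feels heavy for a single character, the same find-and-replace can be sketched with plain string methods. This is only a minimal sketch: the class and method names (ControlCharDemo, ContainsUnitSeparator, ReplaceUnitSeparator) are made up for illustration, and it assumes the character you are after really is U+001F.

```csharp
using System;

public static class ControlCharDemo
{
    // U+001F written with the full four-digit escape (hypothetical constant name).
    public const char UnitSeparator = '\u001f';

    public static bool ContainsUnitSeparator(string text)
    {
        // IndexOf(char) is an ordinal search, so culture settings do not matter.
        return text.IndexOf(UnitSeparator) >= 0;
    }

    public static string ReplaceUnitSeparator(string text, string replacement)
    {
        // string.Replace(string, string) also searches ordinally.
        return text.Replace(UnitSeparator.ToString(), replacement);
    }

    public static void Main()
    {
        string str = "our guests will experience \u001favor in an area";
        Console.WriteLine(ContainsUnitSeparator(str));     // True
        Console.WriteLine(ReplaceUnitSeparator(str, "f")); // our guests will experience favor in an area
    }
}
```

Replacing with "f" works here because the compiler already consumed the original f as the last hex digit of the escape.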
Upvotes: 0
Reputation: 25047
If we look at the C# language specification, ECMA-334, in section 7.4.2 "Unicode character escape sequences", we find
A Unicode escape sequence represents a Unicode code point. Unicode escape sequences are processed in identifiers (§7.4.3), character literals (§7.4.5.5), and regular string literals (§7.4.5.6). A Unicode escape sequence is not processed in any other location (for example, to form an operator, punctuator, or keyword).
unicode-escape-sequence:: \u hex-digit hex-digit hex-digit hex-digit
\U hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit
So you have to use exactly four hex digits with \u.
In your example, it takes "001f" as those four hex digits.
The "\u001" in your example should have given an error in Visual Studio along the lines of "Unrecognized escape sequence."
Upvotes: 0
Reputation: 9143
Somewhere deep inside the C# specification, you can find the following:
[Note: The use of the \x hexadecimal-escape-sequence production can be error-prone and hard to read due to the variable number of hexadecimal digits following the \x. For example, in the code:
string good = "\x9Good text";
string bad = "\x9Bad text";
it might appear at first that the leading character is the same (U+0009, a tab character) in both strings. In fact the second string starts with U+9BAD as all three letters in the word "Bad" are valid hexadecimal digits. As a matter of style, it is recommended that \x is avoided in favour of either specific escape sequences (\t in this example) or the fixed-length \u escape sequence. end note]
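The two strings from that note can be checked directly. The following is a small sketch; HexEscapeDemo and FirstCodeUnit are hypothetical names used only for this illustration.

```csharp
using System;

public static class HexEscapeDemo
{
    // Returns the first UTF-16 code unit of the string as a number.
    public static int FirstCodeUnit(string s) => s[0];

    public static void Main()
    {
        string good = "\x9Good text"; // 'G' is not a hex digit, so \x9 stops after one digit
        string bad = "\x9Bad text";   // 'B', 'a', 'd' are hex digits, so \x9Bad is one character

        Console.WriteLine($"{FirstCodeUnit(good):X4} {good.Length}"); // 0009 10
        Console.WriteLine($"{FirstCodeUnit(bad):X4} {bad.Length}");   // 9BAD 6
    }
}
```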
And also:
unicode-escape-sequence::
\u hex-digit hex-digit hex-digit hex-digit
\U hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit hex-digit
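To simplify further: \u is followed by exactly 4 hex digits and \U by exactly 8, never 3. The f in your string is consumed as the fourth hex digit, so the string is interpreted as "our guests will experience ", then the single character \u001f, then "avor in an area".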
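You can confirm this interpretation by inspecting the string at run time. This is a sketch only; EscapeParsingDemo and its members are hypothetical names, and the sample string is the one from the question.

```csharp
using System;

public static class EscapeParsingDemo
{
    public const string Sample = "our guests will experience \u001favor in an area";

    // Position of U+001F in the sample, or -1 if it is absent.
    public static int ControlCharIndex() => Sample.IndexOf('\u001f');

    public static void Main()
    {
        int i = ControlCharIndex();
        Console.WriteLine(i);                        // 27
        Console.WriteLine(Sample[i + 1]);            // a
        Console.WriteLine(Sample.Contains("favor")); // False: the f became part of the escape
        Console.WriteLine(Sample.Contains("avor"));  // True
    }
}
```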
Upvotes: 2