Esen

Reputation: 973

UTF-8 string variable in C#

I am using PostgreSQL to power a C# desktop application. When I use the pgAdmin query analyzer to update a text column with a special character (like the copyright or trademark symbol), it works perfectly:

update table1 set column1='value with special character ©' where column2=1

When I use this same query from my C# application, it throws an error:

invalid byte sequence for encoding

After researching this issue, I understand that .NET strings use the UTF-16 Unicode encoding.

Consider:

string sourcetext = "value with special character ©";
// Convert a string to utf-8 bytes.
byte[] utf8Bytes = System.Text.Encoding.UTF8.GetBytes(sourcetext);

// Convert utf-8 bytes to a string. 
string desttext = System.Text.Encoding.UTF8.GetString(utf8Bytes);

The problem here is that both sourcetext and desttext are UTF-16 strings in memory; the UTF-8 bytes are decoded straight back to UTF-16. When I pass desttext, I still get the exception.
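To make that concrete, here is a minimal sketch using only the base class library. The class name Utf8RoundTrip is made up for illustration. It shows that the UTF-8 form only exists as a byte[], and that decoding those bytes gives back a string equal to the original:

```csharp
using System;
using System.Text;

// Hypothetical demo class; not part of the original application.
class Utf8RoundTrip
{
    static void Main()
    {
        // \u00A9 is the copyright sign; the escape avoids depending on
        // the encoding of the source file itself.
        string sourcetext = "value with special character \u00A9";

        // The UTF-8 representation lives in a byte array, not in a string.
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(sourcetext);

        // For comparison, the same text encoded as UTF-16 (little endian).
        byte[] utf16Bytes = Encoding.Unicode.GetBytes(sourcetext);

        // Decoding produces an ordinary .NET string again.
        string desttext = Encoding.UTF8.GetString(utf8Bytes);

        Console.WriteLine(sourcetext.Length);      // 30 UTF-16 code units
        Console.WriteLine(utf8Bytes.Length);       // 31 (the sign takes 2 bytes)
        Console.WriteLine(utf16Bytes.Length);      // 60
        Console.WriteLine(sourcetext == desttext); // True
    }
}
```

Since the round trip returns the same string, any conversion to the server's encoding has to happen when the text is written to the connection, which is the driver's job rather than something a string variable can carry.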

I've also tried the following without success:

Encoder.GetString, BitConverter.GetString

Edit: I even tried this, and it doesn't help:

unsafe
{
  String utfeightstring = null;
  string sourcetext = "value with special character ©";
  Console.WriteLine(sourcetext);
  // Convert a string to utf-8 bytes. 
  sbyte[] utf8Chars = (sbyte[]) (Array) System.Text.Encoding.UTF8.GetBytes(sourcetext); 
  UTF8Encoding encoding = new UTF8Encoding(true, true);

  // Instruct the Garbage Collector not to move the memory
  fixed (sbyte* pUtf8Chars = utf8Chars)
  {
    utfeightstring = new String(pUtf8Chars, 0, utf8Chars.Length, encoding);
  }
  Console.WriteLine("The UTF8 String is " + utfeightstring); 
}

Is there a data type in .NET that supports storing a UTF-8 encoded string? Are there alternative ways to handle this situation?

Upvotes: 0

Views: 4086

Answers (3)

user2250475

Reputation: 3

Just add "...... ;Unicode=true" to the end of your connection string.

Upvotes: -1

Peter M

Reputation: 7503

According to the Mono project's PostgreSQL page, if you get errors with UTF-8 strings you can set the encoding to Unicode in the connection string (if you are using the Npgsql driver):

Encoding: Encoding to be used. Possible values: ASCII(default) and UNICODE. Use UNICODE if you are getting problems with UTF-8 values: Encoding=UNICODE

I have also looked in the official Npgsql docs (NpgsqlConnection.ConnectionString), and this option isn't mentioned there.
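As a rough sketch of what that looks like in code, the setting is just one more key/value pair appended to the connection string. The server, database, and credentials below are placeholders, and the Encoding=UNICODE key is taken from the Mono page quoted above. I have not verified that every Npgsql version accepts it, so treat it as an assumption to check against the driver version you use:

```csharp
using System;

// Hypothetical demo class; only builds and prints the string.
class ConnectionStringSketch
{
    static void Main()
    {
        // Placeholder connection details, not from the question.
        string baseConnectionString =
            "Server=localhost;Port=5432;Database=mydb;User Id=myuser;Password=secret";

        // Append the encoding option described on the Mono PostgreSQL page.
        string connectionString = baseConnectionString + ";Encoding=UNICODE";

        // This is the string that would be handed to the driver's connection class.
        Console.WriteLine(connectionString);
    }
}
```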

Upvotes: 5

gyluo

Reputation: 1

I think this may not be caused by UTF-8 or UTF-16; it may be caused by the special character itself. You can replace the character with an entity such as '&amp;'.

Upvotes: -1
