Callum Watkins

Reputation: 2991

C# convert SecureString to UTF-8 byte[] securely

I'm trying to get a SecureString into a UTF-8 encoded byte[] that I can keep GC-pinned. I've managed this with UTF-16 (the default encoding), but I can't figure out how to convert the encoding without risking the GC creating a managed copy of the data somewhere (the data needs to be kept secure).

Here's what I have so far (context: an algorithm to calculate the hash of a SecureString):

public static byte[] Hash(this SecureString secureString, HashAlgorithm hashAlgorithm)
{
  IntPtr bstr = Marshal.SecureStringToBSTR(secureString);
  int length = Marshal.ReadInt32(bstr, -4);
  var utf16Bytes = new byte[length];
  GCHandle utf16BytesPin = GCHandle.Alloc(utf16Bytes, GCHandleType.Pinned);
  byte[] utf8Bytes = null;

  try
  {
    Marshal.Copy(bstr, utf16Bytes, 0, length);
    Marshal.ZeroFreeBSTR(bstr);
    // At this point I have the UTF-16 byte[] perfectly.
    // The next line works at converting the encoding, but it does nothing
    // to protect the data from being spread throughout memory.
    utf8Bytes = Encoding.Convert(Encoding.Unicode, Encoding.UTF8, utf16Bytes);
    return hashAlgorithm.ComputeHash(utf8Bytes);
  }
  finally
  {
    if (utf8Bytes != null)
    {
      for (var i = 0; i < utf8Bytes.Length; i++)
      { 
        utf8Bytes[i] = 0;
      }
    }
    for (var i = 0; i < utf16Bytes.Length; i++)
    { 
      utf16Bytes[i] = 0;
    }
    utf16BytesPin.Free();
  }
}

What's the best way to do this conversion? Am I doing it in the right place, or should I do it earlier somehow? Could this be made more memory-efficient by skipping the UTF-16 byte[] step entirely?

Upvotes: 2

Views: 1480

Answers (2)

Callum Watkins

Reputation: 2991

I've found a way to do this the way I wanted. The code I have here isn't finished (needs better exception handling and memory management in the case of failure), but here it is:

[DllImport("kernel32.dll")]
static extern void RtlZeroMemory(IntPtr dst, int length);

public unsafe static byte[] HashNew(this SecureString secureString, HashAlgorithm hashAlgorithm)
{
  IntPtr bstr = Marshal.SecureStringToBSTR(secureString);
  int maxUtf8BytesCount = Encoding.UTF8.GetMaxByteCount(secureString.Length);
  IntPtr utf8Buffer = Marshal.AllocHGlobal(maxUtf8BytesCount);

  // Here's the magic:
  char* utf16CharsPtr = (char*)bstr.ToPointer();
  byte* utf8BytesPtr  = (byte*)utf8Buffer.ToPointer();
  int utf8BytesCount = Encoding.UTF8.GetBytes(utf16CharsPtr, secureString.Length, utf8BytesPtr, maxUtf8BytesCount);

  Marshal.ZeroFreeBSTR(bstr);
  var utf8Bytes = new byte[utf8BytesCount];
  GCHandle utf8BytesPin = GCHandle.Alloc(utf8Bytes, GCHandleType.Pinned);
  Marshal.Copy(utf8Buffer, utf8Bytes, 0, utf8BytesCount);
  RtlZeroMemory(utf8Buffer, utf8BytesCount);
  Marshal.FreeHGlobal(utf8Buffer);
  try
  {
    return hashAlgorithm.ComputeHash(utf8Bytes);
  }
  finally
  {
    for (int i = 0; i < utf8Bytes.Length; i++)
    {
      utf8Bytes[i] = 0;
    }
    utf8BytesPin.Free();
  }
}

It relies on obtaining pointers to both the original UTF-16 string and a UTF-8 buffer, then using Encoding.UTF8.GetBytes(Char*, Int32, Byte*, Int32) to keep the conversion within unmanaged memory.
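As a rough sketch of the missing failure handling, the same approach could be wrapped in a single try/finally so that every buffer is zeroed and released even if an exception is thrown partway through. The class name SecureStringHashSketch and the method name HashUtf8 are hypothetical, the unmanaged buffer is zeroed with Marshal.WriteByte instead of the RtlZeroMemory import, and the code assumes the project is compiled with unsafe code enabled:

```csharp
using System;
using System.Runtime.InteropServices;
using System.Security;
using System.Security.Cryptography;
using System.Text;

public static class SecureStringHashSketch // hypothetical name
{
    // Sketch only: same technique as above, with cleanup moved into one finally block.
    public static unsafe byte[] HashUtf8(this SecureString secureString, HashAlgorithm hashAlgorithm)
    {
        int maxUtf8BytesCount = Encoding.UTF8.GetMaxByteCount(secureString.Length);
        IntPtr bstr = IntPtr.Zero;
        IntPtr utf8Buffer = IntPtr.Zero;
        byte[] utf8Bytes = null;
        GCHandle utf8BytesPin = default(GCHandle);

        try
        {
            bstr = Marshal.SecureStringToBSTR(secureString);
            utf8Buffer = Marshal.AllocHGlobal(maxUtf8BytesCount);

            // Convert UTF-16 to UTF-8 entirely within unmanaged memory.
            int utf8BytesCount = Encoding.UTF8.GetBytes(
                (char*)bstr.ToPointer(), secureString.Length,
                (byte*)utf8Buffer.ToPointer(), maxUtf8BytesCount);

            // Pin the managed array before any secret data is copied into it.
            utf8Bytes = new byte[utf8BytesCount];
            utf8BytesPin = GCHandle.Alloc(utf8Bytes, GCHandleType.Pinned);
            Marshal.Copy(utf8Buffer, utf8Bytes, 0, utf8BytesCount);

            return hashAlgorithm.ComputeHash(utf8Bytes);
        }
        finally
        {
            // Zero and release everything that was actually allocated.
            if (utf8Bytes != null) Array.Clear(utf8Bytes, 0, utf8Bytes.Length);
            if (utf8BytesPin.IsAllocated) utf8BytesPin.Free();
            if (utf8Buffer != IntPtr.Zero)
            {
                for (int i = 0; i < maxUtf8BytesCount; i++)
                {
                    Marshal.WriteByte(utf8Buffer, i, 0);
                }
                Marshal.FreeHGlobal(utf8Buffer);
            }
            if (bstr != IntPtr.Zero) Marshal.ZeroFreeBSTR(bstr);
        }
    }
}
```

This does not address whatever copies the HashAlgorithm implementation itself may make internally while computing the hash.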

Upvotes: 3

Leonardo Trocato

Reputation: 74

Have you considered calling GC.Collect() after obtaining the hash?

According to the MSDN documentation on GC.Collect:

Forces an immediate garbage collection of all generations. Use this method to try to reclaim all memory that is inaccessible. It performs a blocking garbage collection of all generations.

All objects, regardless of how long they have been in memory, are considered for collection; however, objects that are referenced in managed code are not collected. Use this method to force the system to try to reclaim the maximum amount of available memory.

From what I see in your code, it shouldn't keep any references to the objects used in the conversion. It all should be collected and disposed by the GC.

Upvotes: 0
