Reputation: 335
I am currently reading ECMA-334, as suggested by a friend who programs for a living. I am on the section dealing with unsafe code, but I am a bit confused by what it is describing.
The garbage collector underlying C# might work by moving objects around in memory, but this motion is invisible to most C# developers. For developers who are generally content with automatic memory management but sometimes need fine-grained control or that extra bit of performance, C# provides the ability to write “unsafe” code. Such code can deal directly with pointer types and object addresses; however, C# requires the programmer to fix objects to temporarily prevent the garbage collector from moving them. This “unsafe” code feature is in fact a “safe” feature from the perspective of both developers and users. Unsafe code shall be clearly marked in the code with the modifier unsafe, so developers can't possibly use unsafe language features accidentally, and the compiler and the execution engine work together to ensure that unsafe code cannot masquerade as safe code. These restrictions limit the use of unsafe code to situations in which the code is trusted.
The example
using System;
class Test
{
static void WriteLocations(byte[] arr)
{
unsafe
{
fixed (byte* pArray = arr)
{
byte* pElem = pArray;
for (int i = 0; i < arr.Length; i++)
{
byte value = *pElem;
Console.WriteLine("arr[{0}] at 0x{1:X} is {2}",
i, (uint)pElem, value);
pElem++;
}
}
}
}
static void Main()
{
byte[] arr = new byte[] { 1, 2, 3, 4, 5 };
WriteLocations(arr);
Console.ReadLine();
}
}
shows an unsafe block in a method named WriteLocations that fixes an array instance and uses pointer manipulation to iterate over the elements. The index, value, and location of each array element are written to the console. One possible example of output is:
arr[0] at 0x8E0360 is 1
arr[1] at 0x8E0361 is 2
arr[2] at 0x8E0362 is 3
arr[3] at 0x8E0363 is 4
arr[4] at 0x8E0364 is 5
but, of course, the exact memory locations can be different in different executions of the application.
Why is knowing the exact memory locations of, for example, this array beneficial to us as developers? And could someone explain this idea in a simplified context?
Upvotes: 6
Views: 4674
Reputation: 13224
In general, the exact memory locations within an "unsafe" block are not so relevant.
As explained in Dai's answer, when you are using memory managed by the garbage collector, you need to make sure that the data you are manipulating does not get moved (using "fixed"). You generally use this when
In some cases, you are working with memory that is not managed by the garbage collector. Some examples of such scenarios are:
Upvotes: 2
Reputation: 155270
The fixed language feature is not so much "beneficial" as "absolutely necessary".
Ordinarily a C# user will imagine reference types as being equivalent to single-indirection pointers (e.g. for class Foo, this: Foo foo = new Foo(); is equivalent to this C++: Foo* foo = new Foo();).
In reality, a C# reference behaves more like a handle than a raw pointer: the GC tracks every reference, and when it moves an object it updates all references to point at the new address. The GC not only cleans up unused objects, but also moves objects around in memory to avoid fragmentation.
All this is well and good if you're exclusively using object references in C#. As soon as you use pointers you've got problems, because the GC does not track or update raw pointers, and it could run at any point in time, even during tight-loop execution. While the GC runs, your program's execution is frozen (which is why the CLR and Java are not suitable for hard real-time applications: a GC pause can last a few hundred milliseconds in some cases).
...because of this inherent behaviour (where an object can be moved during code execution), you need to prevent that object from being moved, hence the fixed keyword, which instructs the GC not to move that object.
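An example (illustrative only; the real compiler refuses to take the address of an array element outside a fixed statement, precisely to prevent this bug):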
unsafe void Foo() {
Byte[] safeArray = new Byte[ 50 ];
safeArray[0] = 255;
Byte* p = &safeArray[0];
Console.WriteLine( "Array address: {0}", &safeArray );
Console.WriteLine( "Pointer target: {0}", p );
// These will both print "0x12340000".
while( executeTightLoop() ) {
Console.WriteLine( *p );
// valid pointer dereferencing, will output "255".
}
// Pretend at this point that GC ran right here during execution. The safeArray object has been moved elsewhere in memory.
Console.WriteLine( "Array address: {0}", &safeArray );
Console.WriteLine( "Pointer target: {0}", p );
// These two printed values will differ, demonstrating that p is invalid now.
Console.WriteLine( *p );
// The line above now prints garbage (if the memory has been reused by another allocation) or crashes the program with an Access Violation (if it's in a memory page that has been released).
}
So instead, by applying fixed to the safeArray object, the pointer p stays valid for the duration of the fixed block and will not cause a crash or read garbage data.
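A minimal sketch of the pinned version of the example above. The class and method names (PinnedArrayDemo, ReadFirstPinned) are made up for illustration, the GC.Collect() call only stands in for "a collection happens here", and the code assumes the project is compiled with unsafe code enabled (AllowUnsafeBlocks).

```csharp
using System;

static class PinnedArrayDemo
{
    // Hypothetical helper: reads the first element through a raw pointer
    // while the array is pinned by the fixed statement.
    public static unsafe byte ReadFirstPinned(byte[] safeArray)
    {
        // For a non-empty array, p points at safeArray[0] and the GC
        // will not move the array until the fixed block ends.
        fixed (byte* p = safeArray)
        {
            GC.Collect();   // a collection here cannot relocate the pinned array
            return *p;      // still a valid dereference
        }
    }

    static void Main()
    {
        byte[] safeArray = new byte[50];
        safeArray[0] = 255;
        Console.WriteLine(ReadFirstPinned(safeArray)); // prints 255
    }
}
```

The pointer must not be stored or used after the fixed block ends, because the array becomes movable again at that point.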
Side-note: An alternative to fixed is to use stackalloc, but that limits the object lifetime to the scope of your function.
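As a rough illustration of the stackalloc alternative: the buffer below lives on the stack, so the GC never moves it and no fixed statement is needed, but it disappears when the method returns. StackallocDemo and SumOfSquares are hypothetical names, and the code again assumes unsafe code is enabled for the project.

```csharp
using System;

static class StackallocDemo
{
    // Hypothetical helper: fills a stack-allocated buffer with squares
    // and returns their sum. The buffer is gone once the method returns.
    public static unsafe int SumOfSquares(int count)
    {
        int* buffer = stackalloc int[count];   // stack memory, not GC-managed

        for (int i = 0; i < count; i++)
        {
            buffer[i] = i * i;
        }

        int sum = 0;
        for (int i = 0; i < count; i++)
        {
            sum += buffer[i];
        }
        return sum;
    }

    static void Main()
    {
        Console.WriteLine(SumOfSquares(5)); // 0 + 1 + 4 + 9 + 16 = 30
    }
}
```

Because the memory comes from the stack, this only suits small, short-lived buffers; returning the pointer to the caller would leave it dangling.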
Upvotes: 8
Reputation: 9341
One of the primary reasons I use fixed is for interfacing with native code. Suppose you have a native function with the following signature:
double cblas_ddot(int n, double* x, int incx, double* y, int incy);
You could write an interop wrapper like this:
public static extern double cblas_ddot(int n, [In] double[] x, int incx,
[In] double[] y, int incy);
And write C# code to call it like this:
double[] x = ...
double[] y = ...
cblas_ddot(n, x, 1, y, 1);
But now suppose I wanted to operate on some data in the middle of my array, say starting at x[2] and y[2]. With this signature, there is no way to make the call without copying the array.
double[] x = ...
double[] y = ...
cblas_ddot(n, x[2], 1, y[2], 1); // this wouldn't compile: x[2] is a double, not a double[]
In this case fixed comes to the rescue. We can change the signature of the interop and use fixed from the caller.
public unsafe static extern double cblas_ddot(int n, [In] double* x, int incx,
[In] double* y, int incy);
double[] x = ...
double[] y = ...
fixed (double* pX = x, pY = y)
{
cblas_ddot(n, pX + 2, 1, pY + 2, 1);
}
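To try the pointer-offset idea without a native BLAS library, here is a small sketch in which a managed method named Dot stands in for cblas_ddot. Dot, OffsetDotDemo and DotFromIndex2 are all hypothetical names, not part of any library, and the code assumes unsafe code is enabled.

```csharp
using System;

static class OffsetDotDemo
{
    // Managed stand-in for the native cblas_ddot (an assumption for this
    // sketch): strided dot product over n elements.
    static unsafe double Dot(int n, double* x, int incx, double* y, int incy)
    {
        double sum = 0;
        for (int i = 0; i < n; i++)
        {
            sum += x[i * incx] * y[i * incy];
        }
        return sum;
    }

    // Hypothetical helper: dot product of the two arrays starting at index 2,
    // computed without copying either array.
    public static unsafe double DotFromIndex2(double[] x, double[] y)
    {
        int n = Math.Min(x.Length, y.Length) - 2;
        fixed (double* pX = x, pY = y)
        {
            // pX + 2 is the address of x[2]; pY + 2 is the address of y[2].
            return Dot(n, pX + 2, 1, pY + 2, 1);
        }
    }

    static void Main()
    {
        double[] x = { 1, 2, 3, 4 };
        double[] y = { 10, 20, 30, 40 };
        Console.WriteLine(DotFromIndex2(x, y)); // 3*30 + 4*40 = 250
    }
}
```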
I've also used fixed in rare cases where I need fast loops over arrays and needed to ensure the .NET array bounds checking was not happening.
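A sketch of what such a loop might look like, assuming unsafe code is enabled; FastLoopDemo and Sum are illustrative names. Whether this actually beats a plain for loop depends on the runtime, since the JIT can often remove bounds checks by itself, so it is worth measuring first.

```csharp
using System;

static class FastLoopDemo
{
    // Hypothetical helper: sums an array by walking a raw pointer,
    // so no array bounds checks are involved in the loop body.
    public static unsafe double Sum(double[] values)
    {
        double total = 0;
        fixed (double* p = values)
        {
            double* end = p + values.Length;   // one past the last element
            for (double* cur = p; cur < end; cur++)
            {
                total += *cur;
            }
        }
        return total;
    }

    static void Main()
    {
        Console.WriteLine(Sum(new double[] { 1.5, 2.5, 3.0 })); // prints 7
    }
}
```

The trade-off is that nothing stops the pointer from running past the end of the array, so the loop bounds have to be right.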
Upvotes: 2