Bananach

Reputation: 2311

How does a Span survive garbage collection?

I'm fairly sure that creating a Span from an array doesn't pin the underlying array (in the sense of the fixed keyword); otherwise there would be no need for Span.GetPinnableReference. Since a Span can live across an unbounded number of allocations and collections of other objects, pinning would likely be a bad performance choice too.

How, then, does a Span survive garbage collection? My mental model of a Span was that it's two integers (address and length) with some lifetime constraints that help the runtime guarantee they are used safely (in the sense of the managed runtime). This can't be right, because the address integer would be useless after a garbage collection moves the array. This in turn makes me believe that the Span does, in some deeply hidden sense, hold a reference to the original array, even though that contradicts everything you read about Spans.

What am I missing here?

Upvotes: 7

Views: 1168

Answers (1)

PMF

Reputation: 17288

Span<T> relies on quite a bit of runtime magic to work correctly. But logically, it does in fact just contain a reference to the original array (or stack memory, or unmanaged memory). That reference is simply hidden deep in the runtime.

The fields of Span<T> are:

    public readonly ref struct Span<T>
    {
        /// <summary>A byref or a native ptr.</summary>
        internal readonly ByReference<T> _pointer;
        /// <summary>The number of elements this Span contains.</summary>
        private readonly int _length;
    }

The magic part here is ByReference<T>, an internal runtime struct with no visible implementation but plenty of documentation about its magic:

    // ByReference<T> is meant to be used to represent "ref T" fields. It is working
    // around lack of first class support for byref fields in C# and IL. The JIT and
    // type loader has special handling for it that turns it into a thin wrapper around ref T.
    internal readonly ref struct ByReference<T>
    {
        private readonly IntPtr _value;

        [Intrinsic]
        public ByReference(ref T value)
        {
            // Implemented as a JIT intrinsic - This default implementation is for
            // completeness and to provide a concrete error if called via reflection
            // or if intrinsic is missed.
            throw new PlatformNotSupportedException();
        }

        public ref T Value
        {
            // Implemented as a JIT intrinsic - This default implementation is for
            // completeness and to provide a concrete error if called via reflection
            // or if the intrinsic is missed.
            [Intrinsic]
            get => throw new PlatformNotSupportedException();
        }
    }

The actual implementation can't be written directly in C# (as of C# 10), but when T is a managed type it would be something as simple as this:

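    internal readonly ref struct ByReference<T>
    {
        private readonly ref T _value; // A reference to a variable of type T

        public ByReference(ref T value)
        {
            // Store the reference itself, not a copy of the referenced value
            _value = ref value;
        }

        public ref T Value
        {
            // Hand the stored reference back to the caller
            get => ref _value;
        }
    }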

Of course, extra care is needed when the referenced object would otherwise be eligible for garbage collection. But since the runtime is involved, it can track the reference: it knows that _value is actually a pointer into a managed object, so it keeps that object alive and updates the pointer when the object moves, much like it would for an "ordinary" managed reference. Nothing needs to be done if the reference points to the stack, since stack memory doesn't move.
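To make the tracking visible from user code, here is a minimal sketch. The class and method names (SpanGcDemo, SpanStillAliasesArray) are made up for illustration. It takes a span over a heap array, requests a compacting collection, and then checks that the span and the array still refer to the same memory. Whether the array actually moves is up to the GC; the point is only that the span stays valid either way.

```csharp
using System;

public static class SpanGcDemo
{
    // Hypothetical helper: returns true if the span still aliases the
    // array after a forced, compacting garbage collection.
    public static bool SpanStillAliasesArray()
    {
        int[] data = { 1, 2, 3 };
        Span<int> span = data.AsSpan();

        // Create some short-lived garbage so the collector has work to do.
        for (int i = 0; i < 10_000; i++)
        {
            _ = new byte[128];
        }

        // Request a blocking, compacting collection. The array is not pinned,
        // so the GC is free to relocate it (it may or may not do so).
        GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced, blocking: true, compacting: true);

        // Write through the span, read through the array, and vice versa.
        span[0] = 10;
        data[2] = 30;

        return data[0] == 10 && span[2] == 30 && span.Length == 3;
    }
}
```

Calling SpanGcDemo.SpanStillAliasesArray() should return true: no fixed statement is involved, and the span can still be used after the collection.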

Using Span<T> can improve performance, but it is possible to write code that fools the runtime and produces dangling references, even when everything appears to be plain managed code without any explicit unsafe code. See here for an instance of such a problem.

Upvotes: 5
