Reputation: 5678
I have some resource-sensitive things to write. I'm wondering if grouping variables together in structs will actually cause memory overhead, compared to just passing those variables together (e.g. as a function parameter).
If so, what's a good way to go about creating something that operates on lazy values without incurring that overhead? For example, right now I have an IEnumerable<Foo>, where Foo is a struct with a few members. I assume that's identical to an IEnumerable<Tuple<[members]>>, but if creating structs incurs extra overhead, how can I avoid that? Would calling a callback with [members] be less overhead?
Thanks!
Upvotes: 1
Views: 1218
Reputation: 116526
A Tuple is a generic class, that is, a reference type. Thus, when you call Tuple.Create(a, b, c, ...) you are allocating and instantiating an object on the managed heap. Having done so, you can pass around a managed reference to this object; the reference itself is small: 4 bytes in a 32-bit process or 8 bytes in a 64-bit process. This makes passing it to methods quick, but instantiation may be slow. It can also put pressure on the garbage collector, possibly a lot of pressure if you are iterating through huge collections of data.
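To make the reference-type behaviour concrete, here is a minimal sketch. The class and method names are hypothetical and chosen only for illustration; the calls themselves (Tuple.Create, Type.IsValueType, object.ReferenceEquals) are standard .NET.

```csharp
using System;

public static class TupleAllocationDemo
{
    // System.Tuple<...> is a class, so IsValueType is false.
    public static bool TupleIsReferenceType()
    {
        return !typeof(Tuple<int, int, int>).IsValueType;
    }

    // Assigning a tuple to another variable copies only the reference;
    // both variables point at the same heap object.
    public static bool AssignmentSharesInstance()
    {
        Tuple<int, int, int> original = Tuple.Create(1, 2, 3); // one heap allocation
        Tuple<int, int, int> alias = original;                 // no new allocation
        return object.ReferenceEquals(original, alias);
    }
}
```

Both methods return true: every Tuple.Create call produces a new heap object, but handing that object to other code only copies a pointer-sized reference.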
A struct is a value type. Structs are always embedded in some larger context: either the stack or some class. They are passed around as copies unless boxed, at which point they are copied onto the heap. Luckily, when working with generic enumerables of structs or other value types, the values are not boxed; this was one of the fundamental design requirements of generics in C#. Thus enumerating over structs results in very little memory pressure but may incur additional costs in copying values around.
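A small sketch of lazily enumerating structs follows. The Foo here is a hypothetical stand-in for the question's struct (its fields Id and Value are assumptions), and the helper names are illustrative only.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the question's Foo; the field names are assumptions.
public struct Foo
{
    public int Id;
    public double Value;

    public Foo(int id, double value)
    {
        Id = id;
        Value = value;
    }
}

public static class StructEnumerationDemo
{
    // Lazily yields Foo values. The Foo values are not boxed, because
    // IEnumerable<Foo> is instantiated for the value type itself.
    public static IEnumerable<Foo> Generate(int count)
    {
        for (int i = 0; i < count; i++)
        {
            yield return new Foo(i, i * 0.5);
        }
    }

    public static double SumValues(IEnumerable<Foo> items)
    {
        double total = 0;
        foreach (Foo item in items) // each item is a copy of the yielded value
        {
            total += item.Value;
        }
        return total;
    }

    // Value semantics: changing the copy leaves the original untouched.
    public static bool AssignmentCopies()
    {
        Foo a = new Foo(1, 1.0);
        Foo b = a;       // copies both fields
        b.Value = 2.0;
        return a.Value == 1.0;
    }
}
```

Note that the iterator object created by Generate is itself one heap allocation per enumeration; what is avoided is a separate allocation for each element.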
As for the time to construct a struct, there is no reason that it should take any longer than the time required to construct an otherwise-equivalent class. My observation is that some structs, such as Color, can be slow to construct, but that's because there is a lot of code in the constructor or factory method, not because a struct is intrinsically slow. (Though see this article, which points out a minor performance issue that can arise when using structs that have read-only fields.)
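One well-known interaction between structs and readonly is the defensive copy: when a non-readonly struct is stored in a readonly field, calling a method on it operates on a temporary copy. Whether this is exactly the issue the linked article describes is an assumption on my part. The sketch below, with hypothetical type names, shows the copy being made:

```csharp
using System;

// Hypothetical mutable struct used only to make the hidden copy observable.
public struct Counter
{
    public int Count;

    public void Increment()
    {
        Count++;
    }
}

public class CounterHolder
{
    private readonly Counter _readOnlyCounter = new Counter();
    private Counter _mutableCounter = new Counter();

    // Increment() runs on a temporary copy of the readonly field,
    // so the field itself is unchanged and this returns 0.
    public int IncrementReadOnly()
    {
        _readOnlyCounter.Increment();
        return _readOnlyCounter.Count;
    }

    // Increment() runs on the field in place, so this returns 1.
    public int IncrementMutable()
    {
        _mutableCounter.Increment();
        return _mutableCounter.Count;
    }
}
```

For a large struct that copy costs time on every call, which is why it can show up as a minor performance issue.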
So, in deciding whether to iterate over structs or Tuples, there's a tradeoff between memory pressure and construction time (probably higher for Tuple) and the cost of calling and memory copying (higher for struct). My experience is that for enormous collections of small objects (pairs of pointers, say) iterating using a struct is faster, but differing details in your software may produce differing results. The only way to determine which cost is lower is to actually measure, with a profiler.
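As a rough first look before reaching for a profiler, something like the following sketch could compare the two approaches. All type and method names are hypothetical; it uses only Stopwatch and GC.CollectionCount, and a proper profiler or a benchmarking harness such as BenchmarkDotNet would give far more trustworthy numbers.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical pair type standing in for a small struct.
public struct PairStruct
{
    public int A;
    public int B;

    public PairStruct(int a, int b)
    {
        A = a;
        B = b;
    }
}

public static class IterationTiming
{
    public static IEnumerable<PairStruct> Structs(int n)
    {
        for (int i = 0; i < n; i++) yield return new PairStruct(i, i + 1);
    }

    public static IEnumerable<Tuple<int, int>> Tuples(int n)
    {
        for (int i = 0; i < n; i++) yield return Tuple.Create(i, i + 1); // heap allocation per item
    }

    public static long SumStructs(int n)
    {
        long sum = 0;
        foreach (PairStruct p in Structs(n)) sum += p.A + p.B;
        return sum;
    }

    public static long SumTuples(int n)
    {
        long sum = 0;
        foreach (Tuple<int, int> t in Tuples(n)) sum += t.Item1 + t.Item2;
        return sum;
    }

    // Prints elapsed time and the number of gen-0 collections for one run.
    public static void Measure(string label, Func<int, long> body, int n)
    {
        body(1000);   // warm-up so JIT compilation is not timed
        GC.Collect(); // start from a comparable heap state
        int gen0Before = GC.CollectionCount(0);
        Stopwatch sw = Stopwatch.StartNew();
        long checksum = body(n);
        sw.Stop();
        int gen0 = GC.CollectionCount(0) - gen0Before;
        Console.WriteLine("{0}: {1} ms, {2} gen-0 GCs, checksum {3}",
            label, sw.ElapsedMilliseconds, gen0, checksum);
    }
}
```

Calling IterationTiming.Measure("struct", IterationTiming.SumStructs, 10000000) and the same with SumTuples from a release build gives a first impression; the results will vary with runtime, hardware, and element size.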
However, I'm not sure I would recommend doing that. What I would recommend is writing your code in the most natural and straightforward way possible, and not worrying too much about point optimizations initially. If you already have your struct Foo, which you are passing around as an argument to many methods, keep using it. If you have lots of code that expects and works with generic Tuples, use those instead. Performance costs for petty data conversions can pile up and swamp any optimizations you make in your iterators.
Upvotes: 4