Reputation: 444

thread safety and immutable relation

It is a common notion that immutable objects are thread safe. Every experienced Java developer (or any other OOP developer) knows this, but when it comes to the question of why, many developers say "mmmm, ooooo, I think..." and so on. I think I am one of those developers.

Threads exist for a purpose. One of those purposes is to change the state of something. If your thread isn't changing even one thing, why would you run a thread like that?

I really want to see a real-life example that makes me say "oh, I really must use an immutable object to achieve thread safety here."

Upvotes: 1

Answers (3)

user4842163

Reputation:

I really want to see a real-life example that makes me say "oh, I really must use an immutable object to achieve thread safety here."

I've never found immutability to be essential for multithreading. On the other hand, every race condition and deadlock I've encountered in my career was related to mutable shared state. It's much easier to appreciate the benefits of immutability in hindsight than in foresight.

That said, it's hard to see how immutability solves the issues in those scenarios because, in many of them, it doesn't solve the immediate design problem. Sometimes two or more threads need to access shared state and all of them need to see the latest version of it. Immutability doesn't solve anything there. However, it can make the bugs resulting from mistakes far less disastrous. Instead of some obscure race condition which only occurs on one in ten thousand users' machines once in a full moon, using immutables in those scenarios might at least give you a bug that's much easier to detect and reproduce.

Personal Example

However, as a recent example of where I found immutable data structures, or more specifically persistent data structures, extremely useful for multithreading: rendering and animation threads which don't need to access the latest "mutation".

In this example (it's a little old, from a prototype I made some years ago on an i3, but I want to avoid showcasing my commercial work on this site to avoid the heat), I'm using a persistent mesh data structure I created which is immutable. Every frame in which the user brushes over the mesh (over 4 million quads), a new mesh is created (the original cannot be modified because it is immutable). However, the new mesh avoids deep copying data which is not modified.

Threads exist for a purpose. One of those purposes is to change the state of something. If your thread isn't changing even one thing, why would you run a thread like that?

Where I found the immutable mesh data structure immediately helpful for multithreading is in the rendering and animation threads. Neither needs to see the latest version of the scene. They can lag a little behind the user's changes as long as they offer interactive feedback. They don't need to modify the "scene". They only need to output something to the screen, not to the scene, so they are read-only with respect to their inputs and write their output somewhere else, and it's okay if they don't synchronize perfectly with other copies/references/pointers to the source data.

As a result, with this immutable mesh data structure, I was able to have the render thread copy the entire scene every single frame and then get to work on rendering it, while the other threads are free to create new, changed meshes all they like. Without it, I would likely have had to do one of three things: run the rendering in the same thread that does the sculpting and changes the mesh; seriously optimize a deep copy of only the relevant parts of the mesh (as well as the rest of the scene) so that it could be handed to the renderer as fast as possible; or do something elaborate, such as synchronizing the rendering data with the mesh data and selectively updating parts of it in a single thread (inside a lock) before the rendering thread can get to work.

With the immutable mesh data structure, all the rendering thread has to do is:

in rendering thread:
    on scene change:
        copy entire scene // this is dirt cheap
        render scene
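
As a minimal sketch of that loop in Java (not the answerer's C implementation): the names `Scene`, `SceneHolder`, `edit`, and `snapshot` are hypothetical, and the scene is reduced to a version number and a string. The only assumption is that the scene object is immutable, so "copying" it means reading one reference.

```java
import java.util.concurrent.atomic.AtomicReference;

// Demo entry point: one thread edits, another renders snapshots.
final class RenderDemo {
    public static void main(String[] args) throws InterruptedException {
        SceneHolder holder = new SceneHolder();
        Thread editor = new Thread(() -> {
            for (int i = 1; i <= 1000; i++) {
                holder.edit("stroke " + i);
            }
        });
        Thread renderer = new Thread(() -> {
            for (int frame = 0; frame < 5; frame++) {
                Scene s = holder.snapshot();
                // s cannot change while it is being "rendered", however long that takes.
                System.out.println("rendering version " + s.version + ": " + s.description);
            }
        });
        editor.start();
        renderer.start();
        editor.join();
        renderer.join();
    }
}

// Hypothetical immutable scene: fields are final and never modified after construction.
final class Scene {
    final int version;
    final String description;

    Scene(int version, String description) {
        this.version = version;
        this.description = description;
    }

    // "Editing" returns a new Scene and leaves this one untouched.
    Scene withEdit(String newDescription) {
        return new Scene(version + 1, newDescription);
    }
}

final class SceneHolder {
    // The only mutable thing is this reference to the current immutable scene.
    private final AtomicReference<Scene> current =
            new AtomicReference<>(new Scene(0, "empty"));

    // Called by the editing thread: publish a new version.
    void edit(String newDescription) {
        current.updateAndGet(s -> s.withEdit(newDescription));
    }

    // Called by the render thread: copying the scene is just reading a reference.
    Scene snapshot() {
        return current.get();
    }
}
```

The renderer may print versions that are already out of date by the time they appear, which is exactly the "lagging behind a little" that this design accepts.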

Even with millions of polygons, vertex positions, edge data, and texture coordinates, a copy of the immutable mesh above takes less than a kilobyte (the original takes over 40 megabytes, and that's in spite of using compressed indices, 16-bit half-floats, etc). That super cheap copying you tend to get with hefty immutable data structures can be really handy for multithreading in cases where threads don't need to stay in perfect sync (don't need to see the most up-to-date version of the shared data). It can also be handy for undo systems, non-destructive editing, instancing, exception safety, etc.

It all revolves around an immutable array concept.
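
As a rough illustration of the general idea (an assumption about one common design, not the answerer's actual C data structure), a persistent array can be split into fixed-size chunks that are shared between versions. The class `PersistentFloatArray`, its methods, and the chunk size are all hypothetical. An edit copies the table of chunk references plus the one chunk that changes; every other chunk is shared with the previous version.

```java
import java.util.Arrays;

// Demo entry point.
final class PersistentArrayDemo {
    public static void main(String[] args) {
        PersistentFloatArray a = PersistentFloatArray.of(1f, 2f, 3f, 4f, 5f, 6f, 7f, 8f);
        PersistentFloatArray b = a.with(5, 60f); // new version; a is untouched
        System.out.println("a[5] = " + a.get(5) + ", b[5] = " + b.get(5));
        System.out.println("chunk 0 shared: " + a.sharesChunkWith(b, 0));
        System.out.println("chunk 1 shared: " + a.sharesChunkWith(b, 1));
    }
}

// Hypothetical persistent array: fixed-size chunks shared between versions.
final class PersistentFloatArray {
    static final int CHUNK = 4; // tiny for demonstration; real chunks would be much larger

    private final float[][] chunks; // never modified after construction
    private final int size;

    private PersistentFloatArray(float[][] chunks, int size) {
        this.chunks = chunks;
        this.size = size;
    }

    static PersistentFloatArray of(float... values) {
        int n = (values.length + CHUNK - 1) / CHUNK;
        float[][] chunks = new float[n][];
        for (int c = 0; c < n; c++) {
            int from = c * CHUNK;
            chunks[c] = Arrays.copyOfRange(values, from, Math.min(from + CHUNK, values.length));
        }
        return new PersistentFloatArray(chunks, values.length);
    }

    int size() {
        return size;
    }

    float get(int i) {
        if (i < 0 || i >= size) throw new IndexOutOfBoundsException("index " + i);
        return chunks[i / CHUNK][i % CHUNK];
    }

    // Returns a new version with element i replaced.
    // Copies the chunk table (references only) and the single affected chunk.
    PersistentFloatArray with(int i, float value) {
        if (i < 0 || i >= size) throw new IndexOutOfBoundsException("index " + i);
        float[][] newChunks = chunks.clone();           // shallow copy of the table
        float[] changed = newChunks[i / CHUNK].clone(); // copy only the touched chunk
        changed[i % CHUNK] = value;
        newChunks[i / CHUNK] = changed;
        return new PersistentFloatArray(newChunks, size);
    }

    // For the demo only: true if both versions point at the same chunk object.
    boolean sharesChunkWith(PersistentFloatArray other, int chunkIndex) {
        return chunks[chunkIndex] == other.chunks[chunkIndex];
    }
}
```

In this sketch the chunk table is still copied in full on every edit; organizing the chunks as a tree would be one way to shrink that cost further, at the price of more indirection on reads.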

John Carmack

I'm kind of a crazy person building immutable data structures in C, but I got inspired after hearing a speech from John Carmack, who seemed convinced that you could create video games revolving around immutable data structures and pure functional programming. I used to think the overhead involved in immutability (the need to create something new and allocate memory instead of modifying the original memory) would be tremendous. Yet if John Carmack (somewhat of an idol of mine, and we both came from the same programming generation) could envision an AAA video game built around such data structures, I figured I might as well give it a shot, since what's good enough for him should be more than good enough for me. VFX is nowhere near as demanding in terms of FPS as AAA games: artists are usually happy if they can get 30+ FPS with their content (though the content is generally much more generalized and complex than what game engines have to deal with).

Since then I've been exploring those ideas in my current field (visual FX for films, television, and so forth). I can't get mutation quite as fast as with the mutable data structures I was using before (though I'm not John Carmack), but I can get reasonably close, and as a bonus, I can now more easily create faster renderers, animators, etc (threads which can lag behind a little bit). It has simplified things so, so much (though the biggest simplifications weren't actually to multithreading, but to things like non-destructive editing and instancing). The simplification is OMG. I can't really overstate it. It has literally turned tens of thousands of lines of code into one line of code in numerous places*. Immutability can be the type of thing that makes you look back on your career and wonder why you didn't use more of it sooner, and that makes you see past design decisions as blunders when you might not have otherwise. And this is coming from an extremely biased and stubborn person who still thinks garbage collection is shite.

In fairness, it wasn't exactly trivial to implement these persistent data structures, but the time and code they required were far exceeded by the time and code they saved. The benefits far exceeded the costs, at least in my case, because this one moderately complex, persistent, thread-safe data structure tremendously simplifies over a hundred different places in the system which use it.

Dirt Cheap Copying

Often in these types of contexts you have threads whose sole purpose is to output something to the screen, with a whole lot of intermediate processing involved to deliver those pixels, while what the user works with can be something else (a copy). It's okay if these two or more copies are slightly out of sync with each other as long as the frames are delivered fast enough that the user can't tell the difference. So you can have the user working with a copy while the other threads copy things around and get to work on delivering pixels to the screen. Where the immutable data structures are really helpful in my case is that they make the copying dirt cheap. If over 40 megabytes of data (and in VFX, the data can sometimes span gigabytes) had to be copied in full from thread to thread every single frame in which a user touches anything, the frame rates would start to crawl. With immutable (specifically persistent) data structures, we end up only having to copy kilobytes of data around, even to copy epic scenes. That can be fast enough to deliver the desired 60+ FPS.

I first explored the concepts of immutability with a particle demo (also with 4 million particles: somehow I like 4 mil). Immutable data structures aren't necessarily so beneficial for particles, but I just wanted something very visual to assess their performance.

A new collection of particles is created there every single frame. I was able to do it at over 180 FPS on an i3 using an immutable, persistent data structure. The mutable version of the same demo ran at over 300 FPS, but keep in mind that the particle simulation is very simple (trivial processing for each particle; the difference wouldn't be this skewed if more complex logic were applied to each particle). I'm not sure that making the data structure storing the particles immutable will be that beneficial in the long run (I can't think of cases off the top of my head where it helps much). It was mainly a visual benchmark, since I originally came from a game development background during the 80s and 90s* and my ideas of "acceptable performance" are tied to frame rates, with a very visual kind of mindset. So I like to benchmark things visually this way. Apologies for the GIFs looking like crap. I had a hard time encoding them and had to reduce their frame rates tremendously. The apparent stutters/lags also aren't there in the original.

* Actually, John Carmack might have put me out of work in gamedev. The advent of 3D games, which are so much more expensive to create, put me out of a job as an indie gamedev in the early-to-mid 90s and moved me to the VFX industry, working in films, television, and archviz, and only on content creation for games. I idolize him anyway. He has always seemed ten steps ahead of me for the past few decades, and I'm like a tortoise trying to catch up to him, always half a decade or so behind his thinking. Now he's all over the immutability and functional programming bandwagon, and I don't think he has turned soft at all. I'm willing to gamble that he's on to something really important, at least for gamedev.

Upvotes: 1

Ashqary

Reputation: 61

I think immutability has more to do with the general sense of safety in programming than with thread safety specifically, which includes avoiding deadlocks and so on.

The notion of immutability refers to guarantees.

When an immutable object is shared between threads, or between different modules running in the same thread, there can be no negative side effects from an intentional or unintentional mutation of the shared object. It is therefore safe. There is a guarantee that the object remains in the same state no matter where it resides.
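
A minimal Java sketch of that guarantee, under the assumption of a small hypothetical value class (`Route` and its fields are made up for illustration): the fields are final, the constructor takes a defensive copy, and the list it hands out is unmodifiable, so neither the code that created the object nor the code that receives it can change its state.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Demo entry point.
final class RouteDemo {
    public static void main(String[] args) {
        List<String> source = new ArrayList<>(Arrays.asList("A", "B"));
        Route route = new Route("line 1", source);

        source.add("C"); // the caller keeps mutating its own list
        System.out.println("stops seen through the route: " + route.stops());
        System.out.println("mutation through the route possible: " + canMutate(route));
    }

    // Tries to change the route's state through its accessor.
    static boolean canMutate(Route route) {
        try {
            route.stops().add("X");
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }
}

// Hypothetical immutable value class.
final class Route {
    private final String name;
    private final List<String> stops;

    Route(String name, List<String> stops) {
        this.name = name;
        // Defensive copy, then an unmodifiable view: later changes to the
        // caller's list cannot leak in, and readers cannot write through.
        this.stops = Collections.unmodifiableList(new ArrayList<>(stops));
    }

    String name() {
        return name;
    }

    List<String> stops() {
        return stops;
    }
}
```

Because a `Route` can never change after construction, it can be passed to any thread or module without locks and without worrying about who else holds a reference.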

Upvotes: 0

StackNRG

Reputation: 79

Threads may change the state of shared objects, but not every object they have access to needs to be changed. Usually these objects are input data to process or objects that control the code flow. For example, configuration is usually immutable to prevent concurrent modification that may lead to a confusing, inconsistent state.
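
To make the configuration case concrete, here is a hedged Java sketch. `Config`, the host names, and `countMixedReads` are hypothetical. The configuration object is immutable and is replaced as a whole through a `volatile` reference, so a reader can only ever see one complete configuration, never the host of one paired with the port of another.

```java
// Demo entry point: a writer swaps configurations while a reader checks them.
final class ConfigDemo {
    // volatile: a reference written by one thread is visible to the others.
    private static volatile Config current = new Config("primary.example", 8080);

    // Counts how many times the reader saw a host/port pair that was never published.
    static int countMixedReads(int iterations) {
        Thread writer = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                // "Reloading" means publishing a brand-new immutable object.
                current = (i % 2 == 0)
                        ? new Config("backup.example", 9090)
                        : new Config("primary.example", 8080);
            }
        });
        writer.start();

        int mixed = 0;
        for (int i = 0; i < iterations; i++) {
            Config c = current; // one read of the reference = one consistent snapshot
            String endpoint = c.endpoint();
            if (!endpoint.equals("primary.example:8080")
                    && !endpoint.equals("backup.example:9090")) {
                mixed++;
            }
        }

        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return mixed;
    }

    public static void main(String[] args) {
        System.out.println("mixed reads: " + countMixedReads(100_000));
    }
}

// Hypothetical immutable configuration: final fields, no setters.
final class Config {
    final String host;
    final int port;

    Config(String host, int port) {
        this.host = host;
        this.port = port;
    }

    String endpoint() {
        return host + ":" + port;
    }
}
```

With a mutable configuration whose host and port were set one after the other, the same reader could observe a half-updated pair unless every access were guarded by a lock.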

Upvotes: 0
