June
June

Reputation: 3307

Find out why a thread is being blocked in C#

So I have a game where the main thread does the usual update / render logic you'd expect, and a second thread that does some very intense processing. The problem I'm having is that occasionally the main thread will be interrupted and the game will drop below 60FPS. I'm fairly sure that it's being blocked by the other thread, but since there's no explicit locking, I have no way of proving it.

There are a few scenarios I can think up of as to why the the main thread would be blocked by the secondary thread:

I've tried putting in stopwatches and measuring what areas of code take time, but this doesn't really tell me much beyond "the main thread was stopped for 500ms at random"; it doesn't actually tell me if there's locks blocking the main thread for an extended period of time.

Are there any techniques I can use to narrow down the cause of this issue?

----- EDIT -----

These are the results of running the Mono profiler and reporting on lock contentions:

Monitor lock summary
    Lock object 0x7f05190c9fe0: 1 contentions
            0.002126 secs total wait time, 0.002126 max, 0.002126 average
    1 contentions from:
            (wrapper runtime-invoke) object:runtime_invoke_void__this__ (object,intptr,intptr,intptr)
            System.Threading.Thread:StartInternal ()
            System.Threading.Timer/Scheduler:SchedulerThread ()
            (wrapper unknown) System.Threading.Monitor:FastMonitorEnterV4 (object,bool&)
            System.Threading.Monitor:Enter (object,bool&)
            System.Threading.Monitor:TryEnter (object,int,bool&)
            (wrapper managed-to-native) System.Threading.Monitor:try_enter_with_atomic_var (object,int,bool&)
    Lock object 0x7f051910b100: 1 contentions
            0.000628 secs total wait time, 0.000628 max, 0.000628 average
    1 contentions from:
            Ninject.Components.ComponentContainer:Get (System.Type)
            Ninject.Components.ComponentContainer:ResolveInstance (System.Type,System.Type)
            Ninject.Components.ComponentContainer:CreateNewInstance (System.Type,System.Type)
            System.Reflection.ConstructorInfo:Invoke (object[])
            System.Reflection.MonoCMethod:Invoke (System.Reflection.BindingFlags,System.Reflection.Binder,object[],System.Globalization.CultureInfo)
            System.Reflection.MonoCMethod:DoInvoke (object,System.Reflection.BindingFlags,System.Reflection.Binder,object[],System.Globalization.CultureInfo)
            System.Reflection.MonoCMethod:InternalInvoke (object,object[])
            (wrapper managed-to-native) System.Reflection.MonoCMethod:InternalInvoke (System.Reflection.MonoCMethod,object,object[],System.Exception&)
            (wrapper runtime-invoke) <Module>:runtime_invoke_void__this___object (object,intptr,intptr,intptr)
            Ninject.Activation.Caching.ActivationCache:.ctor (Ninject.Activation.Caching.ICachePruner)
            Ninject.Activation.Caching.GarbageCollectionCachePruner:Start (Ninject.Activation.Caching.IPruneable)
            (wrapper remoting-invoke-with-check) System.Threading.Timer:.ctor (System.Threading.TimerCallback,object,int,int)
            System.Threading.Timer:.ctor (System.Threading.TimerCallback,object,int,int)
            System.Threading.Timer:Init (System.Threading.TimerCallback,object,long,long)
            System.Threading.Timer:Change (long,long,bool)
            System.Threading.Timer/Scheduler:Change (System.Threading.Timer,long)
            (wrapper unknown) System.Threading.Monitor:FastMonitorEnterV4 (object,bool&)
            System.Threading.Monitor:Enter (object,bool&)
            System.Threading.Monitor:TryEnter (object,int,bool&)
            (wrapper managed-to-native) System.Threading.Monitor:try_enter_with_atomic_var (object,int,bool&)
    Lock object 0x7f05190ca000: 1 contentions
            0.000347 secs total wait time, 0.000347 max, 0.000347 average
    1 contentions from:
            (wrapper runtime-invoke) object:runtime_invoke_void__this__ (object,intptr,intptr,intptr)
            System.Threading.Thread:StartInternal ()
            System.Threading.Timer/Scheduler:SchedulerThread ()
            (wrapper remoting-invoke-with-check) System.Threading.EventWaitHandle:Reset ()
            System.Threading.EventWaitHandle:Reset ()
            (wrapper unknown) System.Threading.Monitor:FastMonitorEnterV4 (object,bool&)
            System.Threading.Monitor:Enter (object,bool&)
            System.Threading.Monitor:TryEnter (object,int,bool&)
            (wrapper managed-to-native) System.Threading.Monitor:try_enter_with_atomic_var (object,int,bool&)
    Lock contentions: 3
    Lock acquired: 3
    Lock failures: 0

This is from running the game for around 20-30 seconds, under which I observed at least 10 lag spikes. There's only 3 lock contentions in that time, and all of them take less than 16ms to resolve.

Upvotes: 4

Views: 4145

Answers (1)

v.oddou
v.oddou

Reputation: 6775

You are using a garbage collected platform to run a real time application, you cannot expect much predictiveness from this point. The garbage collection system locks all running thread when it runs out of space to clean up dangling objects by walking the 'spanable tree from root objects', using 3 generations of "recentness" of data as a spaning tree discovery optimization system. But in any case, presence or in-existence of synchronization in your main thread will not prevent it from it being stopped from time to time.

Secondly, rendering is blocking on the swap function ("present to screen", in direct 3d terminology), which is a function waiting for the "third oldest frame to be finished rendered, last rendering command list is flushed, and VSync signal is received" before it lets your main thread continue. You can try profiling around your swap call to check if your driver has something to do with your lock ups.

Thirdly, the OS scheduler, like you mentioned, is pre-emptive, every kernel tick, that is between 1 and 15ms long, you can get context switched. If you have a linux more recent (or equal) than kernel V 3.1, you will have the kernel build option FULL_DYN_TICKS which disable the pre-emptive interruption timer when only one task is active on the whole system, but on a managed language I feel like this requirement is unlikely to be met. However, 500ms represents a VERY long time, 33 ticks, which could only happen if you had 33 other tasks running full CPU on the same time with the same prirority. Unlikely as well.

You could have a hardware decision to throttle your CPU for temperature reasons, or GPU as well for that matter.

You could have a composite desktop that shares the graphic cards memory and leaks, which forces the driver to swap textures in main memory from time to time. This kind of bugs happens a lot on linux, especially with "dangerous" desktops like emerald, compiz and the like.

Check another 3D application and see the behavior, stop your worker thread alltogether to see if it helps the sanity of your main thread. Check your small objects allocations, and your old object allocations. First generation garbage collection can be heavy to run.

good luck

Upvotes: 4

Related Questions