Skip to main content

Thread suspended while holding global mutex

4 replies [Last post]
greg80303
Offline
Joined: 2008-07-03
Points: 0

I am seeing a deadlock in phoneME when trying to attach to the VM using JDB.

When JDB first attaches, the first thing it does is suspend all threads. This seems to be the primary force behind our deadlock. When I attach GDB to the process I can see the state of all native threads and it seems that one of my threads has been suspended while it holds one of the global mutexes in CVMglobals. Eventually, a JVMTI thread gets blocked on this mutex which subsequently holds up all all future JVMTI commands/events.

What mechanism in the VM should be preventing a thread from suspending while holding a global mutex? I do see that before a thread is suspended, the calling thread requests that threads reach consistent state. Is this supposed to be the mechanism?

For example, in CVMobjectInflate(), the calling thread grabs the CVMglobals.syncLock in a GCSafe block. However, while trying to exit the safe code block, it gets blocked in the rendezvous due to an outstanding consistent state request. While it is blocked on the CS mutex/condvar, it gets suspended (still holding the global syncLock).

Any ideas for me on this situation? Is there a problem with the way I'm handling "thread suspend/resume when blocked" in my porting code?

Greg

Reply viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.
greg80303
Offline
Joined: 2008-07-03
Points: 0

I was able to fix the deadlock we were seeing by adding the CVMglobals.syncLock and CVMglobals.heapLock to the set of mutexes acquired by the CVMthreadSuspendChecker thread in the CHECK state.

Unfortunately it seems like it has also uncovered a bug in the JVM_SuspendThread function (in the "if (!success)" block that handles when the SuspendChecker thread encounters a thread that holds a critical lock). At the end of this block, the calling thread requests that all other threads get consistent, then it calls CVMID_fieldReadLong() to access the CVMExecEnv of the target thread. CVMID_fieldReadLong attempts to make a GC-unsafe transition -- which is not allowed when the caller has requested consistency. I moved the following 2 lines after the call to CVMfieldReadLong():

CVMthreadSuspendConsistentRequest(ee);
CVMsysMicroLockAll(ee);

I have no idea whether or not this is truly safe (or even the correct solution) but it seems to solve our deadlock and we have not seen any stability issues while debugging as of yet.

G

billp
Offline
Joined: 2006-09-19
Points: 0

I think your fix is correct. I'll test it out and try to get it checked in.

bill

billp
Offline
Joined: 2006-09-19
Points: 0

Thread suspension is notoriously dangerous as you can see. There is code in the VM to try and mitigate this as much as possible but will not catch all situations. I'd like to see the GDB output with the backtrace of all the threads when you hit deadlock. If you are consistently hitting this one deadlock with the syncLock then you could probably check that lock in the suspendCheckerThreadFunc function by trying to grab the syncLock. If successful then just release it and suspension will continue. If the lock attempt blocks then the suspension code will timeout, release all threads back to their inconsistent state and try again some milliseconds later. The syncLock will eventually get released by the owner, the lock in the suspendChecker will complete and then get released and suspension will eventually complete.

bill

greg80303
Offline
Joined: 2008-07-03
Points: 0

Here is an example -- There are obviously many more threads running in the system, but I have removed all of the ones not involved in the deadlock. At the bottom of each thread is a summary of why the thread is blocked. The key thread is Thread 9, which is suspended, but holds the global "sync lock".

Thread 44 (Thread 0x9e1cdb70 (LWP 18199)):
#0 0x0036a422 in __kernel_vsyscall ()
#1 0x00759829 in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:142
#2 0x00754f4a in _L_lock_832 () from /lib/tls/i686/cmov/libpthread.so.0
#3 0x00754dc8 in __pthread_mutex_lock (mutex=0x9c80a408) at pthread_mutex_lock.c:82
#4 0x028be625 in mpeos_mutexAcquire (mutex=0x9c80a408) at ../RI_Linux/mpeos_sync.c:149
#5 0x075f0f84 in CVMmutexLock (m=0x787501c) at ../../src/mpe/javavm/runtime/sync_md.c:78
#6 0x074f9817 in CVMreentrantMutexLock (ee=0x7875190, rm=0x7875014) at ../../src/share/javavm/runtime/sync.c:157
#7 0x074f9e09 in CVMsysMutexLock (ee=0x7875190, m=0x787500c) at ../../src/share/javavm/runtime/sync.c:367
#8 0x074d8622 in stopTheWorldAndGCSafe (ee=0x7875190, numBytes=20, retryAfterActionCallback=0x74d6d22 , data=0x9e1cb0f4) at ../../src/share/javavm/runtime/gc_common.c:1466
#9 0x074d8a78 in stopTheWorldAndGCWithRetry (ee=0x7875190, numBytes=20, retryAfterActionCallback=0x74d6d22 , retryData=0x9e1cb0f4) at ../../src/share/javavm/runtime/gc_common.c:1553
#10 0x074d6f24 in CVMgcAllocNewInstance (ee=0x7875190, cb=0x76f80e0) at ../../src/share/javavm/runtime/gc_common.c:478
#11 0x07579455 in CVMinternUTF (ee=0x7875190, src=0xa5c5b210 "-") at ../../src/share/javavm/runtime/stringintern.c:779
#12 0x07520305 in CVMreadConstantPool (ee=0x7875190, context=0x9e1cb32c) at ../../src/share/javavm/runtime/classcreate.c:2050
#13 0x0751da05 in CVMclassCreateInternalClass (ee=0x7875190, externalClass=0xa5c59de8 "\312\376\272\276", classSize=2995, loader=0x9c81fa84, classname=0x9e1cb74c "org/cablelabs/impl/manager/application/AppExecQueue", dirNameOrZipFileName=0x0, isRedefine=0) at ../../src/share/javavm/runtime/classcreate.c:1147
#14 0x0751ac2c in CVMdefineClass (ee=0x7875190, name=0x9e1cb74c "org/cablelabs/impl/manager/application/AppExecQueue", loader=0x9c81fa84, buf=0xa5c59de8 "\312\376\272\276", bufLen=2995, pd=0x9c81fa98, isRedefine=0) at ../../src/share/javavm/runtime/classcreate.c:123
#15 0x0754b875 in JVM_DefineClass (env=0x78751bc, name=0x9e1cb74c "org/cablelabs/impl/manager/application/AppExecQueue", loaderRef=0x9c81fa84, buf=0xa5c59de8 "\312\376\272\276", bufLen=2995, pd=0x9c81fa98) at ../../src/share/javavm/runtime/jvm.c:158
#16 0x0758fec9 in Java_java_lang_ClassLoader_defineClass0 (env=0x78751bc, loader=0x9c81fa84, name=0x9c81fa88, data=0x9c81fa8c, offset=0, length=2995, pd=0x9c81fa98) at ../../src/share/native/java/lang/ClassLoader.c:163
#17 0x0760ba43 in args_done () at ../../src/x86/javavm/runtime/invokeNative_i386.S:201
#18 0x078751bc in CVMglobals () from /home/grutz/Projects/CableLabs/RI/JiraSVN/branches/tsbi/ri/ODLSrc/OCAP-1.0/bin/CableLabs/simulator/Linux/debug/env/cvm/bin/libcvm.so
#19 0x9c81fa8c in ?? ()
#20 0x074e797b in CVMinvokeJNIHelper (ee=0x7875190, mb=0x7892a6c) at ../../src/share/javavm/runtime/interpreter.c:4284
#21 0x075bd5f6 in CVMgcUnsafeExecuteJavaMethodJVMTI (ee=0x7875190, mb=0x7892a6c, isStatic=0, isVirtual=0) at ../../src/share/javavm/runtime/executejava_standard.c:3717
#22 0x07544585 in CVMjniInvoke (env=0x78751bc, obj=0x9c821fa0, methodID=0x7892a0c, pushArguments=0x7543a1f , args=0x9e1cc0fc, info=267, retValue=0x9e1cc0f4) at ../../src/share/javavm/runtime/jni_impl.c:2703
#23 0x075449c5 in CVMjniCallObjectMethod (env=0x78751bc, obj=0x9c821fa0, methodID=0x7892a0c) at ../../src/share/javavm/runtime/jni_impl.c:2917
#24 0x07524354 in CVMclassLoadClass0 (ee=0x7875190, loader=0x9c821fa0, classname=0xa5c580d0 "org/cablelabs/impl/manager/application/AppExecQueue", classTypeID=2247) at ../../src/share/javavm/runtime/classload.c:254
#25 0x07523ead in CVMclassLoadClass (ee=0x7875190, loader=0x9c821fa0, classname=0xa5c580d0 "org/cablelabs/impl/manager/application/AppExecQueue", classTypeID=2247) at ../../src/share/javavm/runtime/classload.c:182
#26 0x075275bc in CVMclassLookupFromClassLoader (ee=0x7875190, typeID=2247, name=0xa5c580d0 "org/cablelabs/impl/manager/application/AppExecQueue", init=0, loader=0x9c821fa0, pd=0x9c822e14, throwError=1) at ../../src/share/javavm/runtime/classlookup.c:297
#27 0x07527150 in CVMclassLookupByTypeFromClass (ee=0x7875190, typeID=2247, init=0, fromClass=0xa5c579a0) at ../../src/share/javavm/runtime/classlookup.c:142
#28 0x07512bfc in load_class_local (context=0xa5c56a28, classname=2247) at ../../src/share/javavm/runtime/verifycode.c:612
#29 0x07512c3c in load_class_global (context=0xa5c56a28, classname=2247) at ../../src/share/javavm/runtime/verifycode.c:626
#30 0x0751305d in ID_to_class (context=0xa5c56a28, ID=20) at ../../src/share/javavm/runtime/verifycode.c:745
#31 0x07519235 in object_fullinfo_to_classclass (context=0xa5c56a28, classinfo=1310729) at ../../src/share/javavm/runtime/verifycode.c:3981
#32 0x075190ae in merge_fullinfo_types (context=0xa5c56a28, value=1310729, target=720905, for_assignment=1 '\001') at ../../src/share/javavm/runtime/verifycode.c:3885
#33 0x07518cec in isAssignableTo (context=0xa5c56a28, from=1310729, to=720905) at ../../src/share/javavm/runtime/verifycode.c:3663
#34 0x075162a6 in pop_stack (context=0xa5c56a28, inumber=8, new_stack_info=0x9e1cc5a0) at ../../src/share/javavm/runtime/verifycode.c:2330
#35 0x075150a1 in run_dataflow (context=0xa5c56a28) at ../../src/share/javavm/runtime/verifycode.c:1760
#36 0x0751387e in verify_method (context=0xa5c56a28, cb=0xa5c579a0, mb=0xa5c57c3c) at ../../src/share/javavm/runtime/verifycode.c:1039
#37 0x07513433 in VerifyClass (ee=0x7875190, cb=0xa5c579a0, buffer=0x9e1cc6dc "", len=256, isRedefine=0) at ../../src/share/javavm/runtime/verifycode.c:885
#38 0x0750da7a in CVMclassVerify (ee=0x7875190, cb=0xa5c579a0, isRedefine=0) at ../../src/share/javavm/runtime/classverify.c:170
#39 0x0750ac51 in CVMclassLink (ee=0x7875190, cb=0xa5c579a0, isRedefine=0) at ../../src/share/javavm/runtime/classlink.c:235
#40 0x0751a830 in CVMprivateClassInit (ee=0x7875190, cb=0xa5c579a0, p_mb=0x9e1cc8f8) at ../../src/share/javavm/runtime/classinitialize.c:136
#41 0x0751a6c7 in CVMclassInitNoCRecursion (ee=0x7875190, cb=0xa5c579a0, p_mb=0x9e1cc8f8) at ../../src/share/javavm/runtime/classinitialize.c:72
#42 0x075be390 in CVMgcUnsafeExecuteJavaMethodJVMTI (ee=0x7875190, mb=0x78b28d0, isStatic=0, isVirtual=0) at ../../src/share/javavm/runtime/executejava_standard.c:3912
#43 0x07544585 in CVMjniInvoke (env=0x78751bc, obj=0x9c81f3a4, methodID=0x9c89a824, pushArguments=0x7543a1f , args=0x9e1cd0bc, info=770, retValue=0x0) at ../../src/share/javavm/runtime/jni_impl.c:2703
#44 0x07546127 in CVMjniCallStaticVoidMethod (env=0x78751bc, clazz=0x9c81f3a4, methodID=0x9c89a824) at ../../src/share/javavm/runtime/jni_impl.c:2926
#45 0x0288ba2c in mpe_jvmExecuteMain () at mgr/jvmmgr/jvmmgr.c:335
#46 0x029130d5 in startJvm (ptr=0x0) at ../RI_Common/stack.c:279
#47 0x028bf3dc in threadStart (tls=0xa5c0e9a0) at ../RI_Linux/mpeos_thread.c:284
#48 0x0075280e in start_thread (arg=0x9e1cdb70) at pthread_create.c:300
#49 0x00b2e8de in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130
Blocked on "heap lock" mutex (owned by Thread 37)
---------------------------------------------------------------------------------------

Thread 37 (Thread 0x95ca0b70 (LWP 18206)):
#0 0x0036a422 in __kernel_vsyscall ()
#1 0x00759829 in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:142
#2 0x00754f4a in _L_lock_832 () from /lib/tls/i686/cmov/libpthread.so.0
#3 0x00754dc8 in __pthread_mutex_lock (mutex=0x9c80a8a0) at pthread_mutex_lock.c:82
#4 0x028be625 in mpeos_mutexAcquire (mutex=0x9c80a8a0) at ../RI_Linux/mpeos_sync.c:149
#5 0x075f0f84 in CVMmutexLock (m=0x7875370) at ../../src/mpe/javavm/runtime/sync_md.c:78
#6 0x074f9817 in CVMreentrantMutexLock (ee=0x9c890570, rm=0x7875368) at ../../src/share/javavm/runtime/sync.c:157
#7 0x074f9e09 in CVMsysMutexLock (ee=0x9c890570, m=0x7875360) at ../../src/share/javavm/runtime/sync.c:367
#8 0x0752b00f in CVMglobalSysMutexesAcquire (ee=0x9c890570) at ../../src/share/javavm/runtime/globals.c:1359
#9 0x074e46e7 in CVMlocksForThreadSuspendAcquire (ee=0x9c890570) at ../../src/share/javavm/runtime/interpreter.c:3041
#10 0x075c2c8a in jvmti_SuspendThread (jvmtienv=0x9c865368, thread=0x9c8907fc) at ../../src/share/javavm/runtime/jvmtiEnv.c:1301
#11 0x075c2e2b in jvmti_SuspendThreadList (jvmtienv=0x9c865368, requestCount=36, requestList=0x9fabc48, results=0x9fabce8) at ../../src/share/javavm/runtime/jvmtiEnv.c:1358
#12 0x024fedec in commonSuspendList (env=0x9c89059c, initCount=39, initList=0x9b4dd78) at ../../src/share/tools/jpda/back/threadControl.c:1313
#13 0x024ff25c in threadControl_suspendAll () at ../../src/share/tools/jpda/back/threadControl.c:1513
#14 0x024f72a9 in suspendWithInvokeEnabled (policy=2 '\002', thread=0x9c820b08) at ../../src/share/tools/jpda/back/eventHelper.c:481
#15 0x024f73c6 in handleReportEventCompositeCommand (env=0x9c89059c, recc=0x9b34c5c) at ../../src/share/tools/jpda/back/eventHelper.c:517
#16 0x024f7710 in handleCommand (env=0x9c89059c, command=0x9b34c50) at ../../src/share/tools/jpda/back/eventHelper.c:598
#17 0x024f78f8 in commandLoop (jvmti_env=0x9c865368, jni_env=0x9c89059c, arg=0x0) at ../../src/share/tools/jpda/back/eventHelper.c:694
#18 0x075c4086 in startFunctionWrapper (arg=0x9c874a00) at ../../src/share/javavm/runtime/jvmtiEnv.c:1865
#19 0x07555f25 in Java_java_lang_Thread_runNative (env=0x9c89059c, thisObj=0x9c8907a8) at ../../src/share/javavm/runtime/jvm.c:1983
#20 0x0760ba43 in args_done () at ../../src/x86/javavm/runtime/invokeNative_i386.S:201
#21 0x9c89059c in ?? ()
#22 0x074e797b in CVMinvokeJNIHelper (ee=0x9c890570, mb=0x789efdc) at ../../src/share/javavm/runtime/interpreter.c:4284
#23 0x075bd5f6 in CVMgcUnsafeExecuteJavaMethodJVMTI (ee=0x9c890570, mb=0x789efdc, isStatic=0, isVirtual=0) at ../../src/share/javavm/runtime/executejava_standard.c:3717
#24 0x07544585 in CVMjniInvoke (env=0x9c89059c, obj=0x9c89073c, methodID=0x789efbc, pushArguments=0x7543a1f , args=0x95ca02fc, info=258, retValue=0x0) at ../../src/share/javavm/runtime/jni_impl.c:2703
#25 0x0754605b in CVMjniCallVoidMethod (env=0x9c89059c, obj=0x9c89073c, methodID=0x789efbc) at ../../src/share/javavm/runtime/jni_impl.c:2926
#26 0x07555e08 in start_func (arg=0x9c890520) at ../../src/share/javavm/runtime/jvm.c:1946
#27 0x028bf3dc in threadStart (tls=0x9c891ef0) at ../RI_Linux/mpeos_thread.c:284
#28 0x0075280e in start_thread (arg=0x95ca0b70) at pthread_create.c:300
#29 0x00b2e8de in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130
Blocked on "sync lock" mutex (owned by Thread 9)
----------------------------------------------------------------------------------------

Thread 32 (Thread 0x930feb70 (LWP 18233)):
#0 0x0036a422 in __kernel_vsyscall ()
#1 0x00759829 in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:142
#2 0x00754f4a in _L_lock_832 () from /lib/tls/i686/cmov/libpthread.so.0
#3 0x00754dc8 in __pthread_mutex_lock (mutex=0x9c8252e8) at pthread_mutex_lock.c:82
#4 0x028be625 in mpeos_mutexAcquire (mutex=0x9c8252e8) at ../RI_Linux/mpeos_sync.c:149
#5 0x075f0f84 in CVMmutexLock (m=0x9c8252ac) at ../../src/mpe/javavm/runtime/sync_md.c:78
#6 0x074fa495 in CVMprofiledMonitorEnterUnsafe (self=0x9c8252a4, ee=0x939591b8, doPost=1) at ../../src/share/javavm/runtime/sync.c:576
#7 0x074e8867 in CVMmonEnter (ee=0x939591b8, mon=0x9c8252a0) at ../../src/share/javavm/runtime/objsync.c:345
#8 0x074eb315 in CVMdetLock (ee=0x939591b8, indirectObj=0x788e220) at ../../src/share/javavm/runtime/objsync.c:2040
#9 0x074d853b in stopTheWorldAndGCSafe (ee=0x939591b8, numBytes=24, retryAfterActionCallback=0x74d7125 , data=0x930fda5c) at ../../src/share/javavm/runtime/gc_common.c:1447
#10 0x074d8a78 in stopTheWorldAndGCWithRetry (ee=0x939591b8, numBytes=24, retryAfterActionCallback=0x74d7125 , retryData=0x930fda5c) at ../../src/share/javavm/runtime/gc_common.c:1553
#11 0x074d730a in CVMgcAllocNewArrayWithInstanceSize (ee=0x939591b8, arrayObjectSize=24, arrayCb=0x78632c0, arrayLen=3) at ../../src/share/javavm/runtime/gc_common.c:652
#12 0x074d7228 in CVMgcAllocNewArray (ee=0x939591b8, typeCode=CVM_T_INT, arrayCb=0x78632c0, arrayLen=3) at ../../src/share/javavm/runtime/gc_common.c:603
#13 0x075a8ac4 in CVMnewarrayHelper (ee=0x939591b8, topOfStack=0x939574dc, typeCode=CVM_T_INT) at ../../src/share/javavm/runtime/executejava_standard.c:1042
#14 0x075b6d9a in CVMgcUnsafeExecuteJavaMethodJVMTI (ee=0x939591b8, mb=0x789ee2c, isStatic=0, isVirtual=0) at ../../src/share/javavm/runtime/executejava_standard.c:2733
#15 0x07544585 in CVMjniInvoke (env=0x939591e4, obj=0x93957434, methodID=0x789efbc, pushArguments=0x7543a1f , args=0x930fe2fc, info=258, retValue=0x0) at ../../src/share/javavm/runtime/jni_impl.c:2703
#16 0x0754605b in CVMjniCallVoidMethod (env=0x939591e4, obj=0x93957434, methodID=0x789efbc) at ../../src/share/javavm/runtime/jni_impl.c:2926
#17 0x07555e08 in start_func (arg=0x93953e38) at ../../src/share/javavm/runtime/jvm.c:1946
#18 0x028bf3dc in threadStart (tls=0x9394dd68) at ../RI_Linux/mpeos_thread.c:284
#19 0x0075280e in start_thread (arg=0x930feb70) at pthread_create.c:300
#20 0x00b2e8de in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130
Blocked on object monitor owned by Thread 44
-----------------------------------------------------------------------------------------

Thread 9 (Thread 0x876e6b70 (LWP 18256)):
#0 0x0036a422 in __kernel_vsyscall ()
#1 0x00a8c8de in do_sigsuspend (set=0x876e584c) at ../sysdeps/unix/sysv/linux/sigsuspend.c:63
#2 *__GI___sigsuspend (set=0x876e584c) at ../sysdeps/unix/sysv/linux/sigsuspend.c:78
#3 0x028bf80c in signalHandler (sig=10) at ../RI_Linux/mpeos_thread.c:82
#4 0x028bf87a in suspendMe () at ../RI_Linux/mpeos_thread.c:97
#5 0x028be68b in mpeos_mutexAcquire (mutex=0x9c80a940) at ../RI_Linux/mpeos_sync.c:163
#6 0x075f0f84 in CVMmutexLock (m=0x7874ebc) at ../../src/mpe/javavm/runtime/sync_md.c:78
#7 0x074f9817 in CVMreentrantMutexLock (ee=0xa5c486e8, rm=0x7874eb4) at ../../src/share/javavm/runtime/sync.c:157
#8 0x074f9e09 in CVMsysMutexLock (ee=0xa5c486e8, m=0x7874eac) at ../../src/share/javavm/runtime/sync.c:367
#9 0x07529896 in CVMcsRendezvous (ee=0xa5c486e8, cs=0x7874e98, tcs=0xa5c486e8, block=0) at ../../src/share/javavm/runtime/cstates.c:251
#10 0x074e9c85 in CVMobjectInflate (ee=0xa5c486e8, indirectObj=0xa5c48a14) at ../../src/share/javavm/runtime/objsync.c:1162
#11 0x074e9e89 in CVMfastLock (ee=0xa5c486e8, indirectObj=0xa5c48a14) at ../../src/share/javavm/runtime/objsync.c:1225
#12 0x075b717e in CVMgcUnsafeExecuteJavaMethodJVMTI (ee=0xa5c486e8, mb=0x9c8efe08, isStatic=0, isVirtual=0) at ../../src/share/javavm/runtime/executejava_standard.c:2776
#13 0x07544585 in CVMjniInvoke (env=0xa5c48714, obj=0xa5c488b4, methodID=0x789efbc, pushArguments=0x7543a1f , args=0x876e62fc, info=258, retValue=0x0) at ../../src/share/javavm/runtime/jni_impl.c:2703
#14 0x0754605b in CVMjniCallVoidMethod (env=0xa5c48714, obj=0xa5c488b4, methodID=0x789efbc) at ../../src/share/javavm/runtime/jni_impl.c:2926
#15 0x07555e08 in start_func (arg=0xa5c38b78) at ../../src/share/javavm/runtime/jvm.c:1946
#16 0x028bf3dc in threadStart (tls=0xa5c4a068) at ../RI_Linux/mpeos_thread.c:284
#17 0x0075280e in start_thread (arg=0x876e6b70) at pthread_create.c:300
#18 0x00b2e8de in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130
Holds the "sync lock" mutex and is suspended
--------------------------------------------------------------------------------------------