
GRIZZLY0023: Interrupting idle Thread issue

kthcochrane

Hello,

We run a production environment with over 30 GlassFish servers providing web services that are invoked by our client applications. We recently upgraded our servers from GlassFish 2 to GlassFish 3, and since doing this we have been experiencing an intermittent issue where the server logs an error containing:

GRIZZLY0023: Interrupting idle Thread: http-thread-pool....

It logs this repeatedly, roughly every 2 ms, and the server no longer serves any requests. It is not possible to stop the domain; to resolve the problem the JVM has to be forcibly killed and the domain started again. The issue mainly seems to happen overnight, when we suspect the servers are experiencing no load. Some nights none of the servers experience the issue, but this is rare (last night three servers failed).

We are running JVM 1.6.0_33 and GlassFish 3.1.2.2 (build 5) on CentOS Linux (kernel 2.6.32-279.2.1.el6.x86_64) in a virtual environment. There have been various forum threads about this issue and we have tried the patches suggested, but with little success. On GlassFish 2 we had to switch off the Grizzly connector and use the Coyote one because of an epoll selector bug, but that option does not seem to exist in v3.

Perhaps someone could be so kind as to suggest the areas where we can enable logging to help determine the cause? An obvious starting place would appear to be com.sun.grizzly.config.GrizzlyServiceListener.
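
Our assumption (not yet confirmed to help) is that raising that logger to FINE would go something like this:

asadmin set-log-levels com.sun.grizzly.config.GrizzlyServiceListener=FINE
asadmin list-log-levels | grep -i grizzly    # verify the new level is in effect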

Thank you.

najmi

I have recently had a similar problem with GlassFish 3.1.2. jstack showed that the issue was a deadlock between two threads. I do not think the deadlock is caused by my webapp code; I suspect it is one of my dependencies, or perhaps GlassFish itself.

The message was titled "Glassfish 3.1.2 deadlock involving org.glassfish.web.loader.WebappClassLoader" and was posted to the users@glassfish.java.net list on Feb 26, 2013. It's not yet visible in the archives at:

http://java.net/projects/glassfish/lists/users/archive

I have just created the following issue to track this:

http://java.net/jira/browse/GLASSFISH-19731

oleksiys

Hi,

Have you tried the grizzly-http.jar from this thread [1] (for GlassFish 3.1.2.2)?

Thanks.

WBR,
Alexey.

[1] http://forums.java.net/forum/topic/glassfish/glassfish/what-causes-grizz...

DavidHutchison

We have applied both the patch from the thread you mentioned and the one associated with GlassFish issue 16217, which was suggested in that forum post. The issue still occurred on Friday evening on one of the servers we had patched.

Do you have any other suggestions / require any more information?

Thanks,

David

oleksiys

Hi David,

Just to make sure: after you applied the grizzly-http.jar, do you still see lots of "GRIZZLY0023: Interrupting idle Thread" messages flooding the log, or are there just a couple of them?

The message itself is fine; most probably it signals that some task has been executing for a long time (15 minutes by default), which in most cases means a bug in the application code. If you have long-lasting tasks and it is normal for your app to occupy a thread for that long, you can disable this logic by setting request-timeout to -1 on the appropriate http-listener, as in [1].

Hope this will help.

WBR,
Alexey.

[1] asadmin set server-config.network-config.protocols.protocol.http-listener-1.http.request-timeout-seconds=-1
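
To double-check the value afterwards (assuming the default listener name http-listener-1, as above), something along these lines should work:

asadmin get server-config.network-config.protocols.protocol.http-listener-1.http.request-timeout-seconds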

DavidHutchison

After checking, it no longer floods the log with the message. On Friday one of the four servers we applied the patch to became unresponsive during the night. The patch has made a difference: there were only four instances of the message, whereas before the log would have been flooded with them.

We believe there should be no long-running processes using the HTTP threads at the time the servers report this, as it is outside office hours. We do have at least one scheduled task on a timer that runs around the time these messages appear, but those tasks have been disabled in the past to check that they were not related to the failure.

I have checked the log files around the failures; the failures last night and on Friday night were all immediately preceded by the following:

[#|2012-10-05T18:17:13.895+0100|INFO|glassfish3.1.2|javax.enterprise.resource.resourceadapter.com.sun.gjc.spi|_ThreadID=45;_ThreadName=Thread-2;|RAR5106 : AutoCommit based validation detected invalid connection. Set resource-adapter log-level to FINE for exception stack trace|#]

[#|2012-10-05T18:36:11.415+0100|WARNING|glassfish3.1.2|com.sun.grizzly.config.GrizzlyServiceListener|_ThreadID=13;_ThreadName=Thread-2;|GRIZZLY0023: Interrupting idle Thread: http-thread-pool-2030(4).|#]

Generally it appears the servers without this log statement continue to work, but this may just be coincidence. I have only checked a random sample of working servers.

Do you think these could be related?

Thanks,

David

oleksiys

Hi David,

thank you for the info.
It looks like you really do have unexpected long-lasting task(s) running that occupy all of the http-listener threads; the log message just signals that.

The next time a server becomes unresponsive, pls. take a thread dump using the "jstack" utility. Most probably you'll see all the http-listener worker threads blocked in some operation; according to the [1] log line, I'd think it's DB related.
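
A minimal sketch of taking the dump (the PID lookup and the output file name are just examples):

jps -l                               # find the JVM process id of the domain
jstack -l <pid> > /tmp/threads.txt   # -l also prints lock ownership information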

Pls. share the thread dump once you have it.
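
In parallel, raising the resource-adapter logger to FINE, as the RAR5106 message itself suggests, should capture the underlying exception. For example (the parent logger name is taken from that log entry; treating it as the one to raise is my assumption):

asadmin set-log-levels javax.enterprise.resource.resourceadapter=FINE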

Thanks.

WBR,
Alexey.

[1] [#|2012-10-05T18:17:13.895+0100|INFO|glassfish3.1.2|javax.enterprise.resource.resourceadapter.com.sun.gjc.spi|_ThreadID=45;_ThreadName=Thread-2;|RAR5106 : AutoCommit based validation detected invalid connection. Set resource-adapter log-level to FINE for exception stack trace|#]

DavidHutchison

None of the patched servers had the issue last night, but if it is just a long-running process this thread dump should at least show it. I can see threads waiting, but I can't obviously see what they are blocked on in the trace; I'm probably just missing something obvious. As before, the "RAR5106 : AutoCommit based validation detected invalid connection." message is displayed before the GRIZZLY0023 warning messages.
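
For reference, this is roughly how I pulled the HTTP worker threads out of the dump (the file path is simply where I saved the jstack output):

grep -A 20 "http-thread-pool-2030" /tmp/threads.txt   # show each worker thread plus its stack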

Thanks,

David.

oleksiys

Hi David,

In your attachment, pls. find the threads whose names start with "http-thread-pool-2030".
You'll find all of them blocked while trying to get a resource (a connection from the pool) [1].
Most probably a connection does not become available within 15 minutes, so Grizzly interrupts the waiting thread.
As I mentioned in one of the earlier emails, if this is an expected situation, pls. just disable the request-timeout (by setting it to -1).
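
If the waits are not expected, it may also be worth inspecting the pool configuration and enabling leak detection; a sketch (the pool name is a placeholder, and the leak-detection attributes are my assumption about what suits your setup):

asadmin get "resources.jdbc-connection-pool.<your-pool>.*"
asadmin set resources.jdbc-connection-pool.<your-pool>.connection-leak-timeout-in-seconds=300
asadmin set resources.jdbc-connection-pool.<your-pool>.connection-leak-reclaim=true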

Thanks.

WBR,
Alexey.

[1]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000007602adbc8> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:941)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1261)
at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:594)
at com.sun.enterprise.resource.pool.datastructure.RWLockDataStructure.getResource(RWLockDataStructure.java:116)
at com.sun.enterprise.resource.pool.ConnectionPool.getResourceFromPool(ConnectionPool.java:716)
at com.sun.enterprise.resource.pool.ConnectionPool.getUnenlistedResource(ConnectionPool.java:632)
at com.sun.enterprise.resource.pool.ConnectionPool.internalGetResource(ConnectionPool.java:526)
at com.sun.enterprise.resource.pool.ConnectionPool.getResource(ConnectionPool.java:381)
at com.sun.enterprise.resource.pool.PoolManagerImpl.getResourceFromPool(PoolManagerImpl.java:245)
at com.sun.enterprise.resource.pool.PoolManagerImpl.getResource(PoolManagerImpl.java:170)
at com.sun.enterprise.connectors.ConnectionManagerImpl.getResource(ConnectionManagerImpl.java:338)
at com.sun.enterprise.connectors.ConnectionManagerImpl.internalGetConnection(ConnectionManagerImpl.java:301)
at com.sun.enterprise.connectors.ConnectionManagerImpl.allocateConnection(ConnectionManagerImpl.java:236)
at com.sun.enterprise.connectors.ConnectionManagerImpl.allocateConnection(ConnectionManagerImpl.java:165)
at com.sun.enterprise.connectors.ConnectionManagerImpl.allocateConnection(ConnectionManagerImpl.java:160)
at com.sun.gjc.spi.base.DataSource.getConnection(DataSource.java:113)