Cluster SynchronizationException

23 replies
gussie
Offline
Joined: 2008-07-16

Hello,

We have been evaluating Glassfish for one of our current projects and have been kicking
the tires quite extensively. We have worked through some minor annoyances, but in
general so far so good.

The most appealing feature of Glassfish for our current needs is centralized cluster
management and the automated configuration of the load balancer.

Unfortunately we are experiencing problems when starting an instance in a cluster.

Our current test server configuration is:
CentOS 5.1 running in VMware Server (SELinux and IPv6 are off).
JDK 1.6u6
Glassfish v2ur2-b04-linux running as root
Network identity by DHCP.
System clocks are synchronized with the host using vmware-tools.

The DAS and Node Agents are running in separate servers on the same subnet.
There is nothing in-between to interfere with traffic.

The problem:

When cold starting a Node Agent that is configured to start all instances
automatically, synchronization succeeds, albeit extremely slowly: it takes up to 5 minutes.

When starting an instance manually, through the console or by command line,
synchronization blocks indefinitely.

On the Node Agent side we see the following in the instance log files after a period of time:

[#|2008-07-17T09:21:46.419+1000|INFO|sun-appserver9.1|javax.ee.enterprise.system.tools.synchronization|_ThreadID=11;_ThreadName=sync-1;|SYNC014: Unable to update synchronization timestamp.
com.sun.enterprise.ee.synchronization.SynchronizationException: Error while updating timestamp for synch request: ${com.sun.aas.instanceRoot}/applications/
    at com.sun.enterprise.ee.synchronization.TimestampCommand.execute(TimestampCommand.java:153)
    at com.sun.enterprise.ee.synchronization.BaseRequestMediator.commit(BaseRequestMediator.java:151)
    at com.sun.enterprise.ee.synchronization.BaseRequestMediator.run(BaseRequestMediator.java:126)
    at java.lang.Thread.run(Thread.java:619)
Caused by: java.lang.NullPointerException
    at com.sun.enterprise.ee.synchronization.TimestampCommand.execute(TimestampCommand.java:85)
    ... 3 more
|#]

There are no exceptions or warnings in the DAS logs.
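
For anyone comparing many entries like the one above, here is a minimal Python sketch that splits records of the `[#|...|#]` form into fields. The six-field layout (timestamp, level, product, logger, key=value pairs, message) is inferred only from the entries quoted in this thread, and `parse_records` is a name made up for this illustration; it is not part of Glassfish.

```python
import re

# Layout assumed from the quoted entries:
# [#|timestamp|level|product|logger|key=value;...|message|#]
RECORD = re.compile(r"\[#\|(.*?)\|#\]", re.DOTALL)

def parse_records(text):
    """Yield one dict per [#|...|#] record found in text."""
    for match in RECORD.finditer(text):
        fields = match.group(1).split("|", 5)
        if len(fields) < 6:
            continue  # does not match the six-field layout assumed here
        timestamp, level, product, logger, pairs, message = fields
        yield {
            "timestamp": timestamp,
            "level": level,
            "product": product,
            "logger": logger,
            # Items without "=" (file names, host names) are skipped.
            "thread": dict(
                p.strip().split("=", 1) for p in pairs.split(";") if "=" in p
            ),
            "message": message.strip(),
        }

# Shortened copy of the SYNC014 entry from this post.
sample = (
    "[#|2008-07-17T09:21:46.419+1000|INFO|sun-appserver9.1|"
    "javax.ee.enterprise.system.tools.synchronization|_ThreadID=11;"
    "_ThreadName=sync-1;|SYNC014: Unable to update synchronization timestamp.\n"
    "Caused by: java.lang.NullPointerException\n|#]"
)

for record in parse_records(sample):
    first_line = record["message"].splitlines()[0]
    print(record["level"], record["thread"]["_ThreadName"], first_line)
```

Filtering the resulting dicts on `level` or on the message code prefix (SYNC014, EEADM0221 and so on) makes it easier to line up DAS and instance logs by timestamp.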

The problem was also reported in this thread, with a similar configuration:
http://forums.java.net/jive/thread.jspa?threadID=36920&tstart=15

In an attempt to troubleshoot this we have tried the following:
1. Using Glassfish version v2.1-b24d-linux – same issue
2. Monitoring traffic between servers – nothing revealing
3. Following the tuning guidelines in this document:
http://docs.sun.com/app/docs/doc/819-3681/abeir?a=view
4. Following the numerous oblique references to loopback address issues.
5. Lots of Googling and forum trawling.
6. Help!

Could it be an issue with CentOS / VMware, the JDK, or some OS / GF configuration switch?

Any ideas?

Thanks.

gussie
Offline
Joined: 2008-07-16

Daniel & Manfred,

Thanks for your continued help and for the recent suggestion.

First the good news:

With a completely clean installation of the DAS and node agent (each on separate
servers, same subnet) and with the INSTANCE-SYNC-JVM-OPTIONS property set
to -Xmx256m, we no longer see the OutOfMemoryError.

The bad news is that the node agent failed to start, timing out after approximately 20 minutes.
The application directory for the instance contains 2 out of the 9 war files and the
SynchronizationMain process is still running. There are no exceptions in the log files
on the instance side.

The logs on the DAS side contain entries of the form:

[#|2008-07-22T12:48:45.289+1000|INFO|sun-appserver9.1|javax.enterprise.system.tools.admin|_ThreadID=15;_ThreadName=httpWorkerThread-4848-1;/tmp/s1astempdomain1server-670762051/webapp-a.war;|ADM1006:Uploading the file to:[/tmp/s1astempdomain1server-670762051/webapp-a.war]|#]

[#|2008-07-22T12:48:53.079+1000|WARNING|sun-appserver9.1|javax.ee.enterprise.system.tools.admin|_ThreadID=15;_ThreadName=httpWorkerThread-4848-1;No route to host;_RequestID=d0789af5-f3b7-42b5-8acb-c370347ff6b7;|EEADM0221: Auto Apply changes could not be done [No route to host]|#]

[#|2008-07-22T12:48:56.113+1000|WARNING|sun-appserver9.1|javax.ee.enterprise.system.tools.admin|_ThreadID=42;_ThreadName=event-handler-0;web-01;_RequestID=cee91dde-4a7c-4a7d-900c-75e9e8c69228;|EEADM0068: Instance web-01 is not reachable.|#]

[#|2008-07-22T12:50:54.673+1000|INFO|sun-appserver9.1|javax.ee.enterprise.system.tools.synchronization|_ThreadID=44;_ThreadName=RMI TCP Connection(11)-192.168.100.31;Tue Jul 22 12:50:54 EST 2008;na-vm1;|SYNC005: Received synchronization request at time [Tue Jul 22 12:50:54 EST 2008] from [na-vm1].|#]

I suspect that we are now hitting some other configuration issue that will require further
troubleshooting - it may be a symptom of the numerous configuration changes that we
have made along the way in order to track down the original issue.
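
The `No route to host` text in EEADM0221 normally corresponds to the operating system error EHOSTUNREACH on a connection attempt. One way to see which host and port pairs produce it is to try plain TCP connections from the DAS machine. The sketch below is only an illustration in Python: `probe` is a hypothetical helper, and the host names and ports to test are assumptions that have to come from your own configuration. On Red Hat style distributions a firewall rule that rejects with icmp-host-prohibited can surface as this same error, so it may be worth comparing results with the firewall on and off.

```python
import errno
import socket

def probe(host, port, timeout=5.0):
    """Try one TCP connect and return a short description of the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except socket.timeout:
        return "timed out"
    except OSError as exc:
        if exc.errno == errno.EHOSTUNREACH:
            return "no route to host"
        if exc.errno == errno.ECONNREFUSED:
            return "connection refused"
        return "failed: %s" % exc

# Example call (host and port are placeholders, not values from this thread):
#   print(probe("node-agent-host", 4848))
```

A result of "connection refused" means the host was reached but nothing listens on that port, which points to a different problem than "no route to host".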

I will try once more with clean VMs tomorrow.

Thanks.
G.

gussie
Offline
Joined: 2008-07-16

Well, it looks as though we will have to move to plan-b.

After trying the following versions of Glassfish:
V2 UR2 b04 Promoted Build
Build 24d 01-April-08
Build 41 10-July-08

We currently cannot get past the “No route to host” errors.
The node agent manages to pull 2 out of the nine war files in the cluster and times
out after 20 minutes or so.

As a prelude to plan-b we have deployed our nine war files into a small cluster using
the farm capabilities in JBoss on several nodes without any problems – yet ;)

Thanks for helping get through some of the issues. We may check out other
Glassfish versions in the future.

Manfred Riem

Hi there,

It seems that you have nailed down the problem. We now need a Glassfish engineer
to answer the question about the deployment size.

If there is a maximum, the entire Glassfish user base needs to know about it.
Eduardo: can you have someone shed some light on it?

About your java.net user account: see http://www.java.net/ and request a password
reminder, which should be sent to your primary email account ;).

Manfred

-----Original Message-----
From: glassfish@javadesktop.org [mailto:glassfish@javadesktop.org]
Sent: Sunday, July 20, 2008 11:20 PM
To: users@glassfish.dev.java.net
Subject: Re: RE: RE: RE: RE: RE: RE: RE: Cluster SynchronizationException

I may have narrowed down the problem or at least identified the steps to reproduce it.

The OutOfMemoryError occurs when starting a new node instance in a cluster
where the total size of the deployed artefacts (in this case several war files) exceeds
30MB.

I have executed the following steps several times and consistently see OutOfMemoryError:

1. From the DAS console, create a new cluster.
2. Create a new node agent on a different server from the DAS. Don't start it yet.
3. From the DAS console or asadmin, add a new instance to the cluster on the
node agent created in step 2.
4. Using asadmin, deploy over 30MB worth of war files into the cluster.
5. Start the node agent.
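
To see on which side of the roughly 30MB mark a set of archives falls before running step 4, the sizes can be added up first. This is a small Python sketch; the 30MB figure is only the point observed in this thread, not a documented Glassfish limit, and `deployment_size` and `report` are names invented for the example.

```python
from pathlib import Path

# Point at which failures were observed in this thread (not a documented limit).
THRESHOLD_BYTES = 30 * 1024 * 1024

def deployment_size(directory, pattern="*.war"):
    """Return (total_bytes, {file_name: bytes}) for archives in directory."""
    per_file = {
        p.name: p.stat().st_size for p in sorted(Path(directory).glob(pattern))
    }
    return sum(per_file.values()), per_file

def report(directory):
    """Print each archive size and return True if the total exceeds the mark."""
    total, per_file = deployment_size(directory)
    for name, size in per_file.items():
        print("%-30s %8.1f MB" % (name, size / 1048576))
    side = "over" if total > THRESHOLD_BYTES else "at or under"
    print("total %.1f MB, %s the 30MB mark" % (total / 1048576, side))
    return total > THRESHOLD_BYTES
```

Calling `report()` on the directory that holds the war files prints one line per archive and a total.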

I'm not sure if this is repeatable under MS Windows.

I have tried a few dummy deployments where the total size of war files was just under
30MB. The node agent started, created and synchronized the instance directory
successfully. I was also able to stop and start the instance successfully from the DAS.

When the total deployment size was just over 30MB the node agent failed to start with
an OutOfMemoryError.

So introducing a new node to an established cluster fails in this scenario or at least it does for me.

Testing this the other way around, by incrementally deploying several applications
into a cluster where the instances are already running, it is possible to go beyond
a total deployment size of 30MB. I have not yet tested the deployment of a single
30+MB war file.

I'm still not sure if this is causing the original “hanging synchronization” problem when
starting and stopping an instance. The reason why I say this is that I tested the case
where just over 30MB of war files was incrementally deployed into a node and then I
stopped and started the instance. It took a while but completed successfully.

However, when the total size of incrementally deployed war files greatly exceeds
30MB (say 50MB) the DAS hangs when starting the instance.

Some of the war files that we have are quite large – a result of transitive
dependencies in maven. We may be able to prune things down a bit, but if
there is a deployment size limitation, this just delays the problem.

Is there a limit to the total deployment size that you know of and if so is there
some configuration property that can be set somewhere to at least work around
the problem in the short term?

By the way, would you still like to see the configuration files?

Also I am new to the dev.net site and still finding my way around. I noticed in my
profile that I also have a dev.java.net email account name, but no way of accessing it.
Is this something that has to be activated?

Thanks for helping and being patient.
[Message sent by forum member 'gussie' (gussie)]

http://forums.java.net/jive/thread.jspa?messageID=287971

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
For additional commands, e-mail: users-help@glassfish.dev.java.net

Daniel Adelhardt

Hi,

The problem may be caused by the fact that the node agent spawns a separate
JVM for the sync process and uses the default -Xmx heap setting for that
JVM (I had similar issues in the past). If you specify a larger setting,
the issues should go away. So for syncing larger apps in a clustered or
distributed environment you should apply the following setting for each
node agent:

asadmin set domain.node-agent..property.INSTANCE-SYNC-JVM-OPTIONS="-Xmx256m"

Using -Xmx256m for a 256 MB maximum heap is just a suggestion over the JVM
default. In my deployments that was sufficient to also sync apps with
.ear files over 50 MB.
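
With several node agents, the same setting has to be applied once per agent. The Python sketch below only builds the command lines so they can be reviewed before running them. It assumes the node agent name goes in the segment between `node-agent.` and `.property` of the dotted name, which is left empty in the command as quoted above; the agent names used here are hypothetical, and connection options such as user or host are omitted.

```python
import shlex

def sync_heap_command(node_agent, max_heap="256m"):
    """Build the asadmin argument list for one node agent.

    Assumption: the agent name is the segment after "node-agent." in the
    dotted name. Check the dotted names in your own domain before use.
    """
    dotted = "domain.node-agent.%s.property.INSTANCE-SYNC-JVM-OPTIONS" % node_agent
    return ["asadmin", "set", "%s=-Xmx%s" % (dotted, max_heap)]

# Hypothetical node agent names, for illustration only.
for agent in ["agent-a", "agent-b"]:
    print(" ".join(shlex.quote(arg) for arg in sync_heap_command(agent)))
```

Returning an argument list instead of a single string keeps the value free of shell quoting problems if the list is later passed to `subprocess.run`.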

Please have a look at:
http://docs.sun.com/app/docs/doc/819-3679/abdkk?a=view

Daniel

--
========================================================================
Daniel Adelhardt Tel: +49 89 460082443
Software Architect Mobile: +49 172 8417283
Sun Microsystems GmbH Email: daniel.adelhardt@sun.com
Sonnenallee 1
D-85551 Kirchheim-Heimstetten

Manfred Riem

Hi Daniel,

Thanks, now people can find it more easily ;)

Manfred

gussie
Offline
Joined: 2008-07-16

I have just installed version 2.1-b24d A.K.A "Sun Java System Application Server 9.1.1 (build b24d-fcs)" (DAS and Node Agents) and get the same exception:

Exception in thread "sync-1" java.lang.OutOfMemoryError: Java heap space
    at sun.net.www.http.ChunkedInputStream.processRaw(ChunkedInputStream.java:336)

I am now also hosting each server instance in the same VMware Server, i.e. on the same guest.

Not sure what to do at this point.

G.

gussie
Offline
Joined: 2008-07-16

DNS appears to be configured correctly on all nodes.
Forward and reverse lookup is correct on each, as is the hostname.
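
For readers who want to repeat the forward and reverse lookup check on each machine, here is a minimal Python sketch. `check_name` is a hypothetical helper; it uses the system resolver through the standard `socket` module, so it reflects /etc/hosts and DNS in whatever order the machine is configured to use them. The resolver functions are passed in as parameters only so the logic can be exercised without a network.

```python
import socket

def check_name(hostname, forward=socket.gethostbyname_ex,
               reverse=socket.gethostbyaddr):
    """Forward-resolve hostname, then reverse-resolve every address returned.

    Returns a list of (address, reverse_name, matches) tuples, where matches
    is True when the reverse name equals the hostname that was looked up.
    """
    _, _, addresses = forward(hostname)
    results = []
    for address in addresses:
        try:
            reverse_name = reverse(address)[0]
        except OSError:  # covers socket.herror and socket.gaierror
            reverse_name = None
        results.append((address, reverse_name, reverse_name == hostname))
    return results

# Example call, run on the DAS and on every node:
#   for row in check_name(socket.gethostname()):
#       print(row)
```

A reverse name that comes back fully qualified while the short name was looked up shows as a mismatch here, so a False in the last column needs a second look before it is treated as a fault.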

The SynchronizationMain process on the Node Agent side has established a connection with the DAS:

tcp 206656 0 ::ffff:192.168.100.31:33078 ::ffff:192.168.100.33:4848 ESTABLISHED 3598/java

These addresses are correct.
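
Assuming the line above comes from Linux `netstat` with the PID column enabled, the second and third columns are Recv-Q and Send-Q. For an established socket, Recv-Q counts bytes that have arrived but that the owning process has not yet read, so a large value there suggests the DAS is sending data while the sync JVM is not consuming it. The Python sketch below only splits such a line into named fields; `parse_netstat_line` is a name made up for this example.

```python
def parse_netstat_line(line):
    """Split one netstat line (with PID/program column) into named fields."""
    proto, recv_q, send_q, local, remote, state, owner = line.split(None, 6)
    pid, _, program = owner.partition("/")
    return {
        "proto": proto,
        "recv_q": int(recv_q),   # bytes received but not yet read by the process
        "send_q": int(send_q),   # bytes sent but not yet acknowledged by the peer
        "local": local,
        "remote": remote,
        "state": state,
        "pid": int(pid),
        "program": program,
    }

# The connection quoted in this post.
line = ("tcp 206656 0 ::ffff:192.168.100.31:33078 "
        "::ffff:192.168.100.33:4848 ESTABLISHED 3598/java")
conn = parse_netstat_line(line)
if conn["state"] == "ESTABLISHED" and conn["recv_q"] > 0:
    print("%d bytes received but not yet read by pid %d"
          % (conn["recv_q"], conn["pid"]))
```

Sampling the same connection a few seconds apart shows whether Recv-Q is draining slowly or is stuck at the same value.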

Is there a way to debug this or set finer logging levels to figure out what is causing it to block?

Thanks.

gussie
Offline
Joined: 2008-07-16

Manfred,

Thanks for the quick response. I will look into this and report back.

G.

Manfred Riem

Most of the issues I have experienced were related to a faulty
DNS setup. Have you checked that on the DAS and the nodes to be sure?

Manfred

gussie
Offline
Joined: 2008-07-16

I have taken DNS out of the equation altogether by:
1. adding entries for each server in /etc/hosts.
2. stopping nscd
3. moving /etc/resolv.conf out of the way.
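
When name resolution comes only from /etc/hosts, one thing worth checking is whether a machine's own host name is mapped to a loopback address, since the loopback address issues mentioned at the start of this thread usually involve a server advertising 127.0.0.1 to its peers. The Python sketch below scans hosts-file text for that case. `loopback_names` is a hypothetical helper and the host names in the sample are invented; only the two 192.168.100.x addresses appear in this thread.

```python
import ipaddress

def loopback_names(hosts_text, ignore=("localhost", "localhost.localdomain")):
    """Return host names that a hosts-file text maps to a loopback address."""
    flagged = []
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        try:
            is_loopback = ipaddress.ip_address(address).is_loopback
        except ValueError:
            continue  # first column is not an IP address
        if is_loopback:
            flagged.extend(n for n in names if n not in ignore)
    return flagged

# Invented host names; the first line shows the pattern to look for.
sample = """\
127.0.0.1       localhost.localdomain localhost das-vm
192.168.100.33  das-vm
192.168.100.31  na-vm1
"""
print(loopback_names(sample))
```

On a real machine the text would come from reading /etc/hosts, and any name in the result that is also the machine's own host name deserves attention.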

The SynchronizationMain has established a connection to the DAS but still hangs.

Is there anything else that could possibly cause this to hang?

Thanks Again.

Manfred Riem

I know this is silly, but have you tried to do it using IPv4?

Manfred

gussie
Offline
Joined: 2008-07-16

Not silly at all.

I have turned off IPv6 support on all servers. The SynchronizationMain establishes a connection with the DAS, but still hangs:

tcp 208312 0 192.168.100.31:11777 192.168.100.33:4848 ESTABLISHED 3691/java

After shutting down the NodeAgent I also deleted the instance directory to force a complete rebuild (probably not a good move).

This time I get the following exception:

Exception in thread "sync-1" java.lang.OutOfMemoryError: Java heap space
    at sun.net.www.http.ChunkedInputStream.processRaw(ChunkedInputStream.java:336)
    at sun.net.www.http.ChunkedInputStream.readAheadNonBlocking(ChunkedInputStream.java:493)
    at sun.net.www.http.ChunkedInputStream.readAhead(ChunkedInputStream.java:584)
    at sun.net.www.http.ChunkedInputStream.hurry(ChunkedInputStream.java:734)
    at sun.net.www.http.ChunkedInputStream.closeUnderlying(ChunkedInputStream.java:198)
    at sun.net.www.http.ChunkedInputStream.close(ChunkedInputStream.java:715)
    at java.io.FilterInputStream.close(FilterInputStream.java:155)
    at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.close(HttpURLConnection.java:2458)
    at java.io.BufferedInputStream.close(BufferedInputStream.java:451)
    at java.io.FilterInputStream.close(FilterInputStream.java:155)
    at com.sun.enterprise.ee.synchronization.http.HttpUnzipper.writeZip(HttpUnzipper.java:122)
    at com.sun.enterprise.ee.synchronization.http.HttpGetCommand.execute(HttpGetCommand.java:126)
    at com.sun.enterprise.ee.synchronization.BaseRequestMediator.execute(BaseRequestMediator.java:90)
    at com.sun.enterprise.ee.synchronization.BaseRequestMediator.run(BaseRequestMediator.java:107)
    at java.lang.Thread.run(Thread.java:619)

The max heap for the domain is set to 768MB. There is 2GB allocated to the VM.
Does this look reasonable or should I increase the heap further?

I think the next thing to try is to start with a clean install of the DAS and node agent.

Would there be any merit in moving to a 2.1 build?

Thanks for taking the time.

Manfred Riem

Hi there,

Which version are you running exactly?

Manfred

gussie
Offline
Joined: 2008-07-16

Hi Manfred,

We are running "Sun Java System Application Server 9.1_02 (build b04-fcs)"

Thanks
G.

Manfred Riem

How much memory did you allocate for your VMWare server instance?

-----Original Message-----
From: glassfish@javadesktop.org [mailto:glassfish@javadesktop.org]
Sent: Thursday, July 17, 2008 5:26 PM
To: users@glassfish.dev.java.net
Subject: Re: RE: RE: RE: Cluster SynchronizationException

Hi Manfred,

We are running "Sun Java System Application Server 9.1_02 (build b04-fcs)"

Thanks
G.
[Message sent by forum member 'gussie' (gussie)]

http://forums.java.net/jive/thread.jspa?messageID=287435

gussie
Offline
Joined: 2008-07-16

2GB

Manfred Riem

Can you send me your domain.xml and asenv.conf for the DAS, the
das.properties, nodeagent.properties and domain.xml for the
node agent, and the domain.xml for the nodes? I am curious
what they look like, and I can compare them with a working
setup.

Manfred

-----Original Message-----
From: glassfish@javadesktop.org [mailto:glassfish@javadesktop.org]
Sent: Thursday, July 17, 2008 11:13 PM
To: users@glassfish.dev.java.net
Subject: Re: RE: RE: RE: RE: Cluster SynchronizationException

2GB
[Message sent by forum member 'gussie' (gussie)]

http://forums.java.net/jive/thread.jspa?messageID=287466

gussie
Offline
Joined: 2008-07-16

Sure. Can I email this to you?

Manfred Riem

Yes

-----Original Message-----
From: glassfish@javadesktop.org [mailto:glassfish@javadesktop.org]
Sent: Friday, July 18, 2008 1:29 AM
To: users@glassfish.dev.java.net
Subject: Re: RE: RE: RE: RE: RE: Cluster SynchronizationException

Sure. Can I email this to you?
[Message sent by forum member 'gussie' (gussie)]

http://forums.java.net/jive/thread.jspa?messageID=287498

gussie
Offline
Joined: 2008-07-16

Ok. Which address should I send it to?

Thanks.

Manfred Riem

mriem@dev.java.net

-----Original Message-----
From: glassfish@javadesktop.org [mailto:glassfish@javadesktop.org]
Sent: Sunday, July 20, 2008 6:07 PM
To: users@glassfish.dev.java.net
Subject: Re: RE: RE: RE: RE: RE: RE: Cluster SynchronizationException

Ok. Which address should I send it to?

Thanks.
[Message sent by forum member 'gussie' (gussie)]

http://forums.java.net/jive/thread.jspa?messageID=287943

gussie
Offline
Joined: 2008-07-16

I may have narrowed down the problem or at least identified the steps to reproduce it.

The OutOfMemoryError occurs when starting a new node instance in a cluster
where the total size of the deployed artefacts (in this case several war files) exceeds
30MB.

I have executed the following steps several times and consistently see OutOfMemoryError:

1. From the DAS console, create a new cluster.
2. Create a new node agent on a different server from the DAS. Don't start it yet.
3. From the DAS console or asadmin, add a new instance to the cluster on the
node agent created in step 2.
4. Using asadmin, deploy over 30MB worth of war files into the cluster.
5. Start the node agent.

I'm not sure if this is repeatable under MS Windows.

I have tried a few dummy deployments where the total size of war files was just under
30MB. The node agent started, created and synchronized the instance directory
successfully. I was also able to stop and start the instance successfully from the DAS.

When the total deployment size was just over 30MB the node agent failed to start with
an OutOfMemoryError.
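
To check a set of war files against the roughly 30MB mark before deploying, their sizes can be summed up front. This is a minimal sketch under stated assumptions: the class name WarSizeCheck and the directory argument are made up for illustration, 30MB is taken as 30 * 1024 * 1024 bytes, and the threshold is only what was observed in these tests, not a documented GlassFish limit.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class WarSizeCheck {
    // Size at which the failures were observed here; an assumption, not a documented limit.
    static final long OBSERVED_THRESHOLD_BYTES = 30L * 1024L * 1024L;

    /** Sums the sizes of all *.war files directly inside dir (no recursion). */
    static long totalWarBytes(Path dir) throws IOException {
        long total = 0;
        try (DirectoryStream<Path> wars = Files.newDirectoryStream(dir, "*.war")) {
            for (Path war : wars) {
                total += Files.size(war);
            }
        }
        return total;
    }

    /** True when the total is strictly above the observed threshold. */
    static boolean exceedsObservedThreshold(long totalBytes) {
        return totalBytes > OBSERVED_THRESHOLD_BYTES;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default: scan the current directory unless one is given.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        long total = totalWarBytes(dir);
        System.out.printf("%d bytes of war files in %s (%s the 30MB mark)%n",
                total, dir.toAbsolutePath(),
                exceedsObservedThreshold(total) ? "over" : "at or under");
    }
}
```

It uses java.nio.file, so it needs Java 7 or later and would be run on a workstation rather than on the JDK 1.6 servers described above.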

So introducing a new node to an established cluster fails in this scenario, or at least it does for me.

Testing this the other way around, by incrementally deploying several applications
into a cluster where the instances are already running, it is possible to go beyond
a total deployment size of 30MB. I have not yet tested the deployment of a single 30+MB
war file.

I'm still not sure if this is causing the original “hanging synchronization” problem when
starting and stopping an instance. The reason why I say this is that I tested the case
where just over 30MB of war files was incrementally deployed into a node and then I
stopped and started the instance. It took a while but completed successfully.

However, when the total size of incrementally deployed war files greatly exceeds
30MB (say 50MB) the DAS hangs when starting the instance.

Some of the war files that we have are quite large – a result of transitive
dependencies in maven. We may be able to prune things down a bit, but if
there is a deployment size limitation, this just delays the problem.

Is there a limit to the total deployment size that you know of and if so is there
some configuration property that can be set somewhere to at least work around
the problem in the short term?

By the way, would you still like to see the configuration files?

Also, I am new to the dev.java.net site and still finding my way around. I noticed in my
profile that I also have a dev.java.net email account name, but no way of accessing it.
Is this something that has to be activated?

Thanks for helping and being patient.

Daniel Adelhardt

Hi,

There is a config option to specify heap settings for the node agents
when syncing large apps. Have a look at the docs at
http://docs.sun.com/app/docs/doc/819-3679/abdkk?a=view and try an
-Xmx setting of 128m or 256m for the sync process.

Let us know if that helps

Daniel

--
========================================================================
Daniel Adelhardt Tel: +49 89 460082443
Software Architect Mobile: +49 172 8417283
Sun Microsystems GmbH Email: daniel.adelhardt@sun.com
Sonnenallee 1
D-85551 Kirchheim-Heimstetten
