ORB not listening on second server/instance in cluster

6 replies [Last post]
sgjava
Offline
Joined: 2005-07-05
Points: 0

I'm using GlassFish 3.1.2.2 with 2 servers, 2 nodes and 2 instances, and have deployed a simple MDB to the cluster (conventional cluster with master broker). I can access the ORB on both server1 and server2 via port 23700. If I shut down server1 and then try to access server2, I get "NAM1007 : Problem with membership change notification. Exception occurred : java.lang.NullPointerException". Looking at the ports on server2, 23700 is not listening. I had to deploy an EAR with an MDB to get server1 listening on the ORB ports.

It appears that shutting down the master broker gracefully also shuts down the ORB listeners of the other instances. What happens if you want to do rolling maintenance?

Second scenario: I powered off the master broker without shutting GlassFish down, and the ORB port stayed up on server2, but after the message was delivered it was not removed from the queue. Stand-alone client config:

System.setProperty("com.sun.appserv.iiop.endpoints","glassfish1.corp.local:23700,glassfish2.corp.local:23700");
System.setProperty("com.sun.iiop.loadbalancingpolicy", "ic-based");
System.setProperty("com.sun.corba.ee.transport.ORBWaitForResponseTimeout", "5000");
System.setProperty("com.sun.corba.ee.transport.ORBTCPConnectTimeouts","100:500:100:500");
System.setProperty("com.sun.corba.ee.transport.ORBTCPTimeouts","500:2000:50:1000");
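
Since the symptom described above is that port 23700 stops listening on one server, a small pre-flight check in the stand-alone client can show which ORB endpoints are actually reachable before the JNDI lookup is attempted. The following is only a sketch: the class name EndpointProbe and its methods are hypothetical helpers, not part of GlassFish, and it only tests that a TCP connection can be opened, not that the ORB behind the port is healthy. It is written without Java 7 features so it should also compile on JDK 6.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical pre-flight helper (not a GlassFish API). */
class EndpointProbe {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean isListening(String host, int port, int timeoutMs) {
        Socket socket = new Socket();
        try {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host could not be resolved.
            return false;
        } finally {
            try {
                socket.close();
            } catch (IOException ignored) {
                // Nothing useful to do if close fails.
            }
        }
    }

    /** Probes every "host:port" entry of a comma-separated endpoint list. */
    static Map<String, Boolean> probe(String endpoints, int timeoutMs) {
        Map<String, Boolean> result = new LinkedHashMap<String, Boolean>();
        for (String entry : endpoints.split(",")) {
            String trimmed = entry.trim();
            int colon = trimmed.lastIndexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("Expected host:port but got: " + trimmed);
            }
            String host = trimmed.substring(0, colon);
            int port = Integer.parseInt(trimmed.substring(colon + 1));
            result.put(trimmed, Boolean.valueOf(isListening(host, port, timeoutMs)));
        }
        return result;
    }

    public static void main(String[] args) {
        // Same endpoint list as the client configuration in this post.
        String endpoints = "glassfish1.corp.local:23700,glassfish2.corp.local:23700";
        for (Map.Entry<String, Boolean> e : probe(endpoints, 500).entrySet()) {
            System.out.println(e.getKey() + " listening=" + e.getValue());
        }
    }
}
```

Running this before and after stopping server1 would show whether the listener on server2 really disappears, independently of the naming code that throws NAM1007.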

hvilekar
Offline
Joined: 2006-10-06
Points: 0

Could you please include the steps used to reproduce the issue? The console output of the sequence of asadmin commands used to create the cluster, deploy the app and then shut down the instance, along with any other steps used, would probably help.

Thanks,

Harshad

sgjava
Offline
Joined: 2005-07-05
Points: 0

CentOS 6.3 x86_64, Oracle JDK 6u35 (glassfish1 is master broker):

glassfish1:

ssh
sudo su -
/etc/init.d/iptables save
/etc/init.d/iptables stop
chkconfig iptables off
nano /etc/hosts
Add glassfish1 and glassfish2 hosts
adduser glassfish
passwd glassfish
chown -Rf glassfish:glassfish /opt/glassfish3
chmod 755 /etc/init.d/glassfish
chkconfig --add glassfish
chkconfig --level 234 glassfish on
chkconfig --list glassfish
exit
exit

glassfish2:

ssh
sudo su -
/etc/init.d/iptables save
/etc/init.d/iptables stop
chkconfig iptables off
nano /etc/hosts
Add glassfish1 and glassfish2 hosts
adduser glassfish
passwd glassfish
chown -Rf glassfish:glassfish /opt/glassfish3
chmod 755 /etc/init.d/glassfish
chkconfig --add glassfish
chkconfig --level 234 glassfish on
chkconfig --list glassfish
exit
exit
ssh
cd /opt/glassfish3/bin
./asadmin
start-domain domain1
change-admin-password --domain_name domain1
enable-secure-admin --port 4848
restart-domain domain1

glassfish1:

ssh
cd /opt/glassfish3/bin
./asadmin
setup-ssh glassfish1.corp.local glassfish2.corp.local
start-domain domain1
change-admin-password --domain_name domain1
enable-secure-admin --port 4848
restart-domain domain1
create-cluster mascluster
create-node-ssh --nodehost glassfish1.corp.local --installdir /opt/glassfish3 glassfish1
create-node-ssh --nodehost glassfish2.corp.local --installdir /opt/glassfish3 glassfish2
configure-jms-cluster --clustertype=conventional --configstoretype=masterbroker mascluster
create-instance --node glassfish1 --cluster mascluster masinstance1
create-instance --node glassfish2 --cluster mascluster masinstance2
start-cluster mascluster
list-nodes-ssh
exit
su -s /bin/sh -l glassfish -c "/opt/glassfish3/mq/bin/imqcmd -b glassfish1.corp.local:27676 list bkr"

Add connection factories to cluster
Add destinations to cluster
Deploy ear with test MDB via UI to cluster
Test ORB at 23700 and JMS mapper at 27676 on both servers

/etc/init.d script (for the master; the script on the second instance is the same except for the start/stop cluster commands):

#!/bin/bash
#
# Startup script for GlassFish 3.1.2.2
#
# chkconfig: 234 80 20
# description: GlassFish Server Open Source Edition
# processname: glassfish

### BEGIN INIT INFO
# Provides: glassfish
# Required-Start: $network $syslog
# Required-Stop: $network $syslog
# Default-Start:
# Default-Stop:
# Short-Description: GlassFish Server
# Description: GlassFish Server Open Source Edition
### END INIT INFO

export JAVA_HOME=/usr/java/default
GLASSFISH_HOME=/opt/glassfish3/glassfish

case "$1" in
    start)
        if [[ -z $(/sbin/pidof java) ]]; then
            echo "Starting GlassFish"
            /bin/su -s /bin/sh -l glassfish -c "$GLASSFISH_HOME/bin/asadmin start-domain domain1"
            /bin/su -s /bin/sh -l glassfish -c "$GLASSFISH_HOME/bin/asadmin --user admin --passwordfile=/opt/glassfish3/glassfish/domains/domain1/config/admin-password.txt start-cluster mascluster"
            touch /var/lock/subsys/glassfish
        else
            echo "GlassFish already running"
        fi
        ;;
    stop)
        if [[ -n $(/sbin/pidof java) ]]; then
            echo "Stopping GlassFish"
            # Note: stop-cluster stops every instance in the cluster, not just the local one
            /bin/su -s /bin/sh -l glassfish -c "$GLASSFISH_HOME/bin/asadmin --user admin --passwordfile=/opt/glassfish3/glassfish/domains/domain1/config/admin-password.txt stop-cluster mascluster"
            /bin/su -s /bin/sh -l glassfish -c "$GLASSFISH_HOME/bin/asadmin stop-domain domain1"
            # Wait for all Java processes to exit; sleep between checks instead of spinning the CPU
            until [[ -z $(/sbin/pidof java) ]]; do sleep 1; done
            rm -f /var/lock/subsys/glassfish
        else
            echo "GlassFish not running"
        fi
        ;;
    *)
        echo "Usage: $0 {start|stop}"
        exit 1
        ;;
esac

exit 0

First scenario: shut down the glassfish1 master broker with the script above. This kills the ORB listeners on glassfish2.

Second scenario: power off the VM without stopping the master broker. The ORB is still running on glassfish2. The connection takes a while and the message is sent to the queue, but the MDB never picks it up.

If both instances are running everything works as expected.

hvilekar
Offline
Joined: 2006-10-06
Points: 0

> First scenario, shut down glassfish1 master broker with script above.
> This kills the ORB listeners on glassfish2.

The stop) branch of the script calls "stop-cluster mascluster". That stops both instances of the cluster and shuts down the ORB listeners for both of them. This is expected behavior.

sgjava
Offline
Joined: 2005-07-05
Points: 0

I'm good with that, but what about a hard power-down of the master broker? Shouldn't it fail over to glassfish2?

hvilekar
Offline
Joined: 2006-10-06
Points: 0

Before powering down the glassfish1 VM, can you check whether things work OK with each of the following options (one at a time):

1. stop-instance masinstance1 or

2. kill the masinstance1 process (kill -9 <pid>)

Thanks,

Harshad

sgjava
Offline
Joined: 2005-07-05
Points: 0

OK, below are the results using the stand-alone ORB client and the stand-alone direct JMS client. I deleted and re-created the instances in the UI and ran my tests again. I guess I need a peer cluster, because fail-over isn't working if the master broker/master-config server is lost.

masinstance1 start, masinstance2 start (works as expected):

14:15:28.579 [main] DEBUG com.bhn.services.jms.ClusterTest - jndiLookup
14:15:41.203 [main] DEBUG com.bhn.services.jms.ClusterTest - waitForEmptyQueue
14:15:41.436 [main] DEBUG com.bhn.services.jms.ClusterTest - waitForEmptyTopic
14:15:41.446 [main] DEBUG com.bhn.services.jms.ClusterTest - Topic message received: Test message
14:15:41.464 [main] DEBUG com.bhn.services.jms.ClusterTest - directJms
14:15:46.935 [main] DEBUG com.bhn.services.jms.ClusterTest - waitForEmptyQueue
14:15:47.289 [main] DEBUG com.bhn.services.jms.ClusterTest - waitForEmptyTopic
14:15:47.361 [main] DEBUG com.bhn.services.jms.ClusterTest - Topic message received: Test message

masinstance1 stop, masinstance2 stop, cluster up (works as expected):

14:28:56.824 [main] DEBUG com.bhn.services.jms.ClusterTest - jndiLookup
14:29:16.875 [main] DEBUG com.bhn.services.jms.ClusterTest - waitForEmptyQueue

13:46:13.007 [main] DEBUG com.bhn.services.jms.ClusterTest - directJms
13:46:13.121 [main] WARN javax.jms - [C4003]: Error occurred on connection creation [glassfish1.corp.local:27676]. - cause: java.net.ConnectException: Connection refused
13:46:16.126 [main] WARN javax.jms - [C4003]: Error occurred on connection creation [glassfish2.corp.local:27676]. - cause: java.net.ConnectException: Connection refused

13:46:16.164 [main] DEBUG com.bhn.services.jms.ClusterTest - jndiLookup
13:46:21.280 [main] ERROR j.e.s.c.n.c.s.enterprise.naming.impl - NAM1007 : Problem with membership change notification. Exception occurred : java.lang.NullPointerException

masinstance1 stop, masinstance2 start (works as expected):

14:10:03.483 [main] DEBUG com.bhn.services.jms.ClusterTest - directJms
14:10:03.611 [main] WARN javax.jms - [C4003]: Error occurred on connection creation [glassfish1.corp.local:27676]. - cause: java.net.ConnectException: Connection refused
14:10:06.806 [main] WARN javax.jms - [C4003]: Error occurred on connection creation [glassfish1.corp.local:27676]. - cause: java.net.ConnectException: Connection refused
14:10:09.930 [main] DEBUG com.bhn.services.jms.ClusterTest - waitForEmptyQueue
14:10:10.196 [main] DEBUG com.bhn.services.jms.ClusterTest - waitForEmptyTopic
14:10:10.211 [main] DEBUG com.bhn.services.jms.ClusterTest - Topic message received: Test message
14:10:10.249 [main] DEBUG com.bhn.services.jms.ClusterTest - jndiLookup
14:10:26.621 [main] DEBUG com.bhn.services.jms.ClusterTest - waitForEmptyQueue
14:10:26.843 [main] DEBUG com.bhn.services.jms.ClusterTest - waitForEmptyTopic
14:10:26.853 [main] DEBUG com.bhn.services.jms.ClusterTest - Topic message received: Test message

masinstance1 start, masinstance2 stop (works as expected):

14:18:24.225 [main] DEBUG com.bhn.services.jms.ClusterTest - directJms
14:18:28.233 [main] DEBUG com.bhn.services.jms.ClusterTest - waitForEmptyQueue
14:18:29.306 [main] DEBUG com.bhn.services.jms.ClusterTest - waitForEmptyTopic
14:18:29.332 [main] DEBUG com.bhn.services.jms.ClusterTest - Topic message received: Test message
14:18:29.350 [main] DEBUG com.bhn.services.jms.ClusterTest - jndiLookup
14:18:44.888 [main] DEBUG com.bhn.services.jms.ClusterTest - waitForEmptyQueue
14:18:45.146 [main] DEBUG com.bhn.services.jms.ClusterTest - waitForEmptyTopic
14:18:45.153 [main] DEBUG com.bhn.services.jms.ClusterTest - Topic message received: Test message

masinstance1 start, masinstance2 start, hard power off of the master glassfish1 VM (hangs waiting for the queue to empty, so the message is not processed by MDB onMessage, but the ORB-based lookup worked):

14:28:56.824 [main] DEBUG com.bhn.services.jms.ClusterTest - jndiLookup
14:29:16.875 [main] DEBUG com.bhn.services.jms.ClusterTest - waitForEmptyQueue
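
Because the test hangs in waitForEmptyQueue in this last scenario, it may help to put a deadline on that wait so the run fails with a clear result instead of blocking forever. The sketch below is an assumption about how such a wait could be structured, not the actual ClusterTest code: BoundedWait and the QueueDepth callback are hypothetical names, and how the queue depth is obtained (for example through a JMS QueueBrowser) is left to the caller. It avoids Java 7 features so it should compile on JDK 6.

```java
/** Hypothetical helper for bounding a "wait until the queue is empty" loop. */
class BoundedWait {

    /** Hypothetical callback that reports how many messages are still queued. */
    interface QueueDepth {
        int current();
    }

    /**
     * Polls the queue depth until it reaches zero or the timeout expires.
     * Returns true if the queue drained in time, false otherwise.
     */
    static boolean waitForEmpty(QueueDepth depth, long timeoutMs, long pollMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (true) {
            if (depth.current() == 0) {
                return true;
            }
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Simulated queue that never drains, standing in for the MDB not consuming.
        QueueDepth stuck = new QueueDepth() {
            public int current() {
                return 1;
            }
        };
        boolean drained = waitForEmpty(stuck, 2000, 100);
        System.out.println("queue drained in time: " + drained);
    }
}
```

With a bounded wait, the hard power-off case would show up as an explicit "not drained" result with a timestamp, which makes it easier to compare against the broker logs on glassfish2.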