Hi All,
Jay SenSharma

Multicasting is the technique of delivering a message to a group of destination nodes (servers, JVMs, or any other network-based software) in a single transmission. In other words, a node broadcasts a piece of information to all of its peers. IPv4 multicast addresses fall in the range 224.0.0.0 to 239.255.255.255.
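As a quick sanity check before putting an address into a cluster configuration, here is a minimal Python sketch that tests whether a string is a multicast address. It relies only on the standard library `ipaddress` module; the function name `is_multicast_address` and the sample addresses are just illustrations, not values taken from any WebLogic configuration.

```python
import ipaddress


def is_multicast_address(address: str) -> bool:
    """Return True if `address` is an IP multicast address (IPv4: 224.0.0.0/4)."""
    return ipaddress.ip_address(address).is_multicast


# Illustrative addresses only; replace them with the ones from your own cluster.
for candidate in ("239.192.0.0", "224.0.0.1", "192.168.1.10"):
    print(candidate, is_multicast_address(candidate))
```

An address that fails this check can never work as a cluster multicast address, so this rules out the simplest form of misconfiguration.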
Application server clustering is one of the best-known uses of multicasting. In the middleware world, a cluster is a logical entity in which many members (servers) work together to provide load balancing (load sharing), failover (reliability), and scalability for our applications.
Cluster members usually communicate with each other in the following two ways:

1). IP Multicasting (One to Many):

In this technique every node of a cluster broadcasts some piece of data to all the other members of the same cluster. Using this technique the servers achieve two main goals:
a). Each node sends heartbeat messages to the other nodes of the cluster. This tells the other nodes that the sender is alive. The heartbeat broadcasts help the cluster master maintain the “Dynamic Server List” (a list of the servers that are currently alive).
b). IP multicasting is also used to replicate JNDI objects among all the members of the cluster. An object bound in the JNDI tree of a clustered node (server) is broadcast to the rest of the members of the cluster. This means the JNDI tree of a clustered server will be identical to that of the other members of the same cluster.
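To make the heartbeat idea concrete, here is a small Python sketch of how a list of live servers could be derived from heartbeat timestamps. This is not WebLogic's implementation. The class name, the 30-second timeout, and the server names are all assumptions chosen for illustration.

```python
import time


class DynamicServerList:
    """Tracks which members sent a heartbeat recently (illustrative only)."""

    def __init__(self, timeout_seconds: float = 30.0):
        # Hypothetical timeout: a member is considered dead after this much silence.
        self.timeout_seconds = timeout_seconds
        self._last_seen = {}

    def record_heartbeat(self, server_name, now=None):
        """Remember when a heartbeat from `server_name` was last received."""
        self._last_seen[server_name] = time.monotonic() if now is None else now

    def alive_servers(self, now=None):
        """Return the sorted names of members heard from within the timeout."""
        now = time.monotonic() if now is None else now
        return sorted(
            name
            for name, seen in self._last_seen.items()
            if now - seen <= self.timeout_seconds
        )


servers = DynamicServerList(timeout_seconds=30.0)
servers.record_heartbeat("MS1", now=0.0)
servers.record_heartbeat("MS2", now=25.0)
print(servers.alive_servers(now=40.0))  # MS1 has been silent for 40 seconds
```

The point of the sketch is that a member whose heartbeats stop arriving (for example because multicast packets are being dropped) silently falls out of the list, which is why multicast problems show up as servers leaving the cluster.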

2). IP Socketing (One to One):

This technique is widely used by middleware cluster members to access an object on any other node of the cluster. In particular, cluster members use it to replicate HttpSession data and EJB session objects.

Multicast Errors:

If we see multicast errors in the server logs, it means our cluster is not going to work as expected, and one or more nodes may be dropped from the cluster. The errors look something like this in the server logs:
<Error> <Cluster> <BEA-000110> <Multicast socket receive error: java.io.OptionalDataException
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1285)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:322)
at weblogic.cluster.MulticastManager.execute(MulticastManager.java:411)
at weblogic.kernel.ExecuteThread.execute(ExecuteThread.java:224)
at weblogic.kernel.ExecuteThread.run(ExecuteThread.java:183)
java.lang.OutOfMemoryError: PermGen space
<Error> <Cluster> <testWeb> <MS1> <[ACTIVE] ExecuteThread: '0' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <1264189488263> <BEA-000110> <Multicast socket receive error: java.lang.OutOfMemoryError: PermGen space
java.lang.OutOfMemoryError: PermGen space
at sun.misc.Unsafe.defineClass(Native Method)
at sun.reflect.ClassDefiner.defineClass(ClassDefiner.java:45)
at sun.reflect.MethodAccessorGenerator$1.run(MethodAccessorGenerator.java:381)
at java.security.AccessController.doPrivileged(Native Method)
at sun.reflect.MethodAccessorGenerator.generate(MethodAccessorGenerator.java:377)
at sun.reflect.MethodAccessorGenerator.generateSerializationConstructor(MethodAccessorGenerator.java:95)
at sun.reflect.ReflectionFactory.newConstructorForSerialization(ReflectionFactory.java:313)
at java.io.ObjectStreamClass.getSerializableConstructor(ObjectStreamClass.java:1299)
at java.io.ObjectStreamClass.access$1500(ObjectStreamClass.java:52)
<Error> <Cluster> <testDomain> <testServer> <ExecuteThread: '14' for queue: 'weblogic.kernel.Default'> <<WLS Kernel>> <BEA-000110> <Multicast socket receive error: java.io.StreamCorruptedException
at java.io.ObjectInputStream$BlockDataInputStream.readBlockHeader(ObjectInputStream.java:2347)
at java.io.ObjectInputStream$BlockDataInputStream.refill(ObjectInputStream.java:2380)
at java.io.ObjectInputStream$BlockDataInputStream.read(ObjectInputStream.java:2452)
at java.io.DataInputStream.readInt(DataInputStream.java:443)
at java.io.ObjectInputStream$BlockDataInputStream.readInt(ObjectInputStream.java:2657)
at java.io.ObjectInputStream.readInt(ObjectInputStream.java:900)
at weblogic.cluster.MulticastManager.execute(MulticastManager.java:387)
<Error> <Cluster> <BEA-000110> <Multicast socket receive error: java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:178)
at java.io.DataInputStream.readLong(DataInputStream.java:380)
at java.io.ObjectInputStream$BlockDataInputStream.readLong()J(Unknown Source)
at java.io.ObjectInputStream.readLong()J(Unknown Source)
at weblogic.cluster.HeartbeatMessage.readExternal(HeartbeatMessage.java:55)
at java.io.ObjectInputStream.readExternalData(Ljava.io.Externalizable;Ljava.io.ObjectStreamClass;)V(Unknown Source)
at java.io.ObjectInputStream.readOrdinaryObject(Z)Ljava.lang.Object;(Unknown Source)
<Error> <Cluster> <Multicast socket receive error : java.io.InterruptedIOException: Receive timed out
java.io.InterruptedIOException: Receive timed out
at java.net.PlainDatagramSocketImpl.receive(Native Method)
at java.net.PlainDatagramSocketImpl.receive(PlainDatagramSocketImpl.java:90)
at java.net.DatagramSocket.receive(DatagramSocket.java:404)
at weblogic.cluster.FragmentSocket.receive(FragmentSocket.java:145)
at weblogic.cluster.MulticastManager.execute(MulticastManager.java:298)
at weblogic.kernel.ExecuteThread.execute(ExecuteThread.java:139)
at weblogic.kernel.ExecuteThread.run(ExecuteThread.java:120)
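
When a log contains many of these entries, it helps to know which exception dominates. The following Python sketch tallies the exception class named in each “Multicast socket receive error” line. The regular expression is an assumption based only on the sample lines shown above (with or without the BEA-000110 code), and the function name is hypothetical.

```python
import re
from collections import Counter

# Matches the exception class that follows "Multicast socket receive error",
# allowing optional spaces around the colon as seen in the samples above.
MULTICAST_ERROR = re.compile(r"Multicast socket receive error\s*:\s*([\w.$]+)")


def count_multicast_errors(log_lines):
    """Return a Counter mapping exception class name to number of occurrences."""
    counts = Counter()
    for line in log_lines:
        match = MULTICAST_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


sample_lines = [
    "<Error> <Cluster> <BEA-000110> <Multicast socket receive error: java.io.EOFException",
    "at java.io.DataInputStream.readFully(DataInputStream.java:178)",
    "<Error> <Cluster> <Multicast socket receive error : java.io.InterruptedIOException: Receive timed out",
]
print(count_multicast_errors(sample_lines))
```

To run it against a real log, pass the open file object instead of `sample_lines`, since iterating over a file yields its lines.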

Most Likely Causes of Multicast Errors

Cause-1). Incorrect configuration of multicast addresses.
Cause-2). Too few file descriptors available.
Cause-3). Network fluctuation or interrupted network connectivity.
Cause-4). Multicast traffic blocked by firewall restrictions. Try disabling “iptables” as described at the bottom of the page.
Cause-5). Multicast timeouts. Increase the MulticastTTL.
Cause-6). More than one cluster on the same network using the same multicast address.
Cause-7). Incorrect use of operating system zoning, or multihoming issues. Multihoming means a single physical box with multiple NICs (multiple IP addresses).
Cause-8). A server's listen port is used as the multicast port.
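
Causes 6 and 8 are configuration mistakes that can be caught before the servers are even started. Below is a rough Python sketch that checks a hand-written inventory of clusters for both. The dictionary keys (`name`, `multicast_address`, `multicast_port`, `listen_ports`) and the sample values are hypothetical. They are not read from WebLogic's config.xml, so you would need to fill them in yourself. The sketch treats the address and port pair as the thing that must be unique.

```python
from collections import defaultdict


def find_multicast_conflicts(clusters):
    """Return human-readable problems found in a list of cluster descriptions."""
    problems = []
    names_by_endpoint = defaultdict(list)
    for cluster in clusters:
        endpoint = (cluster["multicast_address"], cluster["multicast_port"])
        names_by_endpoint[endpoint].append(cluster["name"])
        # Cause-8: a server listen port doubles as the multicast port.
        if cluster["multicast_port"] in cluster["listen_ports"]:
            problems.append(
                f"{cluster['name']}: multicast port {cluster['multicast_port']} "
                "is also a server listen port"
            )
    # Cause-6: two or more clusters share one multicast address and port.
    for (address, port), names in names_by_endpoint.items():
        if len(names) > 1:
            problems.append(f"{', '.join(names)}: share multicast address {address}:{port}")
    return problems


example_clusters = [
    {"name": "ClusterA", "multicast_address": "239.192.0.1", "multicast_port": 7777, "listen_ports": [7001, 7003]},
    {"name": "ClusterB", "multicast_address": "239.192.0.1", "multicast_port": 7777, "listen_ports": [8001]},
    {"name": "ClusterC", "multicast_address": "239.192.0.2", "multicast_port": 9001, "listen_ports": [9001]},
]
for problem in find_multicast_conflicts(example_clusters):
    print(problem)
```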

What Do We Need to Check While Debugging?

It is always recommended to first go through the following link: http://download.oracle.com/docs/cd/E12840_01/wls/docs103/cluster/multicast_configuration.html

Point-1). Check and make sure that all the clusters on the same network are using a unique multicast address and port.
Point-2). Using the “netstat” or “ping” commands, make sure that the multicast address and port are OK to use.
Point-3). Opening a socket or a file requires a file descriptor, so we need to make sure that enough file descriptors are available.
On Unix-based operating systems we can use the “lsof” command (list open files).
Example :   “lsof -p <WLS_PID> | wc -l”
Here WLS_PID is the process ID of WebLogic. To find the WebLogic process ID please refer to:  http://middlewaremagic.com/weblogic/?p=2291
Example: Suppose the WebLogic Server’s process ID is 4020; then run the following command:
[jaytest@jaytest bin]$ lsof -p 4020 | wc -l

[jaytest@jaytest bin]$ lsof -p 4020
java    4020 jaytest  cwd    DIR              253,4     4096 8657973 /NotBackedUp/WLS103/user_projects/domains/base_domain
java    4020 jaytest  rtd    DIR              253,1     4096       2 /
java    4020 jaytest  txt    REG              253,3    50810  133154 /home/jaytest/MyJdks/jdk1.6.0_21/bin/java
java    4020 jaytest  mem    REG              253,1   150672  542805 /lib64/ld-2.12.so
java    4020 jaytest  mem    REG              253,1  1838296  542806 /lib64/libc-2.12.so
java    4020 jaytest  mem    REG              253,1   145672  542818 /lib64/libpthread-2.12.so
java    4020 jaytest  mem    REG              253,1    22536  542808 /lib64/libdl-2.12.so
java    4020 jaytest  mem    REG              253,1   598816  542807 /lib64/libm-2.12.so
java    4020 jaytest  mem    REG              253,1    47072  542819 /lib64/librt-2.12.so
java    4020 jaytest  mem    REG              253,1   113904  542814 /lib64/libresolv-2.12.so
java    4020 jaytest  mem    REG              253,1   116136  542840 /lib64/libnsl-2.12.so
java    4020 jaytest  mem    REG              253,3     6676  133795 /home/jaytest/MyJdks/jdk1.6.0_21/jre/lib/amd64/librmi.so
java    4020 jaytest  mem    REG              253,3  1163700  133902 /home/jaytest/MyJdks/jdk1.6.0_21/jre/lib/resources.jar
java    4020 jaytest  mem    REG              253,3   842216  134434 /home/jaytest/MyJdks/jdk1.6.0_21/jre/lib/ext/localedata.jar
java    4020 jaytest  mem    REG              253,4   282279 8653766 /NotBackedUp/WLS103/wlserver_10.3/server/lib/consoleapp/consolehelp/WEB-INF/lib/console.jar
java    4020 jaytest  mem    REG              253,4   293750 8653774 /NotBackedUp/WLS103/wlserver_10.3/server/lib/consoleapp/consolehelp/WEB-INF/lib/standard.jar
java    4020 jaytest  mem    REG              253,4   779658 8653764 /NotBackedUp/WLS103/wlserver_10.3/server/lib/consoleapp/consolehelp/WEB-INF/lib/beehive-netui-core.jar
java    4020 jaytest  mem    REG              253,4    57299 8653775 /NotBackedUp/WLS103/wlserver_10.3/server/lib/consoleapp/consolehelp/WEB-INF/lib/struts-adapter.jar
java    4020 jaytest  mem    REG              253,4   531676 8653767 /NotBackedUp/WLS103/wlserver_10.3/server/lib/consoleapp/consolehelp/WEB-INF/lib/jh.jar
java    4020 jaytest  mem    REG              253,4  1490143 8653770 /NotBackedUp/WLS103/wlserver_10.3/server/lib/consoleapp/consolehelp/WEB-INF/lib/netuix_servlet.jar
java    4020 jaytest  mem    REG              253,4    54683 8653769 /NotBackedUp/WLS103/wlserver_10.3/server/lib/consoleapp/consolehelp/WEB-INF/lib/netuix_common_web.jar
java    4020 jaytest  mem    REG              253,4    46008 8653772 /NotBackedUp/WLS103/wlserver_10.3/server/lib/consoleapp/consolehelp/WEB-INF/lib/render_taglib.jar
Point-4). If we see any kind of “Multicast Receive timeout error”, we need to check whether the NIC is functioning properly.
Point-5). There may be a multicast storm going on in the network (a storm means repeated transmission of the multicast packets over the network). In this case we can try increasing the multicast buffer size using the “udp_max_buf” parameter. Please refer to:  http://docs.sun.com/app/docs/doc/816-0607/6m735r5gb?a=view for more details on it.
Point-6). In case of a multicast storm the network may already be flooded with multicast messages. If we find this, then please disable the “igmp” snooping setting. This setting is part of the Internet Group Management Protocol (IGMP) support and is used to prevent multicast flood problems on the managed switch.
Example:   igmp snooping=disable
For more details on this parameter please refer to:  http://documentation.netgear.com/gs108t/enu/202-10337-01/GS108T_UM-06-21.html
Point-7). Set the multicast Time-To-Live to the following:  MulticastTTL=32
NOTE: For a larger, WAN-style network the multicast Time-To-Live value must be kept high, so that the routers will not discard the multicast packets before they reach their destination.
Point-8). Run the MulticastMonitor test and MulticastTest on the network, as described in the following link: http://middlewaremagic.com/weblogic/?p=980
Point-9). Try enabling cluster debug to get more details:
java weblogic.Admin -url t3://localhost:7001 -username weblogic -password weblogic SET -type ServerDebug -property DebugCluster true
Point-10). If multicast still doesn't work :) then disable iptables, and then check  http://kr.forums.oracle.com/forums/thread.jspa?threadID=767088
In RedHat Linux:        /etc/init.d/iptables stop
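
Going back to Point-3: the number printed by “lsof -p <WLS_PID> | wc -l” is only useful when compared with the limit that applies to the process. Note that the lsof listing also contains a header row and entries such as cwd, txt and mem that are not numbered descriptors, so the count is an upper estimate. The Python sketch below assumes a Linux box, where the per-process limits can be read from /proc/<pid>/limits. It parses the soft “Max open files” value from that text and computes how much of the limit is in use. The function names and the sample numbers are hypothetical.

```python
import re


def parse_soft_open_file_limit(limits_text):
    """Extract the soft 'Max open files' limit from /proc/<pid>/limits text.

    Returns None if the line is missing or the limit is 'unlimited'.
    """
    match = re.search(r"^Max open files\s+(\S+)", limits_text, re.MULTILINE)
    if match is None or match.group(1) == "unlimited":
        return None
    return int(match.group(1))


def descriptor_usage(open_count, soft_limit):
    """Return the fraction of the soft limit that is currently in use."""
    return open_count / soft_limit


# Sample text in the layout of /proc/<pid>/limits (values are made up).
sample_limits = (
    "Limit                     Soft Limit           Hard Limit           Units\n"
    "Max open files            1024                 4096                 files\n"
)
limit = parse_soft_open_file_limit(sample_limits)
print(limit, descriptor_usage(512, limit))
```

On a real system you would read the text with `open("/proc/4020/limits").read()` (using the actual WebLogic PID) and pass in the count obtained from lsof. A usage fraction close to 1 suggests the file descriptor limit should be raised.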


