JVM Tuning

How to Generate HeapDump in case of Windows Service ?

Hi,

Jay SenSharma

Jay SenSharma

Based on the query of Magic Subscriber “fipnova51we are writing this article.  This article is very very important in case of Windows Service.  Because if we run any Server like JBoss or WebLogic like a Windows Service then NONE of the JVM utilities like “jmap”, “jconsole” , “jps” …etc will work. So it becomes very difficult to collect the Thread Dump or Heap Dump of the Running Service.

We wrote an article using JMX to collect all the JVM details using JMX code “Remote JVM Analysis (Mini JConsole)“. Here we are extending the same program to make the JMX program to even collect the HeapDump whenever we want.

Step1). Make a Windows Service for example you can use the following Script to make your WebLogic Server as Windows Service…

echo off
SETLOCAL
set DOMAIN_NAME=Test_Domain
set USERDOMAIN_HOME=C:\bea103\user_projects\domains\Test_Domain
set SERVER_NAME=AdminServer
set WLS_USER=weblogic
set WLS_PW=weblogic
set PRODUCTION_MODE=true
set JAVA_VENDOR=Sun
set JAVA_HOME=C:\bea103\jdk160_05
set MEM_ARGS=-Xms1024m -Xmx1024m  -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=9999 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=127.0.0.1

set JAVA_OPTIONS=-Dweblogic.Stdout=C:\bea103\user_projects\domains\Test_Domain\ASstdout.txt -Dweblogic.Stderr=C:\bea103\user_projects\domains\Test_Domain\ASstderr.txt %MEM_ARGS%
call "C:\bea103\wlserver_10.3\server\bin\installSvc.cmd"
ENDLOCAL

NOTE: Make sure that the -Dcom.sun.management.* flags are added properly in the JAVA_OPTIONS of the script.

Step2). Now You can also use the following Script to Uninstall your WebLogic Server Windows Service …

SETLOCAL
set WL_HOME=C:\bea103\wlserver_10.3
rem *** Uninstall the service
"%WL_HOME%\server\bin\beasvc" -remove -svcname:"beasvc Test_Domain_AdminServer"
ENDLOCAL

Step3). Now Write the following JMX code “GenerateHeapDump.java”  inside some directory.

import java.util.Hashtable;
import java.io.*;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;
import javax.naming.Context;
import java.lang.management.MemoryMXBean;
import java.lang.management.ManagementFactory;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import javax.management.*;
import javax.management.openmbean.CompositeDataSupport;

public class GenerateHeapDump
{
private static MBeanServerConnection connection;
private static JMXConnector connector;
public static void Connection(String hostname, String port) throws IOException
{
Integer portInteger = Integer.valueOf(port);
Hashtable h = new Hashtable();
JMXServiceURL address = new JMXServiceURL("service:jmx:rmi:///jndi/rmi://"+hostname+":"+port+"/jmxrmi");
connector = JMXConnectorFactory.connect(address,null);
connection = connector.getMBeanServerConnection();
System.out.println("GOT THE MBeanServerConnection---SUCCESSFULLY");
}

private static void doGarbageCollection() throws Exception
{
ObjectName memoryMXBean=new ObjectName("java.lang:type=Memory");
connection.invoke(memoryMXBean,"gc", null, null);
System.out.println("\n\n\t------Garbage Collection Done Successfully-----");
}

private static void getHeapDump(String fileName) throws Exception
   {
      ObjectName memoryMXBean=new ObjectName("com.sun.management:type=HotSpotDiagnostic");
      Object[] params = new Object[] { "C:/WindowsService_Script/"+fileName, Boolean.TRUE };
      String[] signature = new String[] { String.class.getName(), boolean.class.getName() };
      Object result = connection.invoke(memoryMXBean, "dumpHeap" , params, signature);
      System.out.println("\n\t Heap Dump Generated to " +fileName);
}

public static void main(String[] args) throws Exception
{
String hostname = args[0];
String port = args[1];
Connection(hostname, port);
System.out.println("\n\t----------Generating Heap Dump---------");
getHeapDump("HeapDump.hprof");
connector.close();
}
}

Step4). Run your Windows Service and then compile and run your Program to generate the HeapDump.

C:\WindowsService_Script>set PATH=c:\bea103\jdk160_05\bin;%PATH%;

C:\WindowsService_Script>javac GenerateHeapDump.java

C:\WindowsService_Script>java GenerateHeapDump localhost 9999
GOT THE MBeanServerConnection---SUCCESSFULLY

        ----------Generating Heap Dump---------

         Heap Dump Generated to HeapDump.hprof

Step5). Now you can see that the Heap Dump is generated as desired … :)

.

.

Thanks

Jay SenSharma


Security Breach And Attack For Java Based Application Servers

Hi,

Jay SenSharma

Jay SenSharma

DISCLAIMER:

In this article we may see an abnormal behaviors of weblogic. Which may not be necessarily a BUG but it is always to be aware of such behavior while using Weblogic. The idea behind making this page is just to make awareness among the WebLogic Admins to be alert specially when the some of these behaviors are related to WebLogic Security.

Some of the behaviors of WebLogic which may be due to inappropriate Security implementation in the Security system of WebLogic, Even if in some cases it is work as designed, Still it suggests to keep an eye on it and try to make those features more enhanced. Some of them are now fixed by the Application Server Vendor but still some need to be fixed or enhanced. The intentions here are not to point to the weak points of any Application Server but solely to make people aware about such strange or uncommon behaviors.

==========================================================================
Any WebServer or Application Server which runs on below mentioned JVM are not safe due to the security breach. For example if you just want to hang A server then just sent the following request to the Server using any HttpClient like JMeter or any other Utility which allows you to send the Http Header of your Choice.

Once you are able to send the following Http Request Header successfully to the Java based Application/Web Server …the Server will try to parse the Http Request Header and it will Hang  while processing this request.

“GET”,”/”,headers={“Accept-Language”: “en-us;q=2.2250738585072012e-308″}

Just for an example try to run the following simple Java program which just tries to parse a Double value 2.2250738585072012e-308. As soon as you will run this program you will see that your JVM will Hang….and the CPU Utilization will be around 100%   ;)

class HangJVM
{
  public static void main(String[] args)
    {
      System.out.println("Test:");
      double d = Double.parseDouble("2.2250738585072012e-308");
      System.out.println("Value: " + d);
     }
}

The JVMs which are affected are as following:
Java SE
JDK and JRE 6 Update 23 and earlier for Windows, Solaris, and Linux
JDK 5.0 Update 27 and earlier for Solaris 9
SDK 1.4.2_29 and earlier for Solaris 8

Java for Business
JDK and JRE 6 Update 23 and earlier for Windows, Solaris and Linux
JDK and JRE 5.0 Update 27 and earlier for Windows, Solaris and Linux
SDK and JRE 1.4.2_29 and earlier for Windows, Solaris and Linux

Save your JVM and your Application Server From Attack or Contact your Support ;)

Or Please get a Fix from Support  Which Updates the “rt.jar” of the JVM The Fix Details are available in the following Link  http://middlewaremagic.com/weblogic/?p=5393#comment-2821

And

Regarding the “Oracle Security Alert for CVE-2010-4476″You can get the Temp Security Patches related to this issue from the following link:http://www.oracle.com/technetwork/topics/security/alert-cve-2010-4476-305811.html

.
Thanks
Jay SenSharma


Analyzing Garbage Collection Log

Hi,
Jay SenSharma

Jay SenSharma

It’s always best to enable the Garbage collection Logging in our production environment as well because it does not cause any resource overhead or any side effect on weblogic server or an other application server’s performance.  GC log helps us in investigating man issues. Apart from issues it helps us to find out if some tuning is required based on the statistics of the Garbage collection.
.
Garbage collection logging can be enable and collected in a separate log file by using the following JAVA_OPTIONS:
-Xloggc:D:/gcLogs/GCLogs.log         -XX:+PrintGCDetails        -XX:+PrintGCTimeStamps
As soon as you add these JAVA_OPTIONS which are JVM specific (above will work for Sun and Open JDKs fine) the JVM will start generating the garbage collection logging in the GCLog.log file. Now if you will open this file then you can
see something like following:
4.636: [GC [PSYoungGen: 230400K->19135K(268800K)] 230400K->19135K(2058752K), 0.0635710 secs] [Times: user=0.08 sys=0.01, real=0.06 secs]
7.302: [GC [PSYoungGen: 249535K->38396K(268800K)] 249535K->51158K(2058752K), 0.0777300 secs] [Times: user=0.21 sys=0.04, real=0.07 secs]
7.521: [GC [PSYoungGen: 49735K->38388K(268800K)] 62496K->51933K(2058752K), 0.0741680 secs] [Times: user=0.15 sys=0.04, real=0.07 secs]
7.595: [Full GC (System) [PSYoungGen: 38388K->0K(268800K)] [PSOldGen: 13545K->51794K(1789952K)] 51933K->51794K(2058752K) [PSPermGen: 19868K->19868K(39936K)], 0.3066610 secs] [Times: user=0.28 sys=0.02, real=0.31 secs]
9.752: [GC [PSYoungGen: 230400K->26206K(268800K)] 282194K->78000K(2058752K), 0.0728380 secs] [Times: user=0.15 sys=0.00, real=0.08 secs]
11.906: [GC [PSYoungGen: 256606K->38393K(268800K)] 308400K->94759K(2058752K), 0.1058920 secs] [Times: user=0.19 sys=0.00, real=0.10 secs]
13.480: [GC [PSYoungGen: 268793K->38394K(268800K)] 325159K->109054K(2058752K), 0.0762360 secs] [Times: user=0.20 sys=0.03, real=0.08 secs]
18.115: [GC [PSYoungGen: 268794K->38384K(268800K)] 339454K->179238K(2058752K), 0.1351350 secs] [Times: user=0.42 sys=0.10, real=0.14 secs]
20.860: [GC [PSYoungGen: 268784K->38394K(268800K)] 409638K->200343K(2058752K), 0.1063430 secs] [Times: user=0.29 sys=0.03, real=0.11 secs]
22.148: [GC [PSYoungGen: 268794K->38399K(268800K)] 430743K->221395K(2058752K), 0.1173980 secs] [Times: user=0.24 sys=0.02, real=0.12 secs]
23.357: [GC [PSYoungGen: 268799K->26775K(268800K)] 451795K->231618K(2058752K), 0.0714130 secs] [Times: user=0.15 sys=0.03, real=0.08 secs]
24.449: [GC [PSYoungGen: 257175K->29170K(268800K)] 462018K->239909K(2058752K), 0.0312400 secs] [Times: user=0.06 sys=0.01, real=0.04 secs]
You can notice something in the above output:
Point-1). [Full GC (System) [PSYoungGen: 38388K->0K(268800K)]    It means a Full GC is happening on the complete Heap Area including all the Areas of the Java Heap Space.
.
Point-2). [GC [PSYoungGen: 230400K->19135K(268800K)]   Indicates some small GCs which keep on happening in the young generation very frequently,This garbage collection cleans the Young Generation short living Objects.
.
Point-3). Meaning of the [GC [PSYoungGen: 230400K->19135K(268800K)]   line is around 256MB (268800K) is the Young Generation Size, Before Garbage Collection in young generation the heap utilization in Young Generation area was around  255MB (230400K)  and after garbage collection it reduced up to 18MB (19135K)
.
Point-4). Same thing we can see for Full Garbage collection as well….How effective the Garbage collection was…[Full GC (System) [PSYoungGen: 38388K->0K(268800K)] [PSOldGen: 13545K->51794K(1789952K)]  Here it says that around
[(old)1789952K +  young (268800K) ]  memory space means  OldGeneration is consuming 1.75GB space and Young Generation is consuming around 255 MB space  So it means total Heap size is around 2GB.
.
But analyzing the Garbage collection log like above technique Line by Line is very bad…so here we have an alternative was to analyze the Garbage Collection log in few Seconds to see how much time the Full Garbage collection is taking as an average and other reports…etc.
.
Step1). Download the “garbagecat-1.0.0.jar   (881 KB) ”  tool from the follwing link: http://garbagecat.eclipselabs.org.codespot.com/files/garbagecat-1.0.0.jar
.
Step2). Open a command prompt and then make sure that JAVA is set in the Path so that we can use “jar” utility of JDK to run the “garbagecat-1.0.0.jar”  tool.
.
Step3). Put the “garbagecat-1.0.0.jar”  file and the “GCLog.log” file in the same directory. then run the following command:
java      -jar      garbagecat-1.0.0.jar      GCLog.log
.
Step4). As soon as ou run the above command you will see that in your current directory following files are created:
garbagecat-1.0.0.jar
GCLog.log
gcdb.lck
gcdb.log
gcdb.properties
report.txt
.
Step5). Now open the “report.txt” file to see the Over all report of the Garbage Collection something like following:
========================================
SUMMARY:
========================================
# GC Events: 12
GC Event Types: PARALLEL_SCAVENGE, PARALLEL_SERIAL_OLD
Max Heap Space: 2058752K
Max Heap Occupancy: 462018K
Max Perm Space: 39936K
Max Perm Occupancy: 19868K
Throughput: 95%
Max Pause: 306 ms
Total Pause: 1233 ms
First Timestamp: 4636 ms
Last Timestamp: 24449 ms
========================================
.
If you see that the Garbage Collection Max Pause time is very high like  more than 5-7 Seconds for a 2 GB heap then you need to worry about it. ;)
NOTE: Garbagecat  is a best utility to generate the Garbage Collection Report for Sun JDK and Open JDK for other JDKs you should use other tools for accurate results.
.
.
Thanks
Jay SenSharma

High CPU Utilization Finding Cause?

Hi All,

Jay SenSharma

Jay SenSharma

High CPU utilization is a very very common thing for an Application Server Administrator. Almost every week or quarter or may be every day we can see some High CPU utilization by the Application Server. It is very important for an Administrator or System Admin or a Application Server Support Person to find out what is causing the High CPU.

.
High CPU may cause System Crash or Server Crash or Slow Responsiveness of Server. So we need to find out which Thread is actually consuming more CPU cycles this will help us to find out whether the Thread is processing some application code ….there may be a code bug or it may be Application Framework bug as well…etc

.
So here we are going to see a very simple demonstration of generating High CPU and then we will see How to Analyze High CPU utilization. For this purpose we are going to use 2 very basic utilities

.
1). jps: to know How to use JPS utility please refer to : http://middlewaremagic.com/weblogic/?p=2291
2). top: This is an Operating System Utility .. Please refer to “man” (Manual of your OS like Linux/Solaris)
3). jstack: to know how to use jstack please refer to : http://middlewaremagic.com/weblogic/?p=2281

Generating High CPU

Step1). Create a Simple Web Application with the following JSP code:
“HighCpuGenerator.jsp”


<%
for (int i=0; i < 3; i++)
    {
       Thread x=new Thread(new Runnable(){

                  public void run()
                    {
                       System.out.println("Thread " +Thread.currentThread().getName() + " started");
                       double val=10;
                       for (;;)
                         {
                            Math.atan(Math.sqrt(Math.pow(val, 10)));
                         }
                     }
              });

        x.start();
    }
%>

<h3>High CPU generation ....Please collect the 4-5 times "Top" command Output and the 4-5 ThreadDumps.</h3>

Step2). Now Deploy your Application which is having this JSP.
Step3). Hit the Application using “http://localhost:7001/HighCpuWebApp/HighCpuGenerator.jsp

Analyzing High CPU:

Step4). Open a shell prompt and then get the Servers Process ID using “jps” utility which comes with Jdk1.5 and Higher inside the \bin directory.

[jsenshar@jsenshar ~]$ export PATH=/home/jsenshar/MyJdks/jdk1.6.0_21/bin:$PATH
[jsenshar@jsenshar ~]$ jps -v

HighCpu_Collect5ing Process ID using JPS

HighCpu_Collect5ing Process ID using JPS

Step5). Now In the same command prompt run the “JStack” utility to collect some Thread Dumps in a File… Suppose if the Application Servers Process ID is “5398″ which we got in previous step.

[jsenshar@jsenshar ~]$ jstack -l 5398 > ThreadDumps.log
[jsenshar@jsenshar ~]$ jstack -l 5398 >> ThreadDumps.log
[jsenshar@jsenshar ~]$ jstack -l 5398 >> ThreadDumps.log
[jsenshar@jsenshar ~]$ jstack -l 5398 >> ThreadDumps.log
[jsenshar@jsenshar ~]$ jstack -l 5398 >> ThreadDumps.log
HighCpu_ThreadDumps_JStack

HighCpu_ThreadDumps_JStack

Step6). Now In Parallel to collecting the Thread Dumps please collect the “top” commands output as well….like
(NOTE: Make sure that u collect the Thread Dumps and the Top commands results in Parallel whenever u observe any kind of High CPU utilization or Server Slow response)

[jsenshar@jsenshar HighCpu]$ top -H -b -p 5398

HighCPU_Top_Report

HighCPU_Top_Report

You will see some results like this :

top - 02:52:09 up 18:08,  4 users,  load average: 3.49, 3.25, 8.02
Tasks:  51 total,   6 running,  45 sleeping,   0 stopped,   0 zombie
Cpu(s): 96.6%us,  0.7%sy,  0.0%ni,  2.6%id,  0.0%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:   3848964k total,  3352224k used,   496740k free,    50940k buffers
Swap:  4194296k total,   400804k used,  3793492k free,  1116028k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 5522 jsenshar  20   0 2884m 412m  19m R 77.8 11.0  40:00.46 java
 5521 jsenshar  20   0 2884m 412m  19m R 66.9 11.0  39:57.34 java
 6900 jsenshar  20   0 2884m 412m  19m R 65.2 11.0   0:07.82 java
 5520 jsenshar  20   0 2884m 412m  19m R 54.2 11.0  39:57.70 java

Step7). Now U will observe that there are some Thread IDs which are consuming more CPU Cycles like Child Thread PID=5522 is consuming 77.8% of CPU similarly Child Thread PID 5521 is consuming 66.9 % of CPU Cycles. These are very Hign CPU utilization Threads inside the Main Parent Process ID 5398.

Step8). Now convert those PIDs into Hexadecimal values…..Example     the Hex Decimal Value for     5522 will be      1592. So now please open the Thread Dump and then find out the word 1592

Decimal Value of the Thread ID provided by OS

Decimal Value of the Thread ID provided by OS

Converted the Decimal Thread ID Value to Hexa Decimal Code

Converted the Decimal Thread ID Value to Hexa Decimal Code

"Thread-17" daemon prio=10 tid=0x00007fe824439800 nid=0x1592 runnable [0x00007fe7fa9e2000]
   java.lang.Thread.State: RUNNABLE
        at java.lang.StrictMath.atan(Native Method)
        at java.lang.Math.atan(Math.java:187)
        at jsp_servlet.__highcpugenerator$1.run(__highcpugenerator.java:83)
        at java.lang.Thread.run(Thread.java:619)

Here you will see that The Thread with id=0×1592 is actually causing the High CPU …Now u can tell your Developers to check the Stack Trace of this Thread so that they can correct their Application Code or If you see any API Code or Framework code there then you know what u need to do and where the problem is.

Like in above case u can see that  My JSPs code is actually causing High CPU    ”at jsp_servlet.__highcpugenerator$1.run(__highcpugenerator.java:83)

.

.

Thanks

Jay SenSharma


Causes and First Aid of JVM Crash Issues?

Hi All,
Jay SenSharma

Jay SenSharma

Java Virtual Machine is a Native engine which allows our Java Applications to run. It performs the code optimization to improve the performance. In correct tuning, Low memory allocation, extensive code optimization, bad garbage collection strategy, API code leaking…etc are some of the reasons which may cause the JVM to crash.
.
Analyzing a JVM Crash is one of the very interesting and little time taking process sometimes it is even little complex to find out the root cause of the JVM Crash. Here in this article we will see some of the common mistakes, first aid solutions/debugging techniques to find out what kind of informations we can get by looking into the Core Dump.

What is Core Dump & Where to Find It?

Code dump is usually a Binary file which gets generated by the Operating System when JVM or any other process crashes. Sometimes it also happens that the JVM will not be able to generate the Crash dump. In Windows Operating Systems it will be generated in the Directore where the “Dr. Watson” tool is installed. In Windows it will be usually:  ”C:\Documents and Settings\All Users\Application Data\Microsoft\Dr Watson
.
By default in Unix based Operating Systems the Core Dump files are created in the directory where the Java Program/Server was started even sometimes it is generated in the “/tmp” directory of the Operating System. But using the following Java Option we can change it’s the Crash Dump/Heap   Dump generation locations:  -XX:HeapDumpPath=/opt/app/someLocaton/ and  -XX:+HeapDump JVM Options.
.
NOTE: These Flags does not gurantee that always the Heap/Crash dump will be generated at the time of JVM Crash. There are some more reasons behind the Core Dump not gets generated…like Process Limitations or the Less Disk Quota or unavailability of the Free File Descriptors.

Who Generates the Crash/Core Dump?

JVM does not generate the Coe dump. Rather it is the Operating System which generates the Core Dump. Core Dump is a Binary file which may be several hundred Mega Bytes or Giga Bytes in size. The Operating systems just log the exception/error messages and the details of the Threads along with the Native libraries loaded with that java process.
.
Many times a brief Textual Crash file is also generated by the JVM itself sometimes during Crash. Usually the file name is “hs_err_pid<WebLogicPID>.log” in case of Sun JDK. Similarly JRockit JVM also generates a Textual file with name “*.dump” in case of JVM Crash.

Use of -XX:+ShowMessageBoxOnError?

The Thread Dump is also very helpful to analyze the Server Crash. Thread dump tells us what was the status and activities performed by the Threads at the time of crash. It may be possible to get a thread dump before the process exits. HotSpot supports the JAVA_OPTIONS -XX:+ShowMessageBoxOnError
.
The corresponding JRockit JVM Option is -Djrockit.waitonerror.  When the JVM is crashing, it may prompt the user ” Do you want to debug the problem? ” This pauses the process, thereby creating an opportunity to generate a thread dump (a stack trace of every thread in the JVM), attach a debugger, or perform some other debugging activity.  However, this does not work in all cases (for eg., in case of stack overflow).
.
Along with above there are various options available to get the Thread Dumps as described in : http://middlewaremagic.com/weblogic/?p=823

What May Cause JVM Crash?

Reason-1). Usually Native Code causes the JVM Crash. Native code is a code written in Languages like C/C++, Java Native Interface APIs (JNI).
.
Reason-2). JDBC Drivers specially the Native Drivers.
.
Reason-3). JVM Code Optimization.
.
Reason-4). Less Memory availabity for Native Area of a Java Process.
.
Reason-5). Application Servers Native Performance Pack Libraries.
.
Reason-6). JVMs library itself can cause the Crash.
.
Reason-7). High CPU Utilization by the Threads. As described in : http://middlewaremagic.com/weblogic/?p=4348
.
Reason-8). Presence of Wrong Native Libraries in the PATH or in the “-Djava.library.path
.
Reason-9). Presence of A Different Version of Libraries in “java.library.path” or in “SHLIB” or in “LD_LIBRARY_PATH” variables. Like setting a 64-bit version of Library in a 32-bit version of JVM’s library path or vise-versa.
.
.

Tools To Analyze the Core Dumps?

Core/Crash dump is Operating System specific, So to analyze these Dumps we must use the Tools provided by the same Operating System vendors. Various kind of tools are provided by the Operating Systems to analyze these Core Dumps like:
Tool-1). Dr. Watson Tool  in Windows OS. Windows OS Start (Button)—>Run…—>drwtsn32
.
Tool-2). “pstack” and “pmap” in Solaris Operating System.
.
Tool-3). “procstack” and “procmap” in AIX Operating System.
.
Tool-4). “lsstack” and “pmap” in Linux Operating System.
.
Tool-5). “pflag” if available in HP-UX Operating System.
.
.

What May Help To Avoid JVM Crash?

It totally depends on What Caused the Crash or What Libraries caused the JVM crash to avoid the occurance of the JVM crash for next time. But following things should be taken in consideration while analyzing and avoiding the Crash.
.
Point-1). If the Native Jdbc Driver is causing issues, If our appliation is using the Native JDBC Drivers then Switching from Pure Native Jdbc Driver (Type-2 Drivers) to the Pure Java JDBC Driver (Type-4 Driver) may help.
.
Point-2). If the JVM Libraries are causing the Crash then Upgrading to a Later Version of the JDK. If that Application Server has that new JDK in it;s Supported Configuration list.
.
Point-3). If the JVM Code Optimization is causing the Crash, then Disabling the Code Optimization of JVM by applying the JVM Options.
For JRockit JVM Code Optimization can be disabled using JAVA_OPTION  -Xnoopt
For Sun JDK Code Optimization can be disabled using   JAVA_OPTION  -Xint
.
Point-4). Some times the “Just In Time Compiler” code generation also causes the JVM Crash. In these scenarios In Case of Sun JDKs disabling the JIT Compiler can help. We can disable the JIT Compiler by adding the JVM Option:  ”-Djava.compiler=none
.
Point-5). Presence of a different bit version (32 bit or 64-bit libraries) of Library in the “-Djava.library.path
.
Point-6). Disabling the Application Servers Native Performance Packs. In WebLogic The Native IO Can be disabled using “-Dweblogic.NativeIOEnabled=false” JVM Option.
.
.
Thanks
Jay SenSharma

Basic JVM Tuning Tips

Hi All,
Jay SenSharma

Jay SenSharma

NOTE : Prerequisite of this post is that u are aware of the JVM Architecture and different generations of it…If not then Please refer to:  http://middlewaremagic.com/weblogic/?p=4464
And
.
Middleware performance is directly related to the JVM Tuning. Due to incorrect JVM Tuning we may face OutOfMemory, High CPU, StuckThread, Server Crash kind of issues. There is no standard value available for JVM Tuning which can be suggested so that a JVM can be called as 100% tuned because the JVM tuning totally depends on the Platform (Operating System), The Number of CPUs, The nature of Application (Like how many objects it creates? How Many Long Living Objects? How it implements Caching of Objects? …etc). But if we will keep following  things in mind while tuning the JVM then it may be really helpful.
.
Here are some very basic tips of tuning the JVM. Whatever Values of different JVM tuning parameters we are going to see and discuss in this Article is not an absolute value, it may vary according to your environmental setup and requirement.

Tip-1).  If Observed the GC time is Very Long.

If we observe that the Full Garbage Collection is taking a longer duration (Healthy JVM Usually takes around 2 to 2.5 Seconds for Full GC) then we must try following things:
.
Point-1). -Xincgc: Applying the “-Xincgc” JVM Option instructs the JVM to use the incremental Garbage Collection Strategy. Due to this option the Garbage Collector starts Garbage Collecting a fraction of heap at a time rather than Garbage Collecting the whole Heap space at once. So it reduces the long GC pauses.
.
Point-2). Apart from above we can even decide to decrease the Max Heap Size (If a very large heap is not required for us). Because if the Heap Size is very large then the Garbage Collector will take little longer duration to perform a full GC.
.
Point-3). Suppose if we dont have an option to decrease the Heap Memory then we can even think of adjusting the -XX:MaxNewSize (in JDK1.3 and JDK1.4) or  -Xmn (new name of Young Generation flag from JDK1.4 onwards). Increasing the -Xmn (Young Generation Area) helps in scenarios where the Application creates short living objects (less caching Applications). Because it inceases the time of minor GC, most of the application objects dies early.
.
Example-1:
Suppose the Maximum Heap is -Xmx1024m then we can choose the Young Generation EdenSpace as -Xmn340m… It means the 1/3 (one third of the Max Heap) Or in other way we can say that the Eden Space is One Third of the Maximum Heap.
Example-2:
It can also be achieved by setting the -XX:NewRatio. Because  -XX:NewRatio=2 means that 2:1 ratio of Tenured and Young Generation(Eden)

Tip-2).  Full GC Is Happening Very Frequently?

Many times some application calls System.gc() or Runtime.gc() to perform explicit Garbage Collection (Which is not recommended). Apart from this if an Application uses many Remote Method Invocation calls then in that case as well there are chances that the full GC may happen very frequently.  In this case we can try disabling the Explicit Garbage Collection statements using the JVM Option:  -XX:+DisableExplicitGC
Similarly If we dont require a very frequent RMI garbage collection then we can use the JVM Option:  -Dsun.rmi.dgc.client.gcInterval=1800000 ……     It means now the Explicit RMI Garbage collection will happen in 30 Minutes.

Tip-3).  OutOfMemoryError: unable to create new native thread?

If you see very frequently “java.lang.OutOfMemoryError: unable to create new native thread”. It usually happens if the -Xss Stack size is very large. Thread Stack size is a memory area where the Threads places their local variables and maintains the stack. Many applications which creates a lots of Threads which requires less memory for their local variables, still the JVM allocates a large memory StackSize for those threads…so after a certain number of thread creation JVM will not be allocate some more space for new Thread creation.
.
So it is recommended for these kind of environments that we decrease the StackSize because Sparc 32-bit JVM has -Xss512k as default value and 1024k Stack Size on Sparc 64-bit JVM. So if we want to create some more threads and if the Threads requires less memory stack then in that case we should try to decrease the StackSize.

Tip-4). How To Preserve Memory ?

We must disable loading some libraries in the memory of JVM if our application does not require them. Like if our Application does not require the JVMs Graphic Library then in that case we must diasble loading of Graphic libraries using the JVM Option  “-Djava.awt.headless=true”

Some More Tips….Keep Visiting Middleware Magic….
.
.
Thanks
Jay SenSharma

OutOfMemory Causes and First Aid Steps?

Hi All,
Jay SenSharma

Jay SenSharma

In response to the Comment: http://middlewaremagic.com/weblogic/?p=4456#comment-2026 We developed a Post on Java Heap. Thanks to “Swordfish” for querying us very important topic.
.
In most of the environments we usually see some kind of Memory related issues like OutOfMemory/Server Slowless kind of things. most of the times it is related to the in accurate tuning of the JVM and some times it happens due to the Application Frameworks Bug/ In accurate tuning configuration or it may happen due to the Application Code as well ..like Object Leakin in the Application code.
.
NOTE: The pre requisit for this Post is that you are already aware of different Memory Spaces available as part of the Java Process..If Not then Please quickly review: http://middlewaremagic.com/weblogic/?p=4456
.
Here we are going to see that what causes the OutOfMemory issues  and Why it happens along with some basic First Aid Steps to debug this kind of  issues.

What is OutOfMemory?

An OutOfMemory is a condition in which there is not enough space left for allocating required space for the new objects or libraries or native codes. OutOfMemory can be divided in tow main categories:

1). OutOfMemory in Java Heap:

This happens when the JVM is not able to allocate the required memory space for a Java Object. There may be many reasons behind this…like
Point-1). Very Less Heap Size allocation. Means setting the MaxHeapSize (-Xmx) parameter to a very less value.
.
Point-2). The Leaking of Objects. Either the Application is not unreferencing the unused Objects or the Third part frameworks (Hibernate/Spring/Seam…etc) might not be releasing the references of the objects due to some inaccurate configurations.
.
Point-3). In Many cases it may be the reason that Application codes are getting the JDBC connections objects from the DataSource are not being released back to the Connection Pool.
.
Point-4). Garbage Collection strategy may be in correct according to the environmental/application requirements.
.
Point-5). In-accurate setting of Application/Frameworks Cache.
Example:
Exception in thread "Thread-10" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2882)
at java.lang.AbstractStringBuilder.expandCapacity(Abs tractStringBuilder.java:100)
at java.lang.AbstractStringBuilder.append(AbstractStr ingBuilder.java:390)
at java.lang.StringBuilder.append(StringBuilder.java: 119)
at java.lang.Throwable.toString(Throwable.java:344)

2). Native OutOfMemory:

Native OutOfMemory is a scenario when the JVM is not able to allocate the required Native Libraries and JNI Codes in the memory.
Native Memory is an area which is usually used by the JVM for it’s internal operations and to execute the JNI codes. The JVM Uses Native Memory for Code Optimization and for loading the classes and libraries along with the intermediate code generation.
The Size of the Native Memory depends on the Architecture of the Operating System and the amount of memory which is already commited to the Java Heap. Native memory is an Process Area where the JNI codes gets loaded or JVM Libraries gets loaded or the native Performance packs and the Proxy Modules gets loaded…
Native OutOfMemory can happen due to the following main reasons:
.
Point-1). Setting very small StackSize (-Xss). StackSize is a memory area which is allocated to individual threads where they can place their thread local objects/variables.
.
Point-2). Usually it may be seen because of Tuxedos incorrect setting. WebLogic Tuxedo Connectors allows the interoperability between the Java Applications deployed on WebLogic Server and the Native Services deployed on Tuxedo Servers. Because Tuxedos uses JNI code intensively.
.
Point-3). Less RAM or Swap Space.
Example: For details on this kind of error Please refer to: http://middlewaremagic.com/weblogic/?p=422
.
Point-4). Usually it may occur is our Application is using a very large number of JSPs in our application. The JSPs need to be converted into the Java Code and then need to be compiled. Which reqires DTD and Custom Tag Library resolution as well. Which usually consumes more native memory.
Exception in thread "main" java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:574)
at TestXss.main(TestXss.java:18)

3).  OutOfMemory in PermGen Space:

Permanent Generation is a Non-Heap Memory Area inside the JVM Space. Manytimes we see OutOfMemory in this Area. PermGen Area is NOT present in JRockit JVMs. For more details on this area please refer to: http://middlewaremagic.com/weblogic/?p=4456.
.
The PermGen Area is measured independently from the other generations because this is the place where the JVM allocates Classes, Class Structures, Methods and Reflection Objects. PermGen is a Non-Heap Area.It means we DO NOT count the PermGen Area as part of Java Heap.
The OutOfMemory in PermGen Area can be seen because of the following main reasons:
Point-1). Deploying and Redeploying a very Large Application which has many Classes inside it.
.
Point-2). If an Application is getting deployed/Updated/redeployed repeatedly using the Auto Deployment feature of the Containers. In that case the Classes belonging to the application stays un cleaned and remains in the PermGen Area without Class Garbage Collection.
.
Point-3). If  ”-noclassgc” Java Option is added while starting the Server. In that case the Classes instances which are not required will not be Garbage collected.
.
Point-4). Very Less Space for allocated the “=XX:MaxPermGen”
Example: you can see following kind of Trace in the Server/Stdout Logs:
<Notice> <Security> <BEA-090171> <Loading the identity certificate and private key stored under the alias DemoIdentity from the jks keystore file D:\ORACLE\MIDDLE~1\WLSERV~1.3\server\lib\DemoIdentity.jks.>
Exception in thread "[STANDBY] ExecuteThread: '1' for queue: 'weblogic.kernel.Default (self-tuning)'" java.lang.OutOfMemoryError: PermGen space
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:621)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:124)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:260)
at java.net.URLClassLoader.access$000(URLClassLoader.java:56)
at java.net.URLClassLoader$1.run(URLClassLoader.java:195)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
at weblogic.work.ExecuteThread.execute(ExecuteThread.java:209)
at weblogic.work.ExecuteThread.run(ExecuteThread.java:173)

What to do in case of OutOfMemory In JavaHeap?

Whenever we see an OutOfMemory in the server log or in the stdout of the server. We must try to do the following things as first aid steps:
.
Point-1). If possible enable the following JAVA_OPTIONS in the server start Scripts to get the informations of the Garbage Collection status.
-verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails  -Xloggc:/opt/app/GCLogsDirectory/gc.log
.
Point-2). It is always needed to see what all objects were present when the OutOfMemory error occured to identify whether those objects belongs to the Application Code/ Application Framework Codes/ The Application Server APIs. Sothat we can isolate the issue. In order to get the details of the Heap Objects collect “HeapDump” either using JHat (not a better tool) or JMap (Much Better compared to the Jhat tool). Plese refer to the post to know how to do it : http://middlewaremagic.com/weblogic/?p=2241
.
Point-3). Once we collected the Heap Dump we can easily monitor the Heap Details using best GUI toold like “Jhat Web Browser” or using “Eclipse Memory Analyzer”.

What to do in case of Native OutOfMemory?

Point-1). Usually Native OutOfMemory causes Server/JVM Crash. So it is always recommended to apply the following JAVA_OPTIONS flags in the Server Start Script to instruct the JVM to generate the HeapDump  ”-XX:+HeapDumpOnOutOfMemoryError
By default the heap dump is created in a file called java_pidpid.hprof in the working directory of the VM, as in the example above. You can specify an alternative file name or directory with the “-XX:HeapDumpPath=C:/someLocation/
.
Note: Above Flags are also suitable to collect HeapDump in case of JavaHeap OutOfMemory as well. But these flags never gurantees that the JVM will always generate the Heap Dump in case of any OutOfMemory Situation.
.
Point-2). Usually in case of Native OutOfMemory a “hs_err_pid.log” file is created in case of Sun JDK and “xxxx.dump” file is created in case of JRockit JDK. These log files are usually Text Files and tells about the Libraries which caused the Crash. These files need to be collected and analyzed to find out the root cause.
.
Point-3). Make Sure that the -XX:MaxHeapSize is not set to a Very Large Space…because it will cause a very less Native Space allocation. Because as soon as we increase the HeapSize, the Native Area decreases. Please see the Post: http://middlewaremagic.com/weblogic/?p=4456
.
Point-4). Keep Monitoring the process’s memory using the Unix utility ‘ps’ like following:
ps -p <PID> -o vsz
Here you need to pass the WebLogic Server’s PID (Process ID) to get it’s Threading Details with respect to the Virtual Memory Space.
.
Point-5). If the Heap Usages is less Or if you see that Your Application usages less Heap Memory then it is always better to reduls the MaxHeapSize  so that the Native Area will automatically gets increased.
.
Point-6). Sometimes the JVMs code optimization causes Native OutOfMemory or the Crash…So in this case we can disable the Code Optimization feature of JVM.
(Note: disabling the Code Optimization of JVM will decrease the Performance of JVM)
For JRockit JVM Code Optimization can be disabled using JAVA_OPTION  -Xnoopt
For Sun JDK Code Optimization can be disabled using   JAVA_OPTION  -Xint

What to do in case of OutOfMemory In PermGen?

Point-1). Make Sure that the PermGen Area is not set to a very less value.
.
Point-2). Usually if an Application has Many JSP Pages in that case every JSP will be converted to a *.class file before JSP Request Process. So a large number of JSPs causes generation of a Large number of *.class files all these classes gets loaded in the PermGen area.
.
Point-3). While allocating the -XX:MaxPermSize make sure that you follow a rough Formula… which works in most of the Application Servers.
MaxPermSize = (Xmx/3) —- Very Special Cases (One Third of maximum Heap Size)
MaxPermSize = (Xmx/4) —- Recommended (One Fourth Of maximum Heap Size)
Point-4). If you are repeatedly getting the OutOfMemory in PermGen space then it could be a Classloader leak….
May be some of the classes are not being unloaded from the permgen area of JVM . So please try to increase the -XX:MaxPermSize=512M  or little more and see if it goes away.
If not then add the following JAVA_OPTIONS to trace the classloading and unloading to find out the root cause :
-XX:+TraceClassloading and -XX:+TraceClassUnloading
.
.
Thanks
Jay SenSharma

Parts Of JVM And JVM Architecture Diagram?

Hi All,
Jay SenSharma

Jay SenSharma

JVM is the heart of any Java based Application Server. We face most of the issues deu to incorrect JVM tunning. It is very important to understand the Overall architecture of the JVM in order to trouble shoot different JVM tunning related issues.Here we are going to discuss the Architecture and the Major parts of a Java Process And the Java Heap Division.
.
The Following Diagram is just a basic overview of a Java Process in a 2 GB process Size Machine. Usually in 32 bit Windows Operating Systems the default process size will be 2 GB (In Unix based 64 bit operating Systems it can be 4GB or more). So i draw the following Diagram of Java Process to explain the Java Process partitions in a 2Gb process size machine.
Java Process Architecture Diagram

Java Process Architecture Diagram

In the above diagram we will find different partitions of a Java Process. Please compare the above diagram with below descriptions.
.
1). Just for Example we can see that Process Size is 2048 MB (2GB)
2). The Java Heap Size is 1024MB (means 1GB)   -Xmx1024m
3). Native Space = ( ProcessSize – MaxHeapSize – MaxPermSize) It means around 768 MB of Native Space.
4). MaxPermSpace is around -XX:MaxPermSize=256m
5). Young Generation Space is around    40% of Maximum Java Heap.

What Are these Different Parts?

Eden Space:

Eden Space is a Part of Java Heap where the JVM initially creates any objects, where most objects die and quickly are cleanedup by the minor Garbage Collectors (Note: Full Garbage Collection is different from Minor Garbage Collection). Usually any new objects created inside a Java Method go into Eden space and the objects space is reclaimed once the method execution completes. Where as the Instance Variables of a Class usually lives longer until the Object based on that class gets destroyed. When Eden fills up it causes a minor collection, in which some surviving objects are moved to an older generation.

Survivor Spaces:

Eden Sapce has two Survivor spaces. One survivor space is empty at any given time. These Survivor Spaces serves as the destination of the next copying collection of any living objects in eden and the other survivor space.
The parameter SurvivorRatio can be used to tune the size of the survivor spaces.
-XX:SurvivorRatio=6 sets the ratio between each survivor space and eden to be 1:6
If survivor spaces are too small copying collection overflows directly into the tenured generation.

Young Generation: (-XX:MaxNewSize)

Till JDK1.3 and 1.4 we used to set the Young Generation Size using -XX:MaxNewSize. But from JDK1.4 onwards we set the YoungGeneration size using (-Xmn) JVM option.
Young Generation size is controlled by NewRatio.  It means setting -XX:NewRatio=3 means that the ratio between the Old Generation and the Young Generation is  1:3
.
Similarly -XX:NewRatio=8 means that 8:1 ratio of tenured and young generation.
NewRatio: NewRatio is actually the ratio between the (YoungGenaration/Old Generations) has default values of 2 on Sparc , 12 on client Intel, and 8 everywhere else.
NOTE: After JDK 1.4 The Young Generation Size can be set using  (-Xmn) as well.

Virtual Space-1: (MaxNewSize – NewSize)

The First Virtual Space is actually shows the difference between the -XX:NewSize and -XX:MaxNewSize.  Or we can say that it is basically a difference between the Initial Young Size and the Maximum Young Size.

Java Heap Area: (-Xmx and -Xms)

Java Heap is a Memory area inside the Java Process which holds the java objects.  Java Heap is a combination of Young Generation Heap and Old Generation Heap. We can set the Initial Java Heap Size using -Xms JVM parameter similarly if we want to set the Maximum Heap Size then we can use -Xmx JVM parameter to define it.

Example:
-Xmx1024m —> Means Setting the Maximum limit of Heap as 1 GB
-Xms512m —> Means setting Java Heap Initial Size as 512m
.
NOTE-1): It is always recommended to set the Initial and the Maximum Heap size values as same for better performance.
NOTE-2): The Theoretical limitation of Maximum Heap size for a 32 bit JVM is upto 4GB. Because of the Memory Fragmentation, Kernel Space Addressing, Swap memory usages and the Virtual Machine Overheads are some factors JVM does not allow us to allocate whole 4GB memory for Heap in a 32 bit JVM. So usually on 32-bit Windows Operating Systems the Maximum can be from 1.4 GB to 1.6 GB.
.
If we want a Larger memory allocation according to our application requirement then we must choose the 64-bit operating systems with 64 bit JVM. 64-bit JVM provides us a larger address space. So we can have much larger Java Heap  with  the increased number of Threads allocation area. Based on the Nature of your Operating system in a 64 bit JVM you can even set the Maximum Heap size upto 32GB.
Example:        -Xms32g -Xmx32g -Xmn4g

Virtual Space-2: (MaxHeapSize – InitialHeapSize)

The Second Virtual Space is actually the Difference between the Maximum Heap size (-Xmx)and the Initial Heap Size(-Xms). This is called as virtual space because initially the JVM will allocate the Initial Heap Size and then according to the requirement the Heap size can grow till the MaxHeapSize.
.

PermGen Space: (-XX:MaxPermSize)

PermGen is a non-heap memory area where the Class Loading happens and the JVM allocates spaces for classes, class meta data,  java methods and the reference Objects here. The PermGen is independent from the Heap Area. It can be resized according to the requirement using -XX:MaxPermSize and -XX:PermSize  JVM Options. The Garbage collection happens in this area of JVM Memory as well. The Garbage collection in this area is called as “Class GC”. We can disable the Class Garbage Collection using the JVM Option -noclassgc. if  ”-noclassgc” Java Option is added while starting the Server. In that case the Classes instances which are not required will not be Garbage collected.

Native Area:

Native Memory is an area which is usually used by the JVM for it’s internal operations and to execute the JNI codes. The JVM Uses Native Memory for Code Optimization and for loading the classes and libraries along with the intermediate code generation.
The Size of the Native Memory depends on the Architecture of the Operating System and the amount of memory which is already commited to the Java Heap. Native memory is an Process Area where the JNI codes gets loaded or JVM Libraries gets loaded or the native Performance packs and the Proxy Modules gets loaded.
There is no JVM Option available to size the Native Area. but we can calculate it approximately using the following formula:
NativeMemory = (ProcessSize – MaxHeapSize – MaxPermSize)
.
.
Thanks
Jay SenSharma

What Is Server Hang And What Need To Be Done ?

Hi All,

Jay SenSharma

Jay SenSharma

Thread dump analysis is the most important part to find out the Server slow responsiveness or Hang Server Situation or sometimes a Crash scenario as well if happened because of the Stuck threads. Here we are going to see some very basic but important features of  WebLogic Thread Model and the functionality and tasks performed by these Threads. Here we will talk about some very common terminology which we use while analyzing the Thread Dump or Hang Server Situation.

What is Hang Server Situation and It’s Symptoms?

1). If a server responses very slow compared to the estimated time of response then it may be moving to total Hang Server Situation.

2). If a Server does not even response to the clients requests, then it is a Complete Server Hang scenario. Some times the complete stuckness of a Server also may cause a Server Crash.

Roles & Responsibilities  Of WebLogic Threads?

The WebLogic Threads can be broadly categorized into 2 Main categories.
1). WebLogic Execute Threads: These Threads are responsible for processing the clients/users requests. This decides that how many tasks a server can be performed in parallel.  By default in development mode the WLS Server will have 15 Execute Threads and if we run the Server in Production mode then the WLS Server will have 25 Execute Threads as Default. Increasing the Execute Thread count does not mean increase in performance…rather in some cases it may even degrade the performance.
2). WebLogic Socket Reader Threads: These threads listens to the incoming clients requests, Means basically these Threads deals with the Network traffic part.  These are basically a percentage of Execute Threads. The Default ration between is Execute Threads and Socket Reader Threads is 33%. Means 33% Threads will be Socket Readers as default. Which can be tuned according to the requirement.
Example: if we have 15 Execute Threads in Development Mode of WLS Server then it simply means that around 33% of these threads will work as Socket Reader Threads and rest Threads will be processing the Clients request.

What are Execute Queues?

Execute Queue is a group of Execute Threads which will be taking care of designated Servlets / RMI Ojects/ EJBs/ JSPs…etc. Till WLS8.1 we could see these Execute Queues informations  available as part of “config.xml” file entry. But from WLS 9.x onwards as the WebLogic Threading Model is changed because of the introduction of WorkManagers, we wont see the Execute Queues details by default in WLS9.x and later WLS. But still if we want to use the WLS8.1 style of Threading model in WLS9.x or Later versions of WebLogic then we can use the  the following JAVA_OPTION :  -Dweblogic.Use81StyleExecuteQueues=true

Possible Cause of Server Hang?

There may be many reasons behind Server Slow responsiveness or Hang scenario…

Cause 1). If the Free Heap memory is very less then the the Threads will NOT be able to create required objects in the Java Heap.

Cause 2). Insufficient number of Threads. It happens some times that if the Load (number of users request) on Server suddenly increased and the MaxThreadCount is not set to a correct value then Server will not be able to process these many requests.

Cause 3). If the Garbage Collection is taking much time in that case the Garbage Data clean up process will take longer time and the Threads will be doing Garbage Collection rather than processing the clients request. Or the Threads will be waiting for some free memory to create some objects in the heap.

Cause 4). Sometimes Java Code optimization also causes a temporary hang scenario, because the code optimization is a little heavy process but useful for better performance.

Cause 5). Many Remote Jdbc Lookups can sometimes cause the Hang scenario.

Cause 6). In accurate JSP compilation settings. (recommended always precompile the JSPs before deploying it to the production environments and the  PageCheckSeconds must be set correctly sothat the JSP compilation will not happen very frequently)

Cause 7). Application code Deadlock or Jdbc Driver Deadlocks or Vendor API Bug: When threads waits in infinite loop to gain lock on objects… Example scenario below
Example:
1).  Thread_A has Gained Lock on Object Obj_A
2).  Thread_B Gained Lock on Object Obj_B
3).  Now After performing some operations on Obj_A the Thread_A  is trying to get Lock on Obj_B (Obj_B is already locked by Thread_B currently)
4).  Similarly after performing some operations on Obj_B the Thread_B wants to gain Lock on Obj_A (Obj_A is already locked by Thread_A)

In Above Scenario Now Both the Threads are waiting for each other to release the Lock on their object … but none of them is actually releasing the lock from the Objects which they Already have Locked.

Cause 8). If the Number of File Descriptors are very less (insufficient resources) in this case also we might face slow server response or Server Hang situations.

How the Server Log will Look like in case of Hang Scenarios or Slow Responsiveness?

By Default 600 Seconds (10 Minutes) is the default duration after which the WebLogic Server declares a Thread as a STUCK Thread. The Entry of Stuck Thread occurance can be seen in the Server Logs. You can see following kind of entry in the Server Logs…

<Warning> <WebLogicServer> <BEA-000337> <ExecuteThread: '7' for queue: 'weblogic.kernel.Default' has been busy for "630" seconds working on the request "weblogic.ejb20.internal.JMSMessagePoller@d64412", which is more than the configured time (StuckThreadMaxTime) of "600" seconds.>

Does the Above kind of Entry in Server Log means WebLogic is Hang?

Above message in server Log is just an indication that some Threads are taking longer time to process some requests which may lead to a Hang Server Situation and of course Slow responsiveness of WebLogic Server. But it doesn’t mean 100% that WebLogic cannot recover these Threads.

WebLogic has a capability to Declare a STUCK thread as UNSTUCK. Weblogic would priodically check for stuck threads based on settings of “Stuck Thread Max Time” and “Stuck Thread Timer Interval”  http://e-docs.bea.com/wls/docs103/ConsoleHelp/taskhelp/tuning/TuningExecuteThreads.html.

Weblogic will just report the thread as stuck and in case the thread progresses(may be it seemed to be stuck but was actually running a long transaction), weblogic would declare it as unstuck.
Many times it happens that our Application requirement says that a Thread can take more time to process the Clients request (more than 600 Seconds) In that case we can change the “StuckThreadMaxTime” interval.  Example if we know that our Application has some long running Jdbc queries which may take upto 900 Seconds then we can increase the StuckThreadMaxTime.

What First Aid Steps Required to Collect Debug Data ?

Debugging-1). Try to ping the WebLogic Server 5-6 times using “weblogic.Admin PING” utility to see how quickly we are getting the response back?

java weblogic.Admin -url t3://StuckThreadHostName:9001  -username weblogic -password weblogic  PING

Debugging-2). As soon as you get time check the Verbose GC Logs (Garbage Collection Logs if you have already enabled the Garbage Collection Logging by applying the following JAVA_OPTIONS)

set JAVA_OPTIONS=%JAVA_OPTIONS%  -Xloggc:/opt/logs/gc.log -XX:+PrintGCDetails -Xmx1024m -Xms1024m

Debugging-3). Collect at least 4-5 Thread Dumps taken in the interval of 9-10 seconds each to see the Activities of Threads. Follow the various ways of collecting Thread Dumps:  http://middlewaremagic.com/weblogic/?p=823

Debugging-4). Check If the Load (Number of Clients Request) is/was abnormally very High on the Server?

Debugging-5). Check if the JMS Subsystem connectivity or the Database connectivity is lost somewhere… You may find in the Server Logs some Strange entries like DataSource is Disables/ Network Adapter could not Establish Connection to the Database/ JMS Messaging System “PeerGoneException” ….. Tuxedo/Jolt Connectivity Errors…etc.
.

.

Thanks

Jay SenSharma


High CPU Analysis in AIX

Hi All,
Jay SenSharma

Jay SenSharma

In response to the comment: http://middlewaremagic.com/weblogic/?p=958&cpage=1#comment-1971
I got an idea to develop a post on this. It is very interesting to start High CPU Analysis on AIX Operating system, because it doesnt have common and useful utilities like “prstat” or “top” unlike Solaris/Linux Operating Systems.
But if we want to analyze the High CPU in AIX operating system then we can use the following utilities:

1) dbx: The dbx utility can be used to examine corefiles as well as executable programs. By examining a core memory image file, you can use the dbx command to examine the state of the program when it faulted.

2) ps: The ps command writes the status of active processes and if the -m flag is given, displays the associated kernel threads to standard output. While the -m flag displays threads associated with processes using extra lines, you must use the -o flag with the THREAD field specifier to display extra thread-related columns.

.
So lets see how to how to proceed with the AIX High CPU analysis.
.

Step1). Open a AIX Shell Prompt and then find out the WebLogic Server Process ID(PID).

.

Step2). Now run the following command to findout the Thread information which is consuming maximum CPU cycles: (Here PID=weblogic Server Process ID)
ps  -mp <PID>  -o  THREAD
Suppose if the WebLogic Servers Process ID is = 23456 then please run the following command:
Example:
ps   -mp  23456  -o  THREAD
As soon as the above command lists the Thread details Notice the Note down TID value in the List which has high number of CPU. Suppose the TID is ’11111′ then u note it down becasue u will need to use it in Step 4 for analysis.

.

Step3).Take 4-5 Thread Dumps of WebLogic Server in the interval of 10-12 seconds.
kill -3 <PID>
So far we have collected the raw dat, Now we will start Investigating :)

.

Step4). Run the dbx utility with -a with the sub command ‘thread’ to list down all the Threads of WLS. Example suppose if the WebLogic Servers Process ID is = 23456 then please run the following command:
$ dbx -a 23456
(dbx) thread
Somewhere in the output if the above command u will find a line like :
$t15 run running 11111 k no sys JVM_Send
In the above line t15 means the thread number 15  and 11111 is the TID of the thread which we noticed in Step2).

.

Step5). Now in the dbx window again run the following command to query for the thread 15…because Thread 15 information we got in previous command.
(dbx) th info 15

.

Step6). In the output of the above command please find the value of  ’pthread_t’  attribute, which is actually Native Thread Number.
Suppose the ‘pthread_t’ attribute value is ’3333′ The u need to find the value 3333 in the WebLogic Thread Dump to findout the Culprit Thread which is causing the High CPU.
In the thread dump u will find something like:
Execute Thread: '5' for queue 'default'  (TID:0x21df97a8, sys_thread_t:0x3a5bea108, state:R, native ID:0x3333) prio=5 at java.net.SocketOutputStream.socketWrite(Native Method)........

.

Step7). Remember to detach the ‘dbx’ utility with the WebLogic using the ‘detach’ command in the dbx prompt.
.
.
Thanks
Jay SenSharma

  • Testimonials

  • RSS Middleware Magic – JBoss

  • Receive FREE Updates


    FREE Email updates of our new posts Enter your email address:



  • Magic Archives

  • Sitemeter Status

  • ClusterMap 7-Nov-2011 till Date

  • ClusterMap 6-Nov-2010 till 7-Nov-2011

  • Copyright © 2010-2012 Middleware Magic. All rights reserved. | Disclaimer