
Causes and First Aid of JVM Crash Issues?

Hi All,
Jay SenSharma

The Java Virtual Machine (JVM) is a native engine that runs our Java applications and optimizes their code to improve performance. Incorrect tuning, low memory allocation, aggressive code optimization, a poor garbage collection strategy, leaking API code etc. are some of the reasons which may cause the JVM to crash.

Analyzing a JVM crash is an interesting but somewhat time-consuming process, and sometimes finding the root cause is quite complex. In this article we will look at some common mistakes and some first aid solutions and debugging techniques, and see what kind of information we can get by looking into the core dump.

What is Core Dump & Where to Find It?

A core dump is usually a binary file which gets generated by the Operating System when the JVM or any other process crashes. Sometimes the JVM is not able to generate the crash dump at all. On Windows Operating Systems it is generated in the directory where the “Dr. Watson” tool is installed, which is usually: “C:\Documents and Settings\All Users\Application Data\Microsoft\Dr Watson”

By default, on Unix based Operating Systems the core dump files are created in the directory where the Java program/server was started, although sometimes they are generated in the “/tmp” directory of the Operating System. The heap dump location, in contrast, can be changed using the following Java options: -XX:+HeapDumpOnOutOfMemoryError together with -XX:HeapDumpPath=/opt/app/someLocation/

NOTE: These flags do not guarantee that a heap/crash dump will always be generated at the time of a JVM crash. There are other reasons why a core dump may not get generated, such as process limits, a small disk quota or no free file descriptors.

Who Generates the Crash/Core Dump?

The JVM does not generate the core dump. Rather, it is the Operating System which generates it. A core dump is a binary file which may be several hundred megabytes or even gigabytes in size. The Operating System itself just logs the exception/error messages and the details of the threads along with the native libraries loaded by that Java process.

Many times a brief textual crash file is also generated by the JVM itself during the crash. Usually the file name is “hs_err_pid<WebLogicPID>.log” in case of Sun JDK. Similarly, the JRockit JVM generates a textual file with the name “*.dump” in case of a JVM crash.
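
As a quick first aid step, the textual crash files mentioned above can be located with a few lines of Java. This is only a minimal sketch: the class name CrashLogFinder and the directory argument are made up for illustration, and the only patterns it knows are the two naming conventions described above (hs_err_pid*.log for Sun JDK and *.dump for JRockit).

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of any JDK or WebLogic API.
class CrashLogFinder {

    // True if the file name follows one of the two conventions named in the article.
    static boolean isCrashLog(String fileName) {
        boolean hotspot = fileName.startsWith("hs_err_pid") && fileName.endsWith(".log");
        boolean jrockit = fileName.endsWith(".dump");
        return hotspot || jrockit;
    }

    public static void main(String[] args) throws IOException {
        // Directory to scan: first argument, or the current working directory.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                if (isCrashLog(entry.getFileName().toString())) {
                    System.out.println(entry.toAbsolutePath());
                }
            }
        }
    }
}
```

Pointing it at the directory the server was started from is usually the first thing to try, since that is where these files tend to be written.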

Use of -XX:+ShowMessageBoxOnError?

A thread dump is also very helpful for analyzing a server crash. It tells us the status and activities of the threads at the time of the crash. It may be possible to get a thread dump before the process exits: HotSpot supports the JAVA_OPTIONS flag -XX:+ShowMessageBoxOnError

The corresponding JRockit JVM option is -Djrockit.waitonerror. When the JVM is crashing, it may prompt the user “Do you want to debug the problem?” This pauses the process, thereby creating an opportunity to generate a thread dump (a stack trace of every thread in the JVM), attach a debugger, or perform some other debugging activity. However, this does not work in all cases (e.g. in case of a stack overflow).

Along with the above, there are various other ways to get thread dumps, as described in: http://middlewaremagic.com/weblogic/?p=823

What May Cause JVM Crash?

Reason-1). Usually native code causes the JVM crash. Native code is code written in languages like C/C++ and called through the Java Native Interface (JNI) APIs.
Reason-2). JDBC drivers, especially the native drivers.
Reason-3). JVM code optimization.
Reason-4). Too little memory available for the native area of the Java process.
Reason-5). The application server's native performance pack libraries.
Reason-6). The JVM's own libraries can cause the crash.
Reason-7). High CPU utilization by the threads, as described in: http://middlewaremagic.com/weblogic/?p=4348
Reason-8). Presence of wrong native libraries in the PATH or in the “-Djava.library.path”.
Reason-9). Presence of a different version of libraries in “java.library.path”, “SHLIB_PATH” or “LD_LIBRARY_PATH”, for example a 64-bit library on the library path of a 32-bit JVM or vice versa.
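
To check Reason-8 and Reason-9 from inside the JVM, a small report like the following can help. It is a sketch under assumptions: LibraryPathReport is a made-up class name, and “sun.arch.data.model” is a HotSpot specific system property that other JVMs may not set, which is why the code falls back to “unknown”.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical diagnostic class; prints what the running JVM actually sees.
class LibraryPathReport {

    // Splits a path list such as java.library.path and drops empty entries.
    static List<String> splitPath(String raw, String separator) {
        List<String> entries = new ArrayList<>();
        for (String entry : raw.split(Pattern.quote(separator))) {
            if (!entry.isEmpty()) {
                entries.add(entry);
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        // 32 or 64 on HotSpot; the property is an assumption on other JVMs.
        System.out.println("JVM data model : " + System.getProperty("sun.arch.data.model", "unknown"));
        System.out.println("os.arch        : " + System.getProperty("os.arch"));

        String libraryPath = System.getProperty("java.library.path", "");
        for (String entry : splitPath(libraryPath, File.pathSeparator)) {
            // Flag entries that do not exist, a common source of confusion.
            String note = new File(entry).isDirectory() ? "" : "  (not a directory)";
            System.out.println("library path   : " + entry + note);
        }
    }
}
```

The report only shows where the JVM will look; whether a given library in those directories is 32-bit or 64-bit still has to be checked with the Operating System's own tools.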

Tools To Analyze the Core Dumps?

A core/crash dump is Operating System specific, so to analyze these dumps we must use the tools provided by the same Operating System vendor. Various tools are provided by the Operating Systems to analyze core dumps, like:
Tool-1). The Dr. Watson tool on Windows: Start (button) -> Run... -> drwtsn32
Tool-2). “pstack” and “pmap” on Solaris.
Tool-3). “procstack” and “procmap” on AIX.
Tool-4). “lsstack” and “pmap” on Linux.
Tool-5). “pflag”, if available, on HP-UX.

What May Help To Avoid JVM Crash?

How to avoid the JVM crash next time depends entirely on what caused the crash and which libraries were involved. But the following things should be taken into consideration while analyzing and avoiding the crash.

Point-1). If the application is using a native JDBC driver and that driver is causing issues, then switching from the native JDBC driver (Type-2 driver) to the pure Java JDBC driver (Type-4 driver) may help.
Point-2). If the JVM libraries are causing the crash, then upgrade to a later version of the JDK, provided the application server has that JDK in its supported configuration list.
Point-3). If JVM code optimization is causing the crash, then disable code optimization by applying one of these JVM options:
For JRockit JVM code optimization can be disabled using the JAVA_OPTION -Xnoopt
For Sun JDK code optimization can be disabled using the JAVA_OPTION -Xint
Point-4). Sometimes the code generation of the “Just In Time” (JIT) compiler also causes the JVM crash. In these scenarios, in case of Sun JDKs, disabling the JIT compiler can help. We can disable it by adding the JVM option: “-Djava.compiler=none”
Point-5). Check the “-Djava.library.path” for a library of a different bit version (32-bit or 64-bit) than the JVM, and remove or replace it.
Point-6). Disable the application server's native performance packs. In WebLogic the native IO can be disabled using the “-Dweblogic.NativeIOEnabled=false” JVM option.
.
Thanks
Jay SenSharma

Basic JVM Tuning Tips

Hi All,
Jay SenSharma

NOTE: The prerequisite for this post is that you are aware of the JVM architecture and its different generations. If not, then please refer to: http://middlewaremagic.com/weblogic/?p=4464

Middleware performance is directly related to JVM tuning. Due to incorrect JVM tuning we may face OutOfMemory, high CPU, stuck thread or server crash kinds of issues. There is no standard set of values that makes a JVM 100% tuned, because JVM tuning depends entirely on the platform (Operating System), the number of CPUs and the nature of the application (How many objects does it create? How many of them are long living? How does it cache objects? etc.). But keeping the following things in mind while tuning the JVM may be really helpful.

Here are some very basic JVM tuning tips. The values of the different JVM tuning parameters discussed in this article are not absolute; they may vary according to your environment and requirements.

Tip-1). If the Observed GC Time Is Very Long.

If we observe that the full garbage collection is taking a long time (a healthy JVM usually takes around 2 to 2.5 seconds for a full GC) then we should try the following things:

Point-1). -Xincgc: Applying the “-Xincgc” JVM option instructs the JVM to use the incremental garbage collection strategy. With this option the garbage collector collects a fraction of the heap at a time rather than the whole heap space at once, which reduces the long GC pauses.
Point-2). Apart from the above, we can decide to decrease the maximum heap size (if a very large heap is not required), because with a very large heap the garbage collector takes longer to perform a full GC.
Point-3). If decreasing the heap is not an option, then we can think of adjusting -XX:MaxNewSize (in JDK1.3 and JDK1.4) or -Xmn (the newer name of the young generation flag, from JDK1.4 onwards). Increasing -Xmn (the young generation area) helps in scenarios where the application creates mostly short living objects (applications with little caching), because a larger young generation increases the time between minor GCs, so most of the application objects die before they need to be promoted.

Example-1:
Suppose the maximum heap is -Xmx1024m; then we can choose the young generation (Eden plus survivor spaces) as -Xmn340m. In other words, the young generation is one third of the maximum heap.
Example-2:
The same can be achieved by setting -XX:NewRatio, because -XX:NewRatio=2 means a 2:1 ratio of tenured to young generation.
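
The arithmetic behind the two examples can be written down as a tiny sketch. YoungGenSizing is a hypothetical helper, and the result is only the nominal split: a real JVM rounds the generation sizes to its own alignment, so the actual numbers may differ slightly.

```java
// Hypothetical helper that reproduces the -XX:NewRatio arithmetic from the examples.
class YoungGenSizing {

    // With NewRatio = tenured:young, the young generation gets heap / (ratio + 1).
    static long youngGenMb(long heapMb, int newRatio) {
        if (heapMb <= 0 || newRatio < 1) {
            throw new IllegalArgumentException("heapMb must be > 0 and newRatio >= 1");
        }
        return heapMb / (newRatio + 1);
    }

    public static void main(String[] args) {
        long heapMb = 1024;   // -Xmx1024m
        int newRatio = 2;     // -XX:NewRatio=2
        long youngMb = youngGenMb(heapMb, newRatio);
        // Prints: young = 341 MB, tenured = 683 MB
        System.out.println("young = " + youngMb + " MB, tenured = " + (heapMb - youngMb) + " MB");
    }
}
```

So -XX:NewRatio=2 on a 1024 MB heap lands at roughly the same place as the -Xmn340m setting from Example-1.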

Tip-2).  Full GC Is Happening Very Frequently?

Many times an application calls System.gc() or Runtime.getRuntime().gc() to perform explicit garbage collection (which is not recommended). Apart from this, if an application makes many Remote Method Invocation (RMI) calls, then there are also chances that full GC may happen very frequently. In this case we can try disabling the explicit garbage collection calls using the JVM option: -XX:+DisableExplicitGC
Similarly, if we do not require very frequent RMI garbage collection, then we can use the JVM option: -Dsun.rmi.dgc.client.gcInterval=1800000 The value is in milliseconds, so the explicit RMI garbage collection will now happen every 30 minutes.

Tip-3).  OutOfMemoryError: unable to create new native thread?

If you very frequently see “java.lang.OutOfMemoryError: unable to create new native thread”, it usually happens because the -Xss stack size is very large. The thread stack is a memory area where each thread places its local variables and maintains its call stack. Many applications create lots of threads that need little memory for their local variables, yet the JVM still allocates a large stack for each of those threads, so after a certain number of threads have been created the JVM is not able to allocate space for a new one.

So for these kinds of environments it is recommended to decrease the stack size, because the default value is -Xss512k on the Sparc 32-bit JVM and 1024k on the Sparc 64-bit JVM. If we want to create more threads and those threads need only a small stack, then we should try decreasing the stack size.
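
Besides the global -Xss option, the stack size can also be requested per thread through the four-argument Thread constructor. The sketch below assumes a worker that needs very little stack; SmallStackThreads and the 256k figure are illustrative only, and according to the Thread Javadoc the stackSize argument is just a hint that some platforms may round up or ignore.

```java
// Hypothetical example: run a small task on a thread with a reduced stack size.
class SmallStackThreads {

    // Doubles the value on a worker thread that asks for the given stack size.
    static int doubleOnSmallStack(int value, long stackBytes) {
        final int[] result = new int[1];
        // Thread(ThreadGroup, Runnable, String, long): the last argument is the stack size hint.
        Thread worker = new Thread(null, () -> result[0] = value * 2, "small-stack-worker", stackBytes);
        worker.start();
        try {
            worker.join(); // join() makes the worker's write to result[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        // 256k requested instead of the platform default stack size.
        System.out.println(doubleOnSmallStack(21, 256 * 1024));
    }
}
```

This keeps the default stack size for threads that need it while letting the many lightweight threads use less native memory each.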

Tip-4). How To Preserve Memory ?

We should avoid loading libraries into the JVM's memory if our application does not require them. For example, if our application does not require the JVM's graphics libraries, then we should disable loading them using the JVM option “-Djava.awt.headless=true”

Thanks
Jay SenSharma

OutOfMemory Causes and First Aid Steps?

Hi All,
Jay SenSharma

In response to the comment: http://middlewaremagic.com/weblogic/?p=4456#comment-2026 we developed a post on the Java heap. Thanks to “Swordfish” for asking us about this very important topic.

In most environments we usually see some kind of memory related issue, like OutOfMemory or server slowness. Most of the time it is related to inaccurate tuning of the JVM; sometimes it happens due to a bug or an inaccurate tuning configuration in the application frameworks, or it may happen due to the application code as well, like an object leak in the application code.

NOTE: The prerequisite for this post is that you are already aware of the different memory spaces available as part of the Java process. If not, then please quickly review: http://middlewaremagic.com/weblogic/?p=4456

Here we are going to see what causes OutOfMemory issues and why they happen, along with some basic first aid steps to debug this kind of issue.

What is OutOfMemory?

An OutOfMemory is a condition in which there is not enough space left to allocate the required memory for new objects, libraries or native code. OutOfMemory can be divided into three main categories:

1). OutOfMemory in Java Heap:

This happens when the JVM is not able to allocate the required memory space for a Java object. There may be many reasons behind this, like:
Point-1). A very small heap size allocation, which means setting the MaxHeapSize (-Xmx) parameter to a very low value.
Point-2). Leaking of objects. Either the application is not dereferencing its unused objects, or third party frameworks (Hibernate/Spring/Seam etc.) are not releasing their references to the objects due to some inaccurate configuration.
Point-3). In many cases the reason is that the JDBC connection objects which the application code gets from the DataSource are not being released back to the connection pool.
Point-4). The garbage collection strategy may be incorrect for the environment/application requirements.
Point-5). Inaccurate settings of the application/framework caches.
Example:
Exception in thread "Thread-10" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2882)
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
at java.lang.StringBuilder.append(StringBuilder.java:119)
at java.lang.Throwable.toString(Throwable.java:344)
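
Before an error like the one above appears, the heap fill level can be watched from inside the JVM through the standard java.lang.management API. The following is a minimal sketch; HeapUsageProbe is a made-up name, and getMax() may legitimately return -1 when no maximum is defined, which the helper reports as -1.0.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical probe that prints how full the Java heap currently is.
class HeapUsageProbe {

    // Percentage of the maximum heap in use, or -1.0 if the maximum is undefined.
    static double usedPercent(long usedBytes, long maxBytes) {
        if (maxBytes <= 0) {
            return -1.0;
        }
        return 100.0 * usedBytes / maxBytes;
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        System.out.printf("heap used = %d MB, max = %d MB, used = %.1f%%%n",
                heap.getUsed() >> 20, heap.getMax() >> 20,
                usedPercent(heap.getUsed(), heap.getMax()));
    }
}
```

A value that keeps climbing towards 100% across full GCs is a hint that one of the points listed above applies.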

2). Native OutOfMemory:

Native OutOfMemory is a scenario in which the JVM is not able to allocate the memory required for native libraries and JNI code.
Native memory is an area which is usually used by the JVM for its internal operations and to execute JNI code. The JVM uses native memory for code optimization and for loading classes and libraries, along with intermediate code generation.
The size of the native memory depends on the architecture of the Operating System and on the amount of memory which is already committed to the Java heap. Native memory is the process area where the JNI code, the JVM libraries, the native performance packs and the proxy modules get loaded.
Native OutOfMemory can happen due to the following main reasons:
Point-1). Setting a very large StackSize (-Xss). StackSize is a memory area allocated to each individual thread, where it places its local variables. Thread stacks are taken from native memory, so large stacks for many threads exhaust it.
Point-2). It is often seen because of an incorrect Tuxedo setting. The WebLogic Tuxedo Connector allows interoperability between Java applications deployed on WebLogic Server and native services deployed on Tuxedo servers, and Tuxedo uses JNI code intensively.
Point-3). Too little RAM or swap space.
Example: For details on this kind of error please refer to: http://middlewaremagic.com/weblogic/?p=422
Point-4). It may also occur if our application is using a very large number of JSPs. The JSPs need to be converted into Java code and then compiled, which also requires DTD and custom tag library resolution. This usually consumes more native memory.
Example:
Exception in thread "main" java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:574)
at TestXss.main(TestXss.java:18)

3).  OutOfMemory in PermGen Space:

Permanent Generation is a non-heap memory area inside the JVM space. Many times we see OutOfMemory in this area. The PermGen area is NOT present in JRockit JVMs. For more details on this area please refer to: http://middlewaremagic.com/weblogic/?p=4456

The PermGen area is measured independently from the other generations because this is the place where the JVM allocates classes, class structures, methods and reflection objects. PermGen is a non-heap area, which means we DO NOT count it as part of the Java heap.
The OutOfMemory in the PermGen area can be seen because of the following main reasons:
Point-1). Deploying and redeploying a very large application which has many classes inside it.
Point-2). An application is getting deployed/updated/redeployed repeatedly using the auto deployment feature of the container. In that case the classes belonging to the application are not cleaned up and remain in the PermGen area without class garbage collection.
Point-3). The “-Xnoclassgc” Java option is added while starting the server. In that case the classes which are no longer required will not be garbage collected.
Point-4). Very little space is allocated through “-XX:MaxPermSize”.
Example: you can see the following kind of trace in the server/stdout logs:
<Notice> <Security> <BEA-090171> <Loading the identity certificate and private key stored under the alias DemoIdentity from the jks keystore file D:\ORACLE\MIDDLE~1\WLSERV~1.3\server\lib\DemoIdentity.jks.>
Exception in thread "[STANDBY] ExecuteThread: '1' for queue: 'weblogic.kernel.Default (self-tuning)'" java.lang.OutOfMemoryError: PermGen space
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:621)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:124)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:260)
at java.net.URLClassLoader.access$000(URLClassLoader.java:56)
at java.net.URLClassLoader$1.run(URLClassLoader.java:195)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
at weblogic.work.ExecuteThread.execute(ExecuteThread.java:209)
at weblogic.work.ExecuteThread.run(ExecuteThread.java:173)
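
To see which memory pools a given JVM has, and which of them are counted as heap or non-heap, the standard MemoryPoolMXBean interface can be queried. This is a sketch with a made-up class name, MemoryPoolLister. The pool names depend on the JVM vendor, version and garbage collector: on the Sun JDKs discussed here a pool for the permanent generation shows up as non-heap, while Java 8 and later HotSpot versions report “Metaspace” instead.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that lists the memory pools of the running JVM.
class MemoryPoolLister {

    // Names of all pools of the given type (HEAP or NON_HEAP).
    static List<String> poolNames(MemoryType type) {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == type) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            long usedKb = usage == null ? 0 : usage.getUsed() / 1024;
            System.out.printf("%-32s %-16s %d KB used%n", pool.getName(), pool.getType(), usedKb);
        }
    }
}
```

Watching the non-heap pool that holds class data across several redeployments is a simple way to confirm the behaviour described in Point-2.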

What to do in case of OutOfMemory In JavaHeap?

Whenever we see an OutOfMemory in the server log or in the stdout of the server, we should try the following things as first aid steps:

Point-1). If possible, enable the following JAVA_OPTIONS in the server start scripts to get information about the garbage collection status:
-verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails  -Xloggc:/opt/app/GCLogsDirectory/gc.log
Point-2). We always need to see which objects were present when the OutOfMemory error occurred, to identify whether they belong to the application code, the application framework code or the application server APIs, so that we can isolate the issue. To get the details of the heap objects, collect a heap dump using jmap (much better suited for this than the jhat tool). Please refer to this post to know how to do it: http://middlewaremagic.com/weblogic/?p=2241
Point-3). Once we have collected the heap dump, we can easily inspect the heap details using GUI tools like the “jhat” web browser view or the “Eclipse Memory Analyzer”.
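
As an alternative to running jmap from outside, a HotSpot JVM can also be asked to write a heap dump from inside the process. This is not the approach of the linked post, only a sketch of a different technique: HeapDumper is a made-up class, it relies on the HotSpot specific com.sun.management.HotSpotDiagnosticMXBean (so it is an assumption that your JVM provides it), the target file must not exist yet, and newer JDKs expect the name to end in “.hprof”.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;

// Hypothetical helper: triggers a heap dump of the current JVM (HotSpot only).
class HeapDumper {

    // Writes the dump to outputFile; liveOnly = true dumps only reachable objects.
    static File dump(String outputFile, boolean liveOnly) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            bean.dumpHeap(outputFile, liveOnly);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new File(outputFile);
    }

    public static void main(String[] args) {
        // File name is illustrative; pass another path as the first argument.
        String target = args.length > 0 ? args[0] : "manual-dump-" + System.currentTimeMillis() + ".hprof";
        File written = dump(target, true);
        System.out.println("heap dump written to " + written.getAbsolutePath());
    }
}
```

The resulting file can be opened with the same tools mentioned in Point-3.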

OutOfMemoryError: GC overhead limit exceeded?

“GC overhead limit exceeded” indicates that more than 98% of the total time is being spent doing garbage collection and less than 2% of the heap is recovered.

“GC overhead limit exceeded” in general points to one of the following causes:
Point-1). The heap is too small, or its current size is not suitable for your application. Try increasing the -Xmx value while starting your process.

Point-2). There might be a memory leak, which means a particular kind of object is being created again and again but is not being garbage collected due to a leak in the code (application code, third party code, application server code, or even a JVM memory leak).

Point-3). The old generation of the heap might be very small compared to the new generation, and objects might be getting promoted to the old generation prematurely. And we know that GC happens less frequently in the old generation compared to the young generation.

Point-4). If increasing the heap size (-Xmx) or tuning the old generation size does not help, then it might be a memory leak in the application code/container code.

It is better to take a heap dump and see what kind of objects are filling up the heap. That will indicate which objects might be leaking, or whether the heap size is sufficient or not.
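
A rough idea of how much time a JVM spends in garbage collection can be obtained from the standard GarbageCollectorMXBean interface. The sketch below (GcOverheadProbe is a made-up name) computes the cumulative GC time since JVM start as a share of the uptime. This is an assumption-laden approximation and not the JVM's own check, which looks at recent collections rather than the whole lifetime, but a steadily rising value still points in the same direction.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical probe: cumulative GC time as a percentage of JVM uptime.
class GcOverheadProbe {

    // Share of the uptime spent in GC, in percent; 0.0 if uptime is not positive.
    static double gcTimePercent(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 0.0;
        }
        return 100.0 * gcMillis / uptimeMillis;
    }

    // Sum of the accumulated collection times of all collectors.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gcTime = totalGcMillis();
        System.out.printf("GC time = %d ms of %d ms uptime (%.2f%%)%n",
                gcTime, uptime, gcTimePercent(gcTime, uptime));
    }
}
```

Combined with the GC log options shown earlier, this gives a quick sanity check before taking the heap dump.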

What to do in case of Native OutOfMemory?

Point-1). Usually a native OutOfMemory causes a server/JVM crash. So it is always recommended to apply the following JAVA_OPTIONS flag in the server start script to instruct the JVM to generate a heap dump: “-XX:+HeapDumpOnOutOfMemoryError”
By default the heap dump is created in a file called java_pid<pid>.hprof in the working directory of the VM. You can specify an alternative file name or directory with “-XX:HeapDumpPath=C:/someLocation/”

Note: The above flags are also suitable for collecting a heap dump in case of a Java heap OutOfMemory. But these flags never guarantee that the JVM will always generate the heap dump in every OutOfMemory situation.

Point-2). Usually, in case of a native OutOfMemory, a “hs_err_pid.log” file is created in case of Sun JDK and a “xxxx.dump” file is created in case of JRockit JDK. These log files are usually text files and tell us about the libraries which caused the crash. These files need to be collected and analyzed to find out the root cause.
Point-3). Make sure that -XX:MaxHeapSize is not set to a very large value, because that leaves very little space for the native area: as soon as we increase the heap size, the native area decreases. Please see the post: http://middlewaremagic.com/weblogic/?p=4456
Point-4). Keep monitoring the process's memory using the Unix utility ‘ps’ as follows:
ps -p <PID> -o vsz
Here you need to pass the WebLogic Server's PID (process ID) to get the virtual memory size of the process.
Point-5). If the heap usage is low, that is, if you see that your application uses little heap memory, then it is always better to reduce the MaxHeapSize so that the native area automatically increases.
Point-6). Sometimes the JVM's code optimization causes a native OutOfMemory or a crash. In this case we can disable the code optimization feature of the JVM.
(Note: disabling the code optimization of the JVM will decrease its performance.)
For JRockit JVM code optimization can be disabled using the JAVA_OPTION -Xnoopt
For Sun JDK code optimization can be disabled using the JAVA_OPTION -Xint
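
The trade-off between heap size and native area from Point-3 and Point-5 is simple subtraction, sketched below. Everything in it is an assumption for illustration: NativeHeadroom is a made-up helper, the 4096 MB process limit stands for the theoretical address space of a 32-bit process, and in practice the usable limit is lower because the Operating System and thread stacks also take their share.

```java
// Hypothetical helper: what is left of a process limit after heap and PermGen are reserved.
class NativeHeadroom {

    // Remaining space for the native area in MB, never negative.
    static long nativeHeadroomMb(long processLimitMb, long maxHeapMb, long maxPermMb) {
        long remaining = processLimitMb - maxHeapMb - maxPermMb;
        return Math.max(remaining, 0L);
    }

    public static void main(String[] args) {
        long limitMb = 4096; // assumed address space of a 32-bit process
        long permMb = 256;   // assumed -XX:MaxPermSize=256m
        // Prints: heap 3072 MB -> native 768 MB
        System.out.println("heap 3072 MB -> native " + nativeHeadroomMb(limitMb, 3072, permMb) + " MB");
        // Prints: heap 1024 MB -> native 2816 MB
        System.out.println("heap 1024 MB -> native " + nativeHeadroomMb(limitMb, 1024, permMb) + " MB");
    }
}
```

The numbers only illustrate the direction of the effect: every megabyte given to the Java heap is a megabyte that native libraries and thread stacks can no longer use.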

What to do in case of OutOfMemory In PermGen?

Point-1). Make sure that the PermGen area is not set to a very small value.
Point-2). Usually, if an application has many JSP pages, every JSP will be converted to a *.class file before the JSP request is processed. So a large number of JSPs generates a large number of *.class files, and all these classes get loaded in the PermGen area.
Point-3). There is no standard formula to say which value of MaxPermSize will suit your requirement, because it completely depends on the kind of frameworks, APIs, number of JSPs etc. you are using in your application. The number of classes which have to be loaded will vary based on that. But if you really want to tune the MaxPermSize, then you should first start with some base value like 256M or 512M, and if you still get the OutOfMemory then please follow the instructions below to troubleshoot it.
Point-4). If you are repeatedly getting the OutOfMemory in PermGen space, then it could be a classloader leak: maybe some of the classes are not being unloaded from the PermGen area of the JVM. So please try to increase the -XX:MaxPermSize=512M or a little more and see if it goes away.
If not, then add the following JAVA_OPTIONS to trace the class loading and unloading to find out the root cause:
-XX:+TraceClassLoading and -XX:+TraceClassUnloading
Point-5). If users want to investigate which kind of classes are consuming more PermGen space, then they can use the “$JAVA_HOME/bin/jmap” utility as follows:

    $JAVA_HOME/bin/jmap -permstat $PID  >& permstat.out

The above utility will dump the PermGen statistics for the class loaders of that JVM process (we are passing the process ID to this command as $PID). This helps us in understanding whether there is any classloader leak or whether particular classes are consuming more memory in PermGen. Collecting a heap dump also gives a good idea of this.

Thanks
Jay SenSharma

Copyright © 2010-2012 Middleware Magic. All rights reserved.