JVM Tuning

OutOfMemory Causes and First Aid Steps?

Hi All,

In response to the comment http://middlewaremagic.com/weblogic/?p=4456#comment-2026 we developed a post on the Java Heap. Thanks to “Swordfish” for asking us about this very important topic.
.
In most environments we usually see some kind of memory-related issue, like OutOfMemory errors or server slowness. Most of the time it is caused by inaccurate tuning of the JVM; sometimes it happens due to a bug or inaccurate configuration in an application framework, and sometimes due to the application code itself, for example an object leak in the application code.
.
NOTE: The prerequisite for this post is that you are already aware of the different memory spaces available as part of a Java process. If not, then please quickly review: http://middlewaremagic.com/weblogic/?p=4456
.
Here we are going to see what causes OutOfMemory issues and why they happen, along with some basic first aid steps to debug these kinds of issues.

What is OutOfMemory?

An OutOfMemory is a condition in which there is not enough space left to allocate the required space for new objects, libraries or native code. OutOfMemory errors can be divided into three main categories:

1). OutOfMemory in Java Heap:

This happens when the JVM is not able to allocate the required memory space for a Java object. There may be many reasons behind this, like:
Point-1). A very small heap size allocation, i.e. setting the MaxHeapSize (-Xmx) parameter to a very small value.
.
Point-2). Leaking of objects. Either the application is not unreferencing unused objects, or third-party frameworks (Hibernate/Spring/Seam…etc.) might not be releasing object references due to some inaccurate configuration (a minimal sketch of such a leak is shown after the example below).
.
Point-3). In many cases the reason is that the application code gets JDBC connection objects from the DataSource but does not release them back to the connection pool.
.
Point-4). The Garbage Collection strategy may be incorrect for the environment/application requirements.
.
Point-5). Inaccurate settings of application/framework caches.
Example:
Exception in thread "Thread-10" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2882)
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
at java.lang.StringBuilder.append(StringBuilder.java:119)
at java.lang.Throwable.toString(Throwable.java:344)
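For illustration, here is a minimal, hypothetical sketch of the object-leak case described in Point-2: the static list keeps every allocated object reachable, so the Garbage Collector can never reclaim them and the JVM eventually throws the same java.lang.OutOfMemoryError: Java heap space (run it with a small -Xmx to see the error quickly).

import java.util.ArrayList;
import java.util.List;

public class HeapLeakDemo {
    // Objects added here stay reachable forever, so GC can never reclaim them.
    private static final List<byte[]> LEAK = new ArrayList<byte[]>();

    public static void main(String[] args) {
        while (true) {
            LEAK.add(new byte[1024 * 1024]); // keep allocating 1 MB chunks
        }
    }
}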

2). Native OutOfMemory:

Native OutOfMemory is a scenario in which the JVM is not able to allocate the memory required for native libraries and JNI code.
Native memory is an area which is usually used by the JVM for its internal operations and to execute JNI code. The JVM uses native memory for code optimization and for loading classes and libraries, along with intermediate code generation.
The size of the native memory depends on the architecture of the operating system and the amount of memory which is already committed to the Java Heap. Native memory is the process area where JNI code, JVM libraries, native performance packs and proxy modules get loaded.
Native OutOfMemory can happen due to the following main reasons:
.
Point-1). An inappropriate StackSize (-Xss) setting. The stack is a memory area allocated to each individual thread, where it keeps its method call frames and local variables, and every thread's stack is carved out of native memory.
.
Point-2). It may also be seen because of incorrect WebLogic Tuxedo Connector settings. The WebLogic Tuxedo Connector allows interoperability between Java applications deployed on WebLogic Server and native services deployed on Tuxedo servers, and Tuxedo uses JNI code intensively.
.
Point-3). Insufficient RAM or swap space.
Example: For details on this kind of error Please refer to: http://middlewaremagic.com/weblogic/?p=422
.
Point-4). It may also occur if our application uses a very large number of JSPs. The JSPs need to be converted into Java code and then compiled, which requires DTD and custom tag library resolution as well, and this usually consumes more native memory.
Example:
Exception in thread "main" java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:574)
at TestXss.main(TestXss.java:18)
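As an illustration only (the class name TestXss just mirrors the trace above; the body is hypothetical, so do not run this on a shared machine), a tiny program that keeps starting threads which never exit will eventually exhaust the native memory used for thread stacks and fail with the same error:

public class TestXss {
    public static void main(String[] args) {
        while (true) {
            new Thread(new Runnable() {
                public void run() {
                    try {
                        Thread.sleep(Long.MAX_VALUE); // keep every thread alive
                    } catch (InterruptedException ignored) {
                    }
                }
            }).start();
        }
    }
}

Because every thread's stack comes out of native memory, a larger -Xss value makes this limit get hit sooner.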

3).  OutOfMemory in PermGen Space:

Permanent Generation is a non-heap memory area inside the JVM space, and many times we see OutOfMemory errors in this area. The PermGen area is NOT present in JRockit JVMs. For more details on this area please refer to: http://middlewaremagic.com/weblogic/?p=4456
.
The PermGen area is measured independently from the other generations because this is the place where the JVM allocates classes, class structures, methods and reflection objects. PermGen is a non-heap area, which means we DO NOT count the PermGen area as part of the Java Heap.
OutOfMemory in the PermGen area can be seen because of the following main reasons:
Point-1). Deploying and redeploying a very large application which has many classes inside it.
.
Point-2). If an application is deployed/updated/redeployed repeatedly, for example using the auto-deployment feature of the container, the classes belonging to the application may stay in the PermGen area without being cleaned up by class garbage collection.
.
Point-3). If the “-noclassgc” Java option is added while starting the server. In that case the class instances which are no longer required will not be garbage collected.
.
Point-4). Allocating very little space to “-XX:MaxPermSize”.
Example: you can see the following kind of trace in the server/stdout logs:
<Notice> <Security> <BEA-090171> <Loading the identity certificate and private key stored under the alias DemoIdentity from the jks keystore file D:\ORACLE\MIDDLE~1\WLSERV~1.3\server\lib\DemoIdentity.jks.>
Exception in thread "[STANDBY] ExecuteThread: '1' for queue: 'weblogic.kernel.Default (self-tuning)'" java.lang.OutOfMemoryError: PermGen space
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:621)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:124)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:260)
at java.net.URLClassLoader.access$000(URLClassLoader.java:56)
at java.net.URLClassLoader$1.run(URLClassLoader.java:195)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
at weblogic.work.ExecuteThread.execute(ExecuteThread.java:209)
at weblogic.work.ExecuteThread.run(ExecuteThread.java:173)

What to do in case of OutOfMemory In JavaHeap?

Whenever we see an OutOfMemory error in the server log or in the stdout of the server, we should try the following things as first aid steps:
.
Point-1). If possible, enable the following JAVA_OPTIONS in the server start scripts to get information about the Garbage Collection status:
-verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails  -Xloggc:/opt/app/GCLogsDirectory/gc.log
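For example (assuming a Unix start script such as setDomainEnv.sh; the GC log directory is only an illustration), the flags can be appended to the existing JAVA_OPTIONS:
JAVA_OPTIONS="${JAVA_OPTIONS} -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -Xloggc:/opt/app/GCLogsDirectory/gc.log"
export JAVA_OPTIONS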
.
Point-2). It is always necessary to see which objects were present when the OutOfMemory error occurred, to identify whether those objects belong to the application code, the application framework code or the Application Server APIs, so that we can isolate the issue. In order to get the details of the heap objects, collect a heap dump using either JHat (not a great tool) or JMap (much better compared to the JHat tool). Please refer to the following post to learn how to do it: http://middlewaremagic.com/weblogic/?p=2241
.
Point-3). Once we have collected the heap dump we can easily analyze the heap details using a GUI tool like the JHat web browser or the Eclipse Memory Analyzer; example commands are shown below.
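For example (Sun/HotSpot JDK assumed; the PID, port and file names below are placeholders), a binary heap dump can be collected with jmap and then browsed with jhat:

    $JAVA_HOME/bin/jmap -dump:format=b,file=/tmp/heap.hprof <PID>
    $JAVA_HOME/bin/jhat -port 7000 /tmp/heap.hprof

Once jhat starts, the heap contents can be browsed at http://localhost:7000. The same .hprof file can also be opened directly in the Eclipse Memory Analyzer.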

OutOfMemoryError: GC overhead limit exceeded?

The “GC overhead limit exceeded” error indicates that more than 98% of the total time is being spent doing garbage collection while less than 2% of the heap is recovered.

The “GC overhead limit exceeded” error generally points to one of the following causes:
Point-1). The heap is too small, or the current size might not be suitable for your application. Try increasing the -Xmx value while starting your process.

Point-2). There might be a memory leak, which means a particular kind of object might be getting created again and again but never garbage collected due to a leak in the code (application code, third-party code, an Application Server code leak, or even a JVM memory leak).

Point-3). The old generation size of the heap might be very small compared to the new generation, so objects might be getting promoted to the old generation prematurely. And we know that GC happens less frequently in the old generation compared to the young generation.

Point-4). If increasing the heap size (-Xmx) or tuning the old generation size does not help, then it might be a memory leak in the application code or container code.

It is better to take a heap dump and see what kind of objects are filling up the heap. That will indicate what might be leaking and whether the heap size is sufficient or not.

What to do in case of Native OutOfMemory?

Point-1). Usually a Native OutOfMemory causes a server/JVM crash, so it is always recommended to apply the following JAVA_OPTIONS flag in the server start script to instruct the JVM to generate a heap dump: “-XX:+HeapDumpOnOutOfMemoryError”.
By default the heap dump is created in a file called java_pid<pid>.hprof in the working directory of the VM. You can specify an alternative file name or directory with “-XX:HeapDumpPath=C:/someLocation/”.
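For illustration, both flags are usually combined on one line (the dump directory is just a placeholder):
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/opt/app/dumps/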
.
Note: The above flags are also suitable for collecting a heap dump in case of a Java Heap OutOfMemory. However, these flags never guarantee that the JVM will always generate the heap dump in every OutOfMemory situation.
.
Point-2). Usually in case of a Native OutOfMemory an “hs_err_pid<pid>.log” file is created by the Sun JDK and a “*.dump” file is created by the JRockit JDK. These are usually text files and tell us about the libraries which caused the crash. These files need to be collected and analyzed to find out the root cause.
.
Point-3). Make sure that the maximum heap size (-Xmx) is not set to a very large value, because that leaves very little native space: as soon as we increase the heap size, the native area decreases. Please see the post: http://middlewaremagic.com/weblogic/?p=4456
.
Point-4). Keep monitoring the process's memory using the Unix utility ‘ps’, like the following:
ps -p <PID> -o vsz
Here you need to pass the WebLogic Server's PID (process ID) to get its virtual memory size (VSZ).
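As a simple illustration (Unix shell assumed; PID 12345 and the output file are placeholders), the virtual size can be sampled every minute so that steady native growth becomes visible over time:

    while true; do ps -p 12345 -o pid,vsz,rss >> /tmp/wls_vsz.log; sleep 60; done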
.
Point-5). If the heap usage is low, or you see that your application uses little heap memory, then it is always better to reduce the MaxHeapSize so that the native area automatically gets increased.
.
Point-6). Sometimes the JVM's code optimization causes a Native OutOfMemory or a crash, so in this case we can disable the code optimization feature of the JVM.
(Note: disabling the code optimization of the JVM will decrease JVM performance.)
For the JRockit JVM, code optimization can be disabled using the JAVA_OPTION -Xnoopt
For the Sun JDK, code optimization can be disabled using the JAVA_OPTION -Xint

What to do in case of OutOfMemory In PermGen?

Point-1). Make sure that the PermGen area is not set to a very small value.
.
Point-2). Usually, if an application has many JSP pages, every JSP will be converted to a *.class file before the JSP request can be processed. So a large number of JSPs causes the generation of a large number of *.class files, and all these classes get loaded in the PermGen area.
.
Point-3). There is no standard formula to say which value of MaxPermSize will suit your requirement, because it completely depends on the kind of frameworks, APIs, number of JSPs, etc. you are using in your application; the number of classes which have to be loaded will vary based on that. If you really want to tune MaxPermSize, then you should first start with some base value like 256M or 512M, and if you still get the OutOfMemory then please follow the instructions below to troubleshoot it.
Point-4). If you are repeatedly getting OutOfMemory in PermGen space then it could be a classloader leak.
Maybe some of the classes are not being unloaded from the PermGen area of the JVM, so please try to increase -XX:MaxPermSize=512M (or a little more) and see if the error goes away.
If not, then add the following JAVA_OPTIONS to trace the class loading and unloading and find out the root cause:
-XX:+TraceClassLoading and -XX:+TraceClassUnloading
Point-5). If users want to investigate which kind of classes are consuming more PermGen space, we can use the “$JAVA_HOME/bin/jmap” utility as follows:

    $JAVA_HOME/bin/jmap -permstat $PID  >& permstat.out

The above command dumps the list of classes loaded in that JVM process (we pass the process ID to the command as $PID). This helps us understand whether there is a classloader leak or whether a particular class is consuming more memory in PermGen, etc. Collecting a heap dump also gives a good idea about this.

.
.
Thanks
Jay SenSharma

Parts Of JVM And JVM Architecture Diagram?

Hi All,

The JVM is the heart of any Java-based Application Server, and we face most issues due to incorrect JVM tuning. It is very important to understand the overall architecture of the JVM in order to troubleshoot different JVM tuning related issues. Here we are going to discuss the architecture and major parts of a Java process and the Java Heap division.
.
The following diagram is just a basic overview of a Java process on a machine with a 2 GB process size. Usually on 32-bit Windows operating systems the default process size is 2 GB (on Unix-based 64-bit operating systems it can be 4 GB or more). So I drew the following diagram of a Java process to explain the Java process partitions on a machine with a 2 GB process size.
Java Process Architecture Diagram

In the above diagram we can see the different partitions of a Java process. Please compare the above diagram with the descriptions below (an illustrative set of JVM options matching this layout follows the list).
.
1). Just as an example, the total process size is 2048 MB (2 GB).
2). The Java Heap size is 1024 MB (1 GB):   -Xmx1024m
3). Native space = (ProcessSize – MaxHeapSize – MaxPermSize), which works out to around 768 MB of native space.
4). The MaxPermSize is set to -XX:MaxPermSize=256m
5). The young generation space is around 40% of the maximum Java Heap.
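For illustration only, a set of JVM options matching the layout in the diagram could look like the following (the -Xmn value is just an approximation of 40% of the 1 GB heap):

    -Xms1024m -Xmx1024m -XX:MaxPermSize=256m -Xmn410m

With these settings the remaining native space works out to roughly 2048 – 1024 – 256 = 768 MB.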

Could not reserve enough space for object heap

Many times, while providing some value for the max heap or max PermGen size, we get the following kind of error:

Error occurred during initialization of VM
Could not reserve enough space for object heap
Could not create the Java virtual machine.

In the above case users should decrease the JVM process size by decreasing the heap size, the PermGen size or the -Xss value. The cause of this error is insufficient memory.

The maximum theoretical heap limit for the 32-bit JVM is 4G. Due to various additional constraints such as available swap, kernel address space usage, memory fragmentation, and VM overhead, in practice the limit can be much lower. On most modern 32-bit Windows systems the maximum heap size will range from 1.4G to 1.6G. On 32-bit Solaris kernels the address space is limited to 2G. On 64-bit operating systems running the 32-bit VM, the max heap size can be higher, approaching 4G on many Solaris systems. As of Java SE 6, the Windows /3GB boot.ini feature is not supported.

Why choose a 64-bit JVM?
A 64-bit capable J2SE is an implementation of the Java SDK (and the JRE along with it) that runs in the 64-bit environment of a 64-bit OS on a 64-bit processor.

The primary advantage of running Java in a 64-bit environment is the larger address space. This allows for a much larger Java heap size and an increased maximum number of Java Threads, which is needed for certain kinds of large or long-running applications. The primary complication in doing such a port is that the sizes of some native data types are changed. Not surprisingly the size of pointers is increased to 64 bits. On Solaris and most Unix platforms, the size of the C language long is also increased to 64 bits. Any native code in the 32-bit SDK implementation that relied on the old sizes of these data types is likely to require updating.

What Are these Different Parts?

Eden Space:

Eden space is the part of the Java Heap where the JVM initially creates new objects; most objects die there and are quickly cleaned up by minor garbage collections (note: a Full Garbage Collection is different from a Minor Garbage Collection). Usually any new object created inside a Java method goes into Eden space, and its space is reclaimed once the method execution completes, whereas the instance variables of a class usually live longer, until the object based on that class gets destroyed. When Eden fills up it causes a minor collection, in which some surviving objects are moved to an older generation.

Survivor Spaces:

The young generation also contains two survivor spaces; one survivor space is empty at any given time. The survivor spaces serve as the destination of the next copying collection of any living objects in Eden and the other survivor space.
The parameter SurvivorRatio can be used to tune the size of the survivor spaces.
-XX:SurvivorRatio=6 sets the ratio between Eden and each survivor space to 6:1.
If the survivor spaces are too small, the copying collection overflows directly into the tenured generation.
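A quick worked example (the 400 MB young generation is chosen only for illustration): with -Xmn400m and -XX:SurvivorRatio=6 the young generation is divided into 8 equal parts (6 for Eden, 1 for each survivor space), so Eden gets 6/8 × 400 MB = 300 MB and each survivor space gets 1/8 × 400 MB = 50 MB.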

Young Generation: (-XX:MaxNewSize)

Up to JDK 1.3 and 1.4 we used to set the young generation size using -XX:MaxNewSize; from JDK 1.4 onwards we set the young generation size using the -Xmn JVM option.
The young generation size is also controlled by NewRatio. Setting -XX:NewRatio=3 means that the ratio between the old generation and the young generation is 3:1.
.
Similarly, -XX:NewRatio=8 means an 8:1 ratio of tenured (old) generation to young generation.
NewRatio: NewRatio is the ratio of the old generation to the young generation; it has default values of 2 on Sparc, 12 on client Intel, and 8 everywhere else.
NOTE: From JDK 1.4 onwards the young generation size can also be set using -Xmn.
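A quick worked example (the heap size is chosen only for illustration): with -Xms1200m -Xmx1200m and -XX:NewRatio=3, the old generation gets 3 parts and the young generation 1 part, so the young generation is roughly 1200/4 = 300 MB and the old generation roughly 900 MB.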

Virtual Space-1: (MaxNewSize – NewSize)

The first virtual space is the difference between -XX:NewSize and -XX:MaxNewSize, or in other words the difference between the initial young generation size and the maximum young generation size.

Java Heap Area: (-Xmx and -Xms)

The Java Heap is a memory area inside the Java process which holds the Java objects. The Java Heap is a combination of the young generation heap and the old generation heap. We can set the initial Java Heap size using the -Xms JVM parameter; similarly, if we want to set the maximum heap size we can use the -Xmx JVM parameter.

Example:
-Xmx1024m —> sets the maximum heap size to 1 GB
-Xms512m —> sets the initial Java Heap size to 512 MB
.
NOTE-1): It is always recommended to set the initial and the maximum heap size to the same value for better performance.
NOTE-2): The theoretical limit of the maximum heap size for a 32-bit JVM is 4 GB. Because of memory fragmentation, kernel space addressing, swap memory usage and virtual machine overheads, the JVM does not allow us to allocate the whole 4 GB of memory to the heap in a 32-bit JVM. So usually on 32-bit Windows operating systems the maximum is between 1.4 GB and 1.6 GB.
.
If we want a larger memory allocation according to our application requirements, then we must choose a 64-bit operating system with a 64-bit JVM. A 64-bit JVM provides a larger address space, so we can have a much larger Java Heap along with an increased thread allocation area. Depending on the nature of your operating system, on a 64-bit JVM you can even set the maximum heap size up to 32 GB.
Example:        -Xms32g -Xmx32g -Xmn4g

Virtual Space-2: (MaxHeapSize – InitialHeapSize)

The second virtual space is the difference between the maximum heap size (-Xmx) and the initial heap size (-Xms). It is called virtual space because initially the JVM will allocate only the initial heap size, and then the heap can grow up to the MaxHeapSize as required.
.

PermGen Space: (-XX:MaxPermSize)

PermGen is a non-heap memory area where class loading happens and where the JVM allocates space for classes, class metadata, Java methods and reference objects. The PermGen is independent from the heap area. It can be resized according to the requirements using the -XX:MaxPermSize and -XX:PermSize JVM options. Garbage collection happens in this area of JVM memory as well; the garbage collection in this area is called “Class GC”. We can disable class garbage collection using the JVM option -noclassgc; if “-noclassgc” is added while starting the server, the classes which are no longer required will not be garbage collected.

Native Area:

Native memory is an area which is usually used by the JVM for its internal operations and to execute JNI code. The JVM uses native memory for code optimization and for loading classes and libraries, along with intermediate code generation.
The size of the native memory depends on the architecture of the operating system and the amount of memory which is already committed to the Java Heap. Native memory is the process area where JNI code, JVM libraries, native performance packs and proxy modules get loaded.
There is no JVM option available to size the native area, but we can calculate it approximately using the following formula:
NativeMemory = (ProcessSize – MaxHeapSize – MaxPermSize)
.
.
Thanks
Jay SenSharma

What Is a Server Hang And What Needs To Be Done?

Hi All,


Thread dump analysis is the most important part of finding out why a server is responding slowly, is hung, or sometimes even why it crashed, if the crash happened because of stuck threads. Here we are going to see some very basic but important features of the WebLogic thread model and the functionality and tasks performed by these threads. We will also cover some very common terminology which we use while analyzing a thread dump or a hung server situation.

What is a Hang Server Situation and What are Its Symptoms?

1). If a server responds very slowly compared to the estimated response time, then it may be moving towards a total server hang situation.

2). If a server does not even respond to clients' requests, then it is a complete server hang scenario. Sometimes the complete stuck-ness of a server may also cause a server crash.

Roles & Responsibilities  Of WebLogic Threads?

The WebLogic threads can be broadly categorized into two main categories.
1). WebLogic Execute Threads: These threads are responsible for processing the clients'/users' requests. This decides how many tasks a server can perform in parallel. By default, in development mode the WLS server has 15 execute threads, and if we run the server in production mode then it has 25 execute threads by default. Increasing the execute thread count does not necessarily mean an increase in performance; in some cases it may even degrade the performance.
2). WebLogic Socket Reader Threads: These threads listen for the incoming clients' requests, meaning they basically deal with the network traffic part. They are a percentage of the execute threads. The default ratio of socket reader threads to execute threads is 33%, meaning 33% of the threads will be socket readers by default; this can be tuned according to the requirements.
Example: if we have 15 execute threads in development mode of a WLS server, then around 33% of these threads will work as socket reader threads and the rest of the threads will be processing the clients' requests.

What are Execute Queues?

An Execute Queue is a group of execute threads which takes care of designated Servlets/RMI Objects/EJBs/JSPs…etc. Up to WLS 8.1 we could see this Execute Queue information as part of the “config.xml” file. From WLS 9.x onwards the WebLogic threading model changed because of the introduction of WorkManagers, so we won't see the Execute Queue details by default in WLS 9.x and later. But if we still want to use the WLS 8.1 style of threading model in WLS 9.x or later versions of WebLogic, then we can use the following JAVA_OPTION: -Dweblogic.Use81StyleExecuteQueues=true

Possible Causes of Server Hang?

There may be many reasons behind server slow responsiveness or a hang scenario:

Cause 1). If the free heap memory is very low, then the threads will NOT be able to create the required objects in the Java Heap.

Cause 2). Insufficient number of threads. It sometimes happens that if the load (number of user requests) on the server suddenly increases and the MaxThreadCount is not set to a correct value, the server will not be able to process that many requests.

Cause 3). If Garbage Collection is taking a long time, then the garbage cleanup process will take longer and the threads will be doing garbage collection rather than processing clients' requests, or the threads will be waiting for some free memory in order to create objects in the heap.

Cause 4). Sometimes Java code optimization also causes a temporary hang scenario, because code optimization is a somewhat heavy process, although useful for better performance.

Cause 5). Many remote JDBC lookups can sometimes cause the hang scenario.

Cause 6). Inaccurate JSP compilation settings. (It is recommended to always precompile the JSPs before deploying them to production environments, and PageCheckSeconds must be set correctly so that JSP compilation does not happen very frequently.)

Cause 7). Application code deadlock, JDBC driver deadlocks or a vendor API bug: threads wait in an infinite loop to gain locks on objects. An example scenario is below (a Java sketch of this deadlock follows the steps).
Example:
1).  Thread_A has Gained Lock on Object Obj_A
2).  Thread_B Gained Lock on Object Obj_B
3).  Now, after performing some operations on Obj_A, Thread_A tries to get the lock on Obj_B (Obj_B is already locked by Thread_B).
4).  Similarly, after performing some operations on Obj_B, Thread_B wants to gain the lock on Obj_A (Obj_A is already locked by Thread_A).

In the above scenario both threads are now waiting for each other to release the lock on their object, but neither of them actually releases the lock on the object it already holds.
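A minimal, hypothetical Java sketch of the same two-lock deadlock is shown below (the names mirror the steps above). A thread dump taken while this program is hung will show both threads blocked, each waiting for the monitor the other one holds, which is exactly what to look for in WebLogic thread dumps:

public class DeadlockDemo {
    private static final Object OBJ_A = new Object();
    private static final Object OBJ_B = new Object();

    public static void main(String[] args) {
        Thread threadA = new Thread(new Runnable() {
            public void run() {
                synchronized (OBJ_A) {            // Step 1: Thread_A locks Obj_A
                    sleepQuietly(100);            // give Thread_B time to lock Obj_B
                    synchronized (OBJ_B) {        // Step 3: waits forever for Obj_B
                        System.out.println("Thread_A got both locks");
                    }
                }
            }
        });
        Thread threadB = new Thread(new Runnable() {
            public void run() {
                synchronized (OBJ_B) {            // Step 2: Thread_B locks Obj_B
                    sleepQuietly(100);
                    synchronized (OBJ_A) {        // Step 4: waits forever for Obj_A
                        System.out.println("Thread_B got both locks");
                    }
                }
            }
        });
        threadA.start();
        threadB.start();
    }

    private static void sleepQuietly(long millis) {
        try { Thread.sleep(millis); } catch (InterruptedException ignored) { }
    }
}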

Cause 8). If the number of file descriptors is very low (insufficient resources), we might also face slow server responses or server hang situations.

How will the Server Log look in case of Hang Scenarios or Slow Responsiveness?

By default, 600 seconds (10 minutes) is the duration after which WebLogic Server declares a thread as a STUCK thread. The occurrence of a stuck thread can be seen in the server logs; you will see the following kind of entry in the server logs:

<Warning> <WebLogicServer> <BEA-000337> <ExecuteThread: '7' for queue: 'weblogic.kernel.Default' has been busy for "630" seconds working on the request "weblogic.ejb20.internal.JMSMessagePoller@d64412", which is more than the configured time (StuckThreadMaxTime) of "600" seconds.>

Does the Above kind of Entry in the Server Log mean WebLogic is Hung?

The above message in the server log is just an indication that some threads are taking a long time to process some requests, which may lead to a hang server situation and of course slow responsiveness of WebLogic Server. But it does not mean with 100% certainty that WebLogic cannot recover these threads.

WebLogic has the capability to declare a STUCK thread as UNSTUCK. WebLogic periodically checks for stuck threads based on the settings of “Stuck Thread Max Time” and “Stuck Thread Timer Interval”: http://e-docs.bea.com/wls/docs103/ConsoleHelp/taskhelp/tuning/TuningExecuteThreads.html

WebLogic will just report the thread as stuck, and in case the thread progresses (maybe it seemed to be stuck but was actually running a long transaction), WebLogic will declare it as unstuck.
Many times our application requirements say that a thread may take more time to process the client's request (more than 600 seconds); in that case we can change the “StuckThreadMaxTime” interval. For example, if we know that our application has some long-running JDBC queries which may take up to 900 seconds, then we can increase the StuckThreadMaxTime.

What First Aid Steps are Required to Collect Debug Data?

Debugging-1). Try to ping the WebLogic Server 5-6 times using the “weblogic.Admin PING” utility to see how quickly we get the response back:

java weblogic.Admin -url t3://StuckThreadHostName:9001  -username weblogic -password weblogic  PING

Debugging-2). As soon as you get time, check the verbose GC logs (Garbage Collection logs), provided you have already enabled Garbage Collection logging by applying the following JAVA_OPTIONS:

set JAVA_OPTIONS=%JAVA_OPTIONS%  -Xloggc:/opt/logs/gc.log -XX:+PrintGCDetails -Xmx1024m -Xms1024m

Debugging-3). Collect at least 4-5 thread dumps taken at an interval of 9-10 seconds each to see the activities of the threads. Follow the various ways of collecting thread dumps: http://middlewaremagic.com/weblogic/?p=823
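For example (Sun/HotSpot JDK on Unix assumed; the PID is a placeholder), a series of thread dumps can be requested like this:

    for i in 1 2 3 4 5; do jstack <PID> > /tmp/threaddump_$i.txt; sleep 10; done

Alternatively, kill -3 <PID> asks the JVM to print the thread dump to the server's stdout.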

Debugging-4). Check whether the load (number of client requests) is/was abnormally high on the server.

Debugging-5). Check whether the JMS subsystem connectivity or the database connectivity has been lost somewhere. You may find some strange entries in the server logs such as “DataSource is Disabled”, “Network Adapter could not establish connection to the Database”, the JMS messaging system's “PeerGoneException”, Tuxedo/Jolt connectivity errors, etc.
.

.

Thanks

Jay SenSharma


Copyright © 2010-2012 Middleware Magic. All rights reserved.