WebLogic Server is in ADMIN State?

Hi,

Jay SenSharma

This post responds to Mr. Chris Giddings' comment/query on the ADMIN state.
Here we look at a scenario that is very common but troublesome. Often, when we restart Managed Servers, they stop in the ADMIN state instead of moving on to the RUNNING state. This usually happens when the server is unable to activate one of the modules deployed on it: an application that does not activate properly, a DataSource, JTA recovery, or a JMS system/subsystem. To find the root cause we need to read the server logs closely, and in particular check which configuration changes were made on the server recently.

More Alerts On the same Monitoring:  http://middlewaremagic.com/weblogic/?p=5838

Most probable cause: In most cases this happens when a database is down and WebLogic tries to create the connection pool at boot time. One way to avoid this is to set the InitialCapacity of the DataSource to 0 (zero), so that WebLogic does not try to create any JDBC connections at startup. This avoids connection creation failures during boot, and with them the move to the ADMIN state.

NOTE: We can forcibly move a WLS server from the ADMIN state to the RUNNING state, which works in almost 80% of cases. Even when we can bring the server to the RUNNING state this way, we still must find out why it moved to the ADMIN state in the first place. So please treat this article as a workaround and not as a solution.

In this demonstration we will see a simple WLST script which checks whether any server in the domain is in the ADMIN state. If so, it tries to force that server to move to the RUNNING state.

Step1). Create a directory somewhere in your file system, like: “C:\WLST_AdminStateCheck”

Step2). Write a properties file “domain.properties” inside “C:\WLST_AdminStateCheck” like following:

domain.name=7001_Plain_Domain
admin.url=t3://localhost:7001
admin.userName=weblogic
admin.password=weblogic1

totalServersToMonitor=2
server.1.url=t3://localhost:7001
server.2.url=t3://localhost:7003
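
Before running anything against a live domain, it can help to confirm that totalServersToMonitor agrees with the numbered server.N.url keys. The following is a minimal sketch in plain Python (not WLST); the helper names parse_properties and server_urls are hypothetical and only illustrate the key layout used above.

```python
# Standalone check of the domain.properties layout (plain Python, not WLST).
# Helper names are illustrative; only the key names come from the file above.

SAMPLE = """\
domain.name=7001_Plain_Domain
admin.url=t3://localhost:7001
admin.userName=weblogic
admin.password=weblogic1

totalServersToMonitor=2
server.1.url=t3://localhost:7001
server.2.url=t3://localhost:7003
"""


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and comment lines."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def server_urls(props):
    """Return server.1.url .. server.N.url, where N is totalServersToMonitor."""
    total = int(props["totalServersToMonitor"])
    urls = []
    for i in range(1, total + 1):
        key = "server.%d.url" % i
        if key not in props:
            raise KeyError("missing property: " + key)
        urls.append(props[key])
    return urls


if __name__ == "__main__":
    for url in server_urls(parse_properties(SAMPLE)):
        print(url)
```

If the count is larger than the number of server.N.url entries, the sketch raises a KeyError naming the missing key, which is easier to read than a failed connect later on.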

Step3). Write the WLST script “serverAdminState.py” inside the “C:\WLST_AdminStateCheck” directory.


#############################################################################
#
# @author Copyright (c) 2010 - 2011 by Middleware Magic, All Rights Reserved.
#
#############################################################################

from java.io import FileInputStream
from java.util import Properties

# Load the connection details and the list of servers to check
propInputStream = FileInputStream("domain.properties")
configProps = Properties()
configProps.load(propInputStream)
propInputStream.close()

domainName=configProps.get("domain.name")
adminURL=configProps.get("admin.url")
adminUserName=configProps.get("admin.userName")
adminPassword=configProps.get("admin.password")
totalServerToMonitor=configProps.get("totalServersToMonitor")

# Connect to each server in turn (server.1.url, server.2.url, ...) and read its state
i=1
while (i <= int(totalServerToMonitor)) :
	url=configProps.get("server."+ str(i)+".url")
	connect(adminUserName,adminPassword,url)
	# cmo now points at the ServerRuntimeMBean of the connected server
	serverRuntime()
	state=cmo.getState()
	name=cmo.getName()
	if state == 'ADMIN' :
		print "ALERT::::::::Server Name: " + name + " Is currently in State: " + state
		try:
			# Ask the server to move from ADMIN to RUNNING
			print 'Resuming Server: .....'
			cmo.resume()
			print "Server: "+name +"Moved to State : " + cmo.getState()
		except:
			print "NOTE:::::::::Unable to Move Server: " + name + " To good State"
	else:
		print ''
		print ''
		print "GOOD::::::::> Server Name: " + name + " Is currently in State: " + state + '                     :)'
	i = i + 1

Step4). Use the ‘cd’ command to move inside <BEA_HOME>/wlserver_10.3/server/bin, then run “. ./setWLSEnv.sh”, that is, with two DOTs separated by a single space before the actual script name, like following:
. ./setWLSEnv.sh

Note: the first DOT means that the environment is set in the current shell, and the ./ that follows means that the script is executed from the current directory.

Step5). Now run the WLST script like following:

java weblogic.WLST serverAdminState.py

Following would be the output:

java weblogic.WLST serverAdminState.py

Initializing WebLogic Scripting Tool (WLST) ...

Welcome to WebLogic Server Administration Scripting Shell

Type help() for help on available commands

Connecting to t3://localhost:7001 with userid weblogic ...
Successfully connected to Admin Server 'AdminServer' that belongs to domain '7001_Plain_Domain'.

Warning: An insecure protocol was used to connect to the
server. To ensure on-the-wire security, the SSL port or
Admin port should be used instead.

Location changed to serverRuntime tree. This is a read-only tree with ServerRuntimeMBean as the root.
For more help, use help(serverRuntime)

GOOD::::::::> Server Name: AdminServer Is currently in State: RUNNING                     :)
Connecting to t3://localhost:7003 with userid weblogic ...
Successfully connected to managed Server 'ManagedServer-1' that belongs to domain '7001_Plain_Domain'.

Warning: An insecure protocol was used to connect to the
server. To ensure on-the-wire security, the SSL port or
Admin port should be used instead.

ALERT::::::::Server Name: ManagedServer-1 Is currently in State: ADMIN
Resuming Server: .....
Server: ManagedServer-1Moved to State : RUNNING

.
.
Regards,
Jay SenSharma


WebLogic SNMP Monitoring for Server Health State

Hi,

Jay SenSharma

This post was created because one of our subscribers, Shashi.rsb, was facing an issue in creating an SNMP trap for the server health state. Here is a very simple demonstration and configuration to receive SNMP traps for the health state of a Managed Server running as part of our WebLogic domain. In this demo we will see how to configure the SNMP Agent and the SNMP trap to provide us the health state information of a WebLogic Server. Usually if the Server’s Health state is HEALTH_OK,

At any point in time a WebLogic server may be in any of the following health states:

public static final int HEALTH_OK;
public static final int HEALTH_WARN;
public static final int HEALTH_CRITICAL;
public static final int HEALTH_FAILED;
public static final int HEALTH_OVERLOADED;

So, based on your requirement, you can choose any health state for SNMP monitoring. I am using HEALTH_OK for this demo.

Also Notice a kind of BUG:  http://middlewaremagic.com/weblogic/?p=6149#comment-3264

Step1). Start the WebLogic Admin Server, log in to the Admin Console and then create an SNMP Agent like following:

[Screenshot: SNMP_HealthState_1]

Step2). Now create a String Monitor by clicking on “Diagnostics --> SNMP --> ServerSNMPAgent-0 --> String Monitors”

[Screenshot: SNMP_HealthState_2]

Step3). Now create a Trap Destination under “Diagnostics --> SNMP --> ServerSNMPAgent-0 --> Trap Destinations”

[Screenshot: SNMP_HealthState_3]

Step4). After making all the above configurations your “” file will look like the following:

  <snmp-agent-deployment>
    <name>ServerSNMPAgent-0</name>
    <enabled>true</enabled>
    <send-automatic-traps-enabled>true</send-automatic-traps-enabled>
    <snmp-port>1161</snmp-port>
    <snmp-trap-version>1</snmp-trap-version>
    <community-prefix>public</community-prefix>
    <snmp-trap-destination>
      <name>TrapDestination-0</name>
      <host>10.10.10.10</host>
      <port>1165</port>
      <community>public</community>
      <security-level>noAuthNoPriv</security-level>
    </snmp-trap-destination>
    <snmp-string-monitor>
      <name>SNMPStringMonitor-1</name>
      <enabled-server>AdminServer,MS-1</enabled-server>
      <monitored-m-bean-type>ServerRuntime</monitored-m-bean-type>
      <monitored-m-bean-name>MS-1</monitored-m-bean-name>
      <monitored-attribute-name>HealthState</monitored-attribute-name>
      <polling-interval>10</polling-interval>
      <string-to-compare>HEALTH_OK</string-to-compare>
      <notify-differ>false</notify-differ>
      <notify-match>false</notify-match>
    </snmp-string-monitor>
    <community-based-access-enabled>true</community-based-access-enabled>
    <snmp-engine-id>ServerSNMPAgent-0</snmp-engine-id>
    <authentication-protocol>noAuth</authentication-protocol>
    <privacy-protocol>noPriv</privacy-protocol>
    <inform-retry-interval>10000</inform-retry-interval>
    <max-inform-retry-count>1</max-inform-retry-count>
    <localized-key-cache-invalidation-interval>3600000</localized-key-cache-invalidation-interval>
    <snmp-access-for-user-m-beans-enabled>false</snmp-access-for-user-m-beans-enabled>
    <inform-enabled>false</inform-enabled>
    <master-agent-x-port>1705</master-agent-x-port>
    <target>AdminServer</target>
  </snmp-agent-deployment>
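
One detail worth checking in a listing like the one above is whether each string monitor has at least one of notify-match and notify-differ switched on; in the listing both are false. The following is a minimal sketch in plain Python that flags such monitors. It assumes the fragment is saved without an XML namespace, as shown above (a real domain configuration file normally declares one, which this sketch does not handle), and the function name silent_string_monitors is hypothetical.

```python
# Flag SNMP string monitors that have neither notify flag enabled.
# Sketch only: assumes an un-namespaced fragment like the listing above.
import xml.etree.ElementTree as ET

SAMPLE = """<snmp-agent-deployment>
  <name>ServerSNMPAgent-0</name>
  <snmp-string-monitor>
    <name>SNMPStringMonitor-1</name>
    <monitored-m-bean-type>ServerRuntime</monitored-m-bean-type>
    <monitored-m-bean-name>MS-1</monitored-m-bean-name>
    <monitored-attribute-name>HealthState</monitored-attribute-name>
    <string-to-compare>HEALTH_OK</string-to-compare>
    <notify-differ>false</notify-differ>
    <notify-match>false</notify-match>
  </snmp-string-monitor>
</snmp-agent-deployment>"""


def is_true(element, tag):
    """Return True when the child element <tag> contains the text 'true'."""
    text = element.findtext(tag, default="false")
    return text.strip().lower() == "true"


def silent_string_monitors(xml_text):
    """Names of string monitors with both notify-match and notify-differ off."""
    root = ET.fromstring(xml_text)
    silent = []
    for monitor in root.iter("snmp-string-monitor"):
        if not is_true(monitor, "notify-match") and not is_true(monitor, "notify-differ"):
            silent.append(monitor.findtext("name"))
    return silent


if __name__ == "__main__":
    for name in silent_string_monitors(SAMPLE):
        print("No notify flag enabled for monitor: " + name)
```

Whether a monitor in this state should still send traps is not something the listing settles, so treat a hit only as a prompt to review the monitor's settings in the console.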

Step5). Restart your AdminServer.

Step6). Now start the “SnmpTrapMonitor” like following (make sure to run “. ./setWLSEnv.sh” first, to set the environment in the same shell prompt where you are planning to run the command):

java weblogic.diagnostics.snmp.cmdline.Manager SnmpTrapMonitor -p 1165

Step7). Now, as soon as you start your Managed Server, you will get the following kind of trap in the above shell/command prompt once the Managed Server (MS-1) has the HEALTH_OK state:

java weblogic.diagnostics.snmp.cmdline.Manager SnmpTrapMonitor -p 1165
Listening on port:1165
--- Snmp Trap Received ---
    Version        : v1
    Source         : UdpEntity:10.10.10.10:1161
    Community      : public
    Enterprise     : enterprises.140.625
    TrapOID        : enterprises.140.625.0.65
    RawTrapOID     : 1.3.6.1.4.1.140.625.0.65
    Trap Objects   : {
   { enterprises.140.625.100.5=Mon Mar 21 20:17:02 IST 2011 }
   { enterprises.140.625.100.10=MS-1 }
}
    Raw VarBinds   : {
   { enterprises.140.625.100.5=Mon Mar 21 20:17:02 IST 2011 }
   { enterprises.140.625.100.10=MS-1 }
}
--- Snmp Trap Received ---
    Version        : v1
    Source         : UdpEntity:10.10.10.10:1161
    Community      : public
    Enterprise     : enterprises.140.625
    TrapOID        : enterprises.140.625.0.75
    RawTrapOID     : 1.3.6.1.4.1.140.625.0.75
    Trap Objects   : {
   { enterprises.140.625.100.5=Mon Mar 21 20:17:11 IST 2011 }
   { enterprises.140.625.100.10=MS-1 }
   { enterprises.140.625.100.55=jmx.monitor.error.type }
   { enterprises.140.625.100.60=null }
   { enterprises.140.625.100.65=null }
   { enterprises.140.625.100.70=com.bea:Location=MS-1,Name=MS-1,Type=ServerRuntime }
   { enterprises.140.625.100.75=ServerRuntime }
   { enterprises.140.625.100.80=HealthState }
}
    Raw VarBinds   : {
   { enterprises.140.625.100.5=Mon Mar 21 20:17:11 IST 2011 }
   { enterprises.140.625.100.10=MS-1 }
   { enterprises.140.625.100.55=jmx.monitor.error.type }
   { enterprises.140.625.100.60=null }
   { enterprises.140.625.100.65=null }

.
.
Thanks
Jay SenSharma


Analyzing Garbage Collection Log

Hi,
Jay SenSharma

It’s always best to enable garbage collection logging, in our production environment as well, because it adds very little overhead and has no noticeable side effect on the performance of WebLogic Server or any other application server. The GC log helps us investigate many issues. Apart from that, its statistics help us find out whether some tuning is required.

Garbage collection logging can be enabled and collected in a separate log file by using the following JAVA_OPTIONS:
-Xloggc:D:/gcLogs/GCLogs.log -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
These JAVA_OPTIONS are JVM specific (the ones above work fine for Sun JDK and OpenJDK). As soon as you add them, the JVM starts writing the garbage collection logging to the file named by -Xloggc. If you open this file you will see something like the following:
4.636: [GC [PSYoungGen: 230400K->19135K(268800K)] 230400K->19135K(2058752K), 0.0635710 secs] [Times: user=0.08 sys=0.01, real=0.06 secs]
7.302: [GC [PSYoungGen: 249535K->38396K(268800K)] 249535K->51158K(2058752K), 0.0777300 secs] [Times: user=0.21 sys=0.04, real=0.07 secs]
7.521: [GC [PSYoungGen: 49735K->38388K(268800K)] 62496K->51933K(2058752K), 0.0741680 secs] [Times: user=0.15 sys=0.04, real=0.07 secs]
7.595: [Full GC (System) [PSYoungGen: 38388K->0K(268800K)] [PSOldGen: 13545K->51794K(1789952K)] 51933K->51794K(2058752K) [PSPermGen: 19868K->19868K(39936K)], 0.3066610 secs] [Times: user=0.28 sys=0.02, real=0.31 secs]
9.752: [GC [PSYoungGen: 230400K->26206K(268800K)] 282194K->78000K(2058752K), 0.0728380 secs] [Times: user=0.15 sys=0.00, real=0.08 secs]
11.906: [GC [PSYoungGen: 256606K->38393K(268800K)] 308400K->94759K(2058752K), 0.1058920 secs] [Times: user=0.19 sys=0.00, real=0.10 secs]
13.480: [GC [PSYoungGen: 268793K->38394K(268800K)] 325159K->109054K(2058752K), 0.0762360 secs] [Times: user=0.20 sys=0.03, real=0.08 secs]
18.115: [GC [PSYoungGen: 268794K->38384K(268800K)] 339454K->179238K(2058752K), 0.1351350 secs] [Times: user=0.42 sys=0.10, real=0.14 secs]
20.860: [GC [PSYoungGen: 268784K->38394K(268800K)] 409638K->200343K(2058752K), 0.1063430 secs] [Times: user=0.29 sys=0.03, real=0.11 secs]
22.148: [GC [PSYoungGen: 268794K->38399K(268800K)] 430743K->221395K(2058752K), 0.1173980 secs] [Times: user=0.24 sys=0.02, real=0.12 secs]
23.357: [GC [PSYoungGen: 268799K->26775K(268800K)] 451795K->231618K(2058752K), 0.0714130 secs] [Times: user=0.15 sys=0.03, real=0.08 secs]
24.449: [GC [PSYoungGen: 257175K->29170K(268800K)] 462018K->239909K(2058752K), 0.0312400 secs] [Times: user=0.06 sys=0.01, real=0.04 secs]
You can notice a few things in the above output:
Point-1). [Full GC (System) [PSYoungGen: 38388K->0K(268800K)]    It means a Full GC is happening on the complete heap, including all the areas of the Java heap space. The (System) tag means the collection was requested explicitly through System.gc().

Point-2). [GC [PSYoungGen: 230400K->19135K(268800K)]   Indicates a minor GC. These small collections happen very frequently in the young generation and clean up its short-lived objects.

Point-3). In the line [GC [PSYoungGen: 230400K->19135K(268800K)] the young generation size is 268800K (about 262.5 MB). Before the collection the young generation held 230400K (225 MB), and after the collection this was down to 19135K (about 18.7 MB).

Point-4). We can read the Full GC line the same way to see how effective the collection was: [Full GC (System) [PSYoungGen: 38388K->0K(268800K)] [PSOldGen: 13545K->51794K(1789952K)]   Here the old generation size is 1789952K (about 1.71 GB) and the young generation size is 268800K (about 262.5 MB), so the total heap size is 2058752K, which is roughly 2 GB.

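The unit conversions in Point-3 and Point-4 can be checked with a short standalone sketch in plain Python. It assumes minor GC lines with exactly the PSYoungGen layout shown in the sample log; the function name parse_young_gc and the regular expression are illustrative and not part of any tool.

```python
# Convert the K figures of a minor GC line to MB (plain Python sketch).
import re

# Young generation figures followed by whole-heap figures, as printed for
# minor collections in the sample log: before->after(capacity).
PATTERN = re.compile(
    r"\[PSYoungGen: (\d+)K->(\d+)K\((\d+)K\)\] "
    r"(\d+)K->(\d+)K\((\d+)K\)"
)


def kb_to_mb(kb):
    """Convert kilobytes to megabytes, rounded to one decimal place."""
    return round(kb / 1024.0, 1)


def parse_young_gc(line):
    """Return the sizes of a minor GC line in MB, or None if it does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    young_before, young_after, young_size, heap_before, heap_after, heap_size = [
        int(group) for group in match.groups()
    ]
    return {
        "young_before_mb": kb_to_mb(young_before),
        "young_after_mb": kb_to_mb(young_after),
        "young_size_mb": kb_to_mb(young_size),
        "heap_before_mb": kb_to_mb(heap_before),
        "heap_after_mb": kb_to_mb(heap_after),
        "heap_size_mb": kb_to_mb(heap_size),
    }


if __name__ == "__main__":
    sample = ("4.636: [GC [PSYoungGen: 230400K->19135K(268800K)] "
              "230400K->19135K(2058752K), 0.0635710 secs]")
    for key, value in sorted(parse_young_gc(sample).items()):
        print("%s = %s" % (key, value))
```

For the first line of the sample log this gives 225.0 MB before, 18.7 MB after and a young generation size of 262.5 MB, inside a heap of 2010.5 MB. Full GC lines have a different layout, so the sketch returns None for them.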
But analyzing the garbage collection log line by line like this is very tedious. An alternative way is to analyze the log in a few seconds with a tool, to see how much time the Full GCs take on average, along with other reports.

Step1). Download the “garbagecat-1.0.0.jar   (881 KB)” tool from the following link: http://garbagecat.eclipselabs.org.codespot.com/files/garbagecat-1.0.0.jar

Step2). Open a command prompt and make sure that Java is set in the PATH, so that we can use the “java” command of the JDK to run the “garbagecat-1.0.0.jar” tool.

Step3). Put the “garbagecat-1.0.0.jar” file and the “GCLog.log” file in the same directory, then run the following command:
java -jar garbagecat-1.0.0.jar GCLog.log

Step4). As soon as you run the above command you will see the following files in your current directory:
garbagecat-1.0.0.jar
GCLog.log
gcdb.lck
gcdb.log
gcdb.properties
report.txt

Step5). Now open the “report.txt” file to see the overall garbage collection report, something like the following:
========================================
SUMMARY:
========================================
# GC Events: 12
GC Event Types: PARALLEL_SCAVENGE, PARALLEL_SERIAL_OLD
Max Heap Space: 2058752K
Max Heap Occupancy: 462018K
Max Perm Space: 39936K
Max Perm Occupancy: 19868K
Throughput: 95%
Max Pause: 306 ms
Total Pause: 1233 ms
First Timestamp: 4636 ms
Last Timestamp: 24449 ms
========================================
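
The pause figures in this summary can be cross-checked against the sample log with a few lines of standalone Python. This is only a sketch: it assumes each log line carries its pause as ", N secs]" and truncates every pause to whole milliseconds, which reproduces the numbers in the report above. Whether garbagecat computes them in exactly this way is an assumption, and the function names are hypothetical.

```python
# Recompute "# GC Events", "Max Pause" and "Total Pause" from the sample log.
import re

LOG = """\
4.636: [GC [PSYoungGen: 230400K->19135K(268800K)] 230400K->19135K(2058752K), 0.0635710 secs] [Times: user=0.08 sys=0.01, real=0.06 secs]
7.302: [GC [PSYoungGen: 249535K->38396K(268800K)] 249535K->51158K(2058752K), 0.0777300 secs] [Times: user=0.21 sys=0.04, real=0.07 secs]
7.521: [GC [PSYoungGen: 49735K->38388K(268800K)] 62496K->51933K(2058752K), 0.0741680 secs] [Times: user=0.15 sys=0.04, real=0.07 secs]
7.595: [Full GC (System) [PSYoungGen: 38388K->0K(268800K)] [PSOldGen: 13545K->51794K(1789952K)] 51933K->51794K(2058752K) [PSPermGen: 19868K->19868K(39936K)], 0.3066610 secs] [Times: user=0.28 sys=0.02, real=0.31 secs]
9.752: [GC [PSYoungGen: 230400K->26206K(268800K)] 282194K->78000K(2058752K), 0.0728380 secs] [Times: user=0.15 sys=0.00, real=0.08 secs]
11.906: [GC [PSYoungGen: 256606K->38393K(268800K)] 308400K->94759K(2058752K), 0.1058920 secs] [Times: user=0.19 sys=0.00, real=0.10 secs]
13.480: [GC [PSYoungGen: 268793K->38394K(268800K)] 325159K->109054K(2058752K), 0.0762360 secs] [Times: user=0.20 sys=0.03, real=0.08 secs]
18.115: [GC [PSYoungGen: 268794K->38384K(268800K)] 339454K->179238K(2058752K), 0.1351350 secs] [Times: user=0.42 sys=0.10, real=0.14 secs]
20.860: [GC [PSYoungGen: 268784K->38394K(268800K)] 409638K->200343K(2058752K), 0.1063430 secs] [Times: user=0.29 sys=0.03, real=0.11 secs]
22.148: [GC [PSYoungGen: 268794K->38399K(268800K)] 430743K->221395K(2058752K), 0.1173980 secs] [Times: user=0.24 sys=0.02, real=0.12 secs]
23.357: [GC [PSYoungGen: 268799K->26775K(268800K)] 451795K->231618K(2058752K), 0.0714130 secs] [Times: user=0.15 sys=0.03, real=0.08 secs]
24.449: [GC [PSYoungGen: 257175K->29170K(268800K)] 462018K->239909K(2058752K), 0.0312400 secs] [Times: user=0.06 sys=0.01, real=0.04 secs]
"""

# The pause is the first ", <seconds> secs]" on each line; the later
# "real=... secs]" part does not match because a digit must follow ", ".
PAUSE = re.compile(r", (\d+\.\d+) secs\]")


def pause_ms(line):
    """Return the pause of one log line in whole milliseconds, or None."""
    match = PAUSE.search(line)
    if match is None:
        return None
    return int(float(match.group(1)) * 1000)


def summarize(log_text):
    """Return (number of GC events, max pause in ms, total pause in ms)."""
    pauses = []
    for line in log_text.splitlines():
        value = pause_ms(line)
        if value is not None:
            pauses.append(value)
    return len(pauses), max(pauses), sum(pauses)


if __name__ == "__main__":
    events, max_pause, total_pause = summarize(LOG)
    print("# GC Events: %d" % events)
    print("Max Pause: %d ms" % max_pause)
    print("Total Pause: %d ms" % total_pause)
```

Run on the twelve sample lines, this prints 12 events, a max pause of 306 ms (the Full GC) and a total pause of 1233 ms, matching the summary.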

If you see that the garbage collection Max Pause time is very high, like more than 5-7 seconds for a 2 GB heap, then you need to worry about it. ;)
NOTE: Garbagecat is a good utility for generating the garbage collection report for Sun JDK and OpenJDK. For other JDKs you should use other tools to get accurate results.
.
.
Thanks
Jay SenSharma

Copyright © 2010-2012 Middleware Magic. All rights reserved.