Monitoring

WebLogic SNMP Monitoring for Server Health State

Hi,

Jay SenSharma

This post was written because one of our subscribers, Shashi.rsb, was facing an issue creating an SNMP trap for the server health state. Here is a very simple demonstration and configuration for receiving SNMP traps for the health state of a Managed Server running as part of a WebLogic domain. In this demo we will see how to configure the SNMP Agent and the SNMP trap so that they report the health state of a WebLogic Server. Usually if the Server’s Health state is HEALTH_OK,

At any point in time a WebLogic Server can be in any one of the following health states:

public static final int HEALTH_OK;
public static final int HEALTH_WARN;
public static final int HEALTH_CRITICAL;
public static final int HEALTH_FAILED;
public static final int HEALTH_OVERLOADED;

So, based on your requirement, you can choose any health state for SNMP monitoring. I am using HEALTH_OK for this demo.

Also note what looks like a bug, described in this comment: http://middlewaremagic.com/weblogic/?p=6149#comment-3264

Step1). Start the WebLogic Admin Server, log in to the Admin Console, and create an SNMP Agent like the following:

SNMP_HealthState_1

Step2). Now create a String Monitor by clicking on “Diagnostics -> SNMP -> ServerSNMPAgent-0 -> String Monitors”:

SNMP_HealthState_2

Step3). Now create a Trap Destination under “Diagnostics -> SNMP -> ServerSNMPAgent-0 -> Trap Destinations”:

SNMP_HealthState_3

Step4). After making all the above configurations, your “” file will look like the following:

  <snmp-agent-deployment>
    <name>ServerSNMPAgent-0</name>
    <enabled>true</enabled>
    <send-automatic-traps-enabled>true</send-automatic-traps-enabled>
    <snmp-port>1161</snmp-port>
    <snmp-trap-version>1</snmp-trap-version>
    <community-prefix>public</community-prefix>
    <snmp-trap-destination>
      <name>TrapDestination-0</name>
      <host>10.10.10.10</host>
      <port>1165</port>
      <community>public</community>
      <security-level>noAuthNoPriv</security-level>
    </snmp-trap-destination>
    <snmp-string-monitor>
      <name>SNMPStringMonitor-1</name>
      <enabled-server>AdminServer,MS-1</enabled-server>
      <monitored-m-bean-type>ServerRuntime</monitored-m-bean-type>
      <monitored-m-bean-name>MS-1</monitored-m-bean-name>
      <monitored-attribute-name>HealthState</monitored-attribute-name>
      <polling-interval>10</polling-interval>
      <string-to-compare>HEALTH_OK</string-to-compare>
      <notify-differ>false</notify-differ>
      <notify-match>false</notify-match>
    </snmp-string-monitor>
    <community-based-access-enabled>true</community-based-access-enabled>
    <snmp-engine-id>ServerSNMPAgent-0</snmp-engine-id>
    <authentication-protocol>noAuth</authentication-protocol>
    <privacy-protocol>noPriv</privacy-protocol>
    <inform-retry-interval>10000</inform-retry-interval>
    <max-inform-retry-count>1</max-inform-retry-count>
    <localized-key-cache-invalidation-interval>3600000</localized-key-cache-invalidation-interval>
    <snmp-access-for-user-m-beans-enabled>false</snmp-access-for-user-m-beans-enabled>
    <inform-enabled>false</inform-enabled>
    <master-agent-x-port>1705</master-agent-x-port>
    <target>AdminServer</target>
  </snmp-agent-deployment>
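
As a quick sanity check of a fragment like the one above, the following Python sketch lists the trap destinations and string monitors it contains. It is only an illustration: `summarize_agent` is a hypothetical helper, the embedded XML is a trimmed copy of the fragment, and it assumes the elements carry no XML namespace (a real domain configuration file usually declares one, so the lookups would need adjusting). The `notifies` flag reflects my understanding of the monitor settings, namely that a trap is generated on a match only when notify-match is true and on a difference only when notify-differ is true.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the fragment above.
# ASSUMPTION: no XML namespace on the elements; adjust the lookups if yours has one.
FRAGMENT = """
<snmp-agent-deployment>
  <name>ServerSNMPAgent-0</name>
  <snmp-port>1161</snmp-port>
  <snmp-trap-destination>
    <name>TrapDestination-0</name>
    <host>10.10.10.10</host>
    <port>1165</port>
  </snmp-trap-destination>
  <snmp-string-monitor>
    <name>SNMPStringMonitor-1</name>
    <monitored-m-bean-name>MS-1</monitored-m-bean-name>
    <monitored-attribute-name>HealthState</monitored-attribute-name>
    <string-to-compare>HEALTH_OK</string-to-compare>
    <notify-differ>false</notify-differ>
    <notify-match>false</notify-match>
  </snmp-string-monitor>
</snmp-agent-deployment>
"""


def summarize_agent(xml_text):
    """Return (trap destinations, string monitors) found in the fragment."""
    root = ET.fromstring(xml_text)
    destinations = [
        (d.findtext("host"), int(d.findtext("port")))
        for d in root.findall("snmp-trap-destination")
    ]
    monitors = []
    for m in root.findall("snmp-string-monitor"):
        monitors.append({
            "name": m.findtext("name"),
            "attribute": m.findtext("monitored-attribute-name"),
            "compare": m.findtext("string-to-compare"),
            # True when at least one of the two notify flags is switched on.
            "notifies": m.findtext("notify-differ") == "true"
                        or m.findtext("notify-match") == "true",
        })
    return destinations, monitors


destinations, monitors = summarize_agent(FRAGMENT)
print(destinations)
print(monitors)
```

If `notifies` comes back false for a monitor, it may be worth re-checking the Notify Match and Notify Differ settings of that monitor in the console.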

Step5). Restart your AdminServer.

Step6). Now start the “SnmpTrapMonitor” as follows. (Make sure to run “. ./setWLSEnv.sh” first, to set the environment in the same shell prompt where you plan to run the following command.)

java   weblogic.diagnostics.snmp.cmdline.Manager   SnmpTrapMonitor   -p   1165

Step7). As soon as you start your Managed Server, you will see traps like the following in the above shell/command prompt once the Managed Server (MS-1) reaches the HEALTH_OK state:

java weblogic.diagnostics.snmp.cmdline.Manager SnmpTrapMonitor -p 1165
Listening on port:1165
--- Snmp Trap Received ---
    Version        : v1
    Source         : UdpEntity:10.10.10.10:1161
    Community      : public
    Enterprise     : enterprises.140.625
    TrapOID        : enterprises.140.625.0.65
    RawTrapOID     : 1.3.6.1.4.1.140.625.0.65
    Trap Objects   : {
   { enterprises.140.625.100.5=Mon Mar 21 20:17:02 IST 2011 }
   { enterprises.140.625.100.10=MS-1 }
}
    Raw VarBinds   : {
   { enterprises.140.625.100.5=Mon Mar 21 20:17:02 IST 2011 }
   { enterprises.140.625.100.10=MS-1 }
}
--- Snmp Trap Received ---
    Version        : v1
    Source         : UdpEntity:10.10.10.10:1161
    Community      : public
    Enterprise     : enterprises.140.625
    TrapOID        : enterprises.140.625.0.75
    RawTrapOID     : 1.3.6.1.4.1.140.625.0.75
    Trap Objects   : {
   { enterprises.140.625.100.5=Mon Mar 21 20:17:11 IST 2011 }
   { enterprises.140.625.100.10=MS-1 }
   { enterprises.140.625.100.55=jmx.monitor.error.type }
   { enterprises.140.625.100.60=null }
   { enterprises.140.625.100.65=null }
   { enterprises.140.625.100.70=com.bea:Location=MS-1,Name=MS-1,Type=ServerRuntime }
   { enterprises.140.625.100.75=ServerRuntime }
   { enterprises.140.625.100.80=HealthState }
}
    Raw VarBinds   : {
   { enterprises.140.625.100.5=Mon Mar 21 20:17:11 IST 2011 }
   { enterprises.140.625.100.10=MS-1 }
   { enterprises.140.625.100.55=jmx.monitor.error.type }
   { enterprises.140.625.100.60=null }
   { enterprises.140.625.100.65=null }
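
If you want to post-process this console output (for example to feed another alerting tool), a small parser is enough. The sketch below is only an illustration written against the text layout shown above; `parse_traps` is a hypothetical helper, not part of WebLogic, and the layout of the SnmpTrapMonitor output may differ between releases.

```python
import re

# One trap copied from the output above, used as sample input.
SAMPLE = """\
Listening on port:1165
--- Snmp Trap Received ---
    Version        : v1
    Source         : UdpEntity:10.10.10.10:1161
    Community      : public
    Enterprise     : enterprises.140.625
    TrapOID        : enterprises.140.625.0.65
    RawTrapOID     : 1.3.6.1.4.1.140.625.0.65
    Trap Objects   : {
   { enterprises.140.625.100.5=Mon Mar 21 20:17:02 IST 2011 }
   { enterprises.140.625.100.10=MS-1 }
}
"""

# "    TrapOID        : enterprises..."  ->  ("TrapOID", "enterprises...")
HEADER = re.compile(r"^\s*(\w[\w ]*?)\s*:\s(.+)$")
# "   { oid=value }"  ->  ("oid", "value")
VARBIND = re.compile(r"^\s*\{\s*(\S+?)=(.*?)\s*\}\s*$")


def parse_traps(text):
    """Split SnmpTrapMonitor console output into one dict per trap."""
    traps = []
    current = None
    for line in text.splitlines():
        if line.startswith("--- Snmp Trap Received ---"):
            current = {"fields": {}, "varbinds": {}}
            traps.append(current)
            continue
        if current is None:
            continue  # text before the first trap, e.g. "Listening on port"
        varbind = VARBIND.match(line)
        if varbind:
            current["varbinds"][varbind.group(1)] = varbind.group(2)
            continue
        header = HEADER.match(line)
        # Skip the "Trap Objects : {" style lines that only open a block.
        if header and not header.group(2).strip().startswith("{"):
            current["fields"][header.group(1)] = header.group(2).strip()
    return traps


traps = parse_traps(SAMPLE)
print(traps[0]["fields"]["TrapOID"], traps[0]["varbinds"])
```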

Thanks
Jay SenSharma


Sending Email Alert for WebLogic Servers Current State

Hi,

Jay SenSharma

In every production environment we want to keep monitoring the state of our WebLogic Servers (RUNNING, ADMIN, etc.). Getting a status report is usually a problem, because most of the non-operating-system tools we know for this task require the Admin Server to be up and running before they can report the current state of the servers.

While running this WLST script you will see “Caused by: java.rmi.ConnectException: Destination unreachable;” whenever a WebLogic Server is not running. This is normal and expected.

So here we provide a simple WLST script which has the following features:

Features:

  1. Ready To Use: The WLST script does not require any changes, except for the administrator's e-mail address at line number 24.
  2. Standalone: The script checks the server state at the interval set by “check.interval” in the properties file, so no cron job is needed.
  3. AdminServer Independent: The AdminServer need not be running to get the server state using this script.
  4. Reliable: Compared to cron jobs that usually grep the server log for the server state, this WLST script is more reliable.
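
The “AdminServer Independent” point works because every server is contacted directly on its own URL. As a loosely related illustration (not part of the WLST script), the Python sketch below checks whether anything is listening on the host and port of a t3 URL before a full connect is attempted. `parse_t3_url` and `is_listening` are hypothetical helper names, and an open port says nothing about the server state (RUNNING, ADMIN, ...); it only tells you the connect is unlikely to end in “Destination unreachable”.

```python
import socket
from urllib.parse import urlparse


def parse_t3_url(url):
    """Split a URL such as t3://10.10.10.10:7001 into (host, port)."""
    parsed = urlparse(url)
    if parsed.hostname is None or parsed.port is None:
        raise ValueError("expected scheme://host:port, got %r" % url)
    return parsed.hostname, parsed.port


def is_listening(url, timeout=2.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    host, port = parse_t3_url(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out or unreachable: treat all of these as "not listening".
        return False


print(parse_t3_url("t3://10.10.10.10:7001"))
```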

Steps to Configure an Email Alert for Checking State Of Server

Step1) Create a directory somewhere in your file system, for example “C:\WLST”.

Step2) Write a properties file “domain.properties” inside “C:\WLST” like the following:

admin.username=weblogic
admin.password=weblogic

total.number.of.servers=3

server.name.1=AdminServer
server.url.1=t3://10.10.10.10:7001

server.name.2=MS-1
server.url.2=t3://10.10.10.10:7003

server.name.3=MS-2
server.url.3=t3://10.10.10.10:7005

# Interval, in seconds, between two checks of the server state
check.interval=10
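
To make the numbering scheme of this file explicit, here is a small Python sketch that reads the same keys and returns the list of servers to check. It is only an illustration: `load_properties` and `servers_from` are hypothetical helpers, and the parser handles plain key=value lines only, not the full java.util.Properties syntax that the WLST script relies on.

```python
SAMPLE = """\
admin.username=weblogic
total.number.of.servers=2

server.name.1=AdminServer
server.url.1=t3://10.10.10.10:7001

server.name.2=MS-1
server.url.2=t3://10.10.10.10:7003

# comment lines and blank lines are ignored
check.interval=10
"""


def load_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def servers_from(props):
    """Return [(name, url), ...] for server.name.N / server.url.N, N = 1..total."""
    total = int(props["total.number.of.servers"])
    return [
        (props["server.name.%d" % i], props["server.url.%d" % i])
        for i in range(1, total + 1)
    ]


props = load_properties(SAMPLE)
print(servers_from(props))
```

Note that “total.number.of.servers” must match the highest server number in the file; a missing “server.name.N” key raises a KeyError in this sketch.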

Step3) Create a WLST script named something like “serverStateChecker.py” inside “C:\WLST”, with contents like the following:

#############################################################################
#
# @author Copyright (c) 2010 - 2011 by Middleware Magic, All Rights Reserved.
#
#############################################################################
from java.util import Date, Properties
from java.io import FileInputStream
from java.lang import Thread
import os
import string

propInputStream = FileInputStream("domain.properties")
configProps = Properties()
configProps.load(propInputStream)

adminUser = configProps.get("admin.username")
adminPassword = configProps.get("admin.password")
checkInterval = configProps.get("check.interval")
totalServersToMonitor = configProps.get("total.number.of.servers")
checkingIntervalSeconds = int(checkInterval)

#############  This method would send the Alert Email  #################
def sendMailString():
	os.system('/bin/mailx -s  "ALERT: Check Server May Not Be RUNNING !!! Please check..." admin@company.com < serverState_file')
	print '*********  ALERT MAIL HAS BEEN SENT FOR SERVER STATE ***********'
	print ''

#############  Infinite Loop to check the Status of Server in Mentioned Interval  #################
while true:
	print 'Checking All Servers State Details'
	totalServers = int(totalServersToMonitor)
	i=1
	while i <= totalServers:
		disconnect()
		serverState=""
		serverName = configProps.get("server.name." + str(i))
		serverURL = configProps.get("server.url." + str(i))
		try:
			connect(adminUser,adminPassword,serverURL)
			serverRuntime()
			serverState=cmo.getState()
			print '-----------------', serverName , ' is in State: ', serverState
			if serverState != "RUNNING":
				today = Date()
				stateMessage = 'The ' + serverName + ' is In State ' + serverState + '  At Time: ' + today.toString()
				cmd = "echo " + stateMessage +" >> serverState_file"
				os.system(cmd)
		except:
			serverName=configProps.get("server.name." + str(i))
			print 'Sorry !!! Unable to Connect to Server ' , serverName
			today = Date()
			stateMessage = 'The ' + serverName + ' May Be DOWN.' + ' At Time: ' + today.toString()
			cmd = "echo " + stateMessage +" >> serverState_file"
			os.system(cmd)
		i =  i + 1

	# Send the alert only when this pass recorded at least one problem
	if os.path.exists("serverState_file"):
		sendMailString()
		os.remove("serverState_file")

	print 'Sleeping for ', int(checkingIntervalSeconds) , ' Seconds...'
	print ''
	interval=int(checkingIntervalSeconds)
	Thread.sleep(interval*1000)

#######################################################################
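
The alert decision in the loop can be summarised as: collect one line for every server that is not RUNNING (or cannot be reached), and mail only if that list is non-empty. The Python sketch below restates that logic outside WLST so it can be tested on its own. `build_alert_lines` is a hypothetical helper and the message wording is only modelled on the script above.

```python
from datetime import datetime


def build_alert_lines(states, now=None):
    """Build one alert line per problem server.

    states maps server name -> state string, or None when the connect failed.
    """
    stamp = (now or datetime.now()).strftime("%Y-%m-%d %H:%M:%S")
    lines = []
    for name, state in states.items():
        if state is None:
            lines.append("The %s May Be DOWN. At Time: %s" % (name, stamp))
        elif state != "RUNNING":
            lines.append("The %s is In State %s At Time: %s" % (name, state, stamp))
    return lines


lines = build_alert_lines(
    {"AdminServer": None, "MS-1": "RUNNING", "MS-2": "ADMIN"},
    now=datetime(2011, 2, 24, 19, 3, 25),
)
for line in lines:
    print(line)
```

An empty list means every server was RUNNING, in which case no mail needs to be sent for that pass.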

Step4) Open a command prompt and run “setWLSEnv.cmd” or “setWLSEnv.sh” to set the CLASSPATH and PATH variables. It is better to run echo %CLASSPATH% or echo $CLASSPATH afterwards to confirm that the CLASSPATH is set properly. If you see an empty CLASSPATH even after running “setWLSEnv.sh”, please refer to the note at Step3) in the following post: http://middlewaremagic.com/weblogic/?page_id=1492

Step5) Now run the WLST script in the same command prompt using the following command:

java   weblogic.WLST  serverStateChecker.py

You will see output like the following in the command prompt:

$ java weblogic.WLST serverStateChecker.py

Initializing WebLogic Scripting Tool (WLST) ...

Welcome to WebLogic Server Administration Scripting Shell

Type help() for help on available commands

Checking All Servers State Details

You will need to be connected to a running server to execute this command

Connecting to t3://10.65.193.88:7001 with userid weblogic ...
This Exception occurred at Thu Feb 24 19:03:25 IST 2011.
javax.naming.CommunicationException [Root exception is java.net.ConnectException: t3://10.65.193.88:7001: Destination unreachable; nested exception is:
	java.net.ConnectException: Connection refused; No available router to destination]
	at weblogic.jndi.internal.ExceptionTranslator.toNamingException(ExceptionTranslator.java:40)
	at weblogic.jndi.WLInitialContextFactoryDelegate.toNamingException(WLInitialContextFactoryDelegate.java:783)
	at weblogic.jndi.WLInitialContextFactoryDelegate.getInitialContext(WLInitialContextFactoryDelegate.java:365)
	at weblogic.jndi.Environment.getContext(Environment.java:315)
	at weblogic.jndi.Environment.getContext(Environment.java:285)
.
.
.
.
Sorry !!! Unable to Connect to Server  AdminServer

You will need to be connected to a running server to execute this command

Connecting to t3://10.65.193.88:7003 with userid weblogic ...
Successfully connected to managed Server 'MS-1' that belongs to domain 'Test_Domain'.

Warning: An insecure protocol was used to connect to the
server. To ensure on-the-wire security, the SSL port or
Admin port should be used instead.

Location changed to serverRuntime tree. This is a read-only tree with ServerRuntimeMBean as the root.
For more help, use help(serverRuntime)

----------------- MS-1  is in State:  RUNNING
Disconnected from weblogic server: MS-1
Connecting to t3://10.10.10.10:7005 with userid weblogic ...
Successfully connected to managed Server 'MS-2' that belongs to domain 'Test_Domain'.

Warning: An insecure protocol was used to connect to the
server. To ensure on-the-wire security, the SSL port or
Admin port should be used instead.

----------------- MS-2  is in State:  ADMIN
*********  ALERT MAIL HAS BEEN SENT FOR SERVER STATE ***********

Sleeping for  10  Seconds...

NOTE: This script uses the mailx utility, which is not available on a Windows box. Please check that mailx is configured properly; otherwise the script will run without errors but the mail will not be sent.
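
Where mailx is not available, one alternative is to build and send the message from a small standalone Python 3 script instead of from WLST. The sketch below is only an illustration under that assumption: the standard-library `email.message` API it uses is not available in the Jython that ships with WLST, `build_alert_mail` is a hypothetical helper, and the sender address and SMTP host are placeholders.

```python
from email.message import EmailMessage


def build_alert_mail(lines, sender, recipient):
    """Build (but do not send) an alert mail from a list of alert lines."""
    msg = EmailMessage()
    msg["Subject"] = "ALERT: Check Server May Not Be RUNNING !!! Please check..."
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("\n".join(lines))
    return msg


msg = build_alert_mail(
    ["The MS-2 is In State ADMIN"],
    sender="wlst@example.com",       # placeholder sender address
    recipient="admin@company.com",
)
print(msg["Subject"])

# Sending is left commented out; "smtp.example.com" is a placeholder host.
# import smtplib
# with smtplib.SMTP("smtp.example.com") as smtp:
#     smtp.send_message(msg)
```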

Email Alert Server State


Regards,
Jay SenSharma


Sending Email Alert for WebLogic DataSource Monitoring State & Connections Usages

Hi,

Jay SenSharma

In almost every production environment, monitoring resources is one of the most important tasks. We have started a series of posts on getting alerts whenever something goes wrong or the WebLogic Application Server starts behaving abnormally. In this demo we will see how to get an e-mail alert whenever a DataSource is not “Running” or its “ActiveConnectionsCurrentCount” crosses the limit that we specify in the properties file.

This script monitors one particular WebLogic Server instance and the DataSources targeted to that server. It can be enhanced to monitor more than one WebLogic Server instance by putting the complete script inside a for loop. To keep the concept simple, we do it for a single server only.

Note: The script below can be run as a cron job at a regular interval, say every 5 or 10 minutes, depending on the requirement. This WLST script is supported from WLS 9.x onwards.

Steps to Create an Email Alert for WebLogic DataSource Monitoring State & Connections Usages

Step1) Create a directory somewhere in your file system, for example “C:\WLST”.

Step2) Write a properties file “details.properties” inside “C:\WLST” like the following:

#### datasource.targetServer.url is the URL of the WebLogic Server to which the DataSource is targeted.
datasource.targetServer.url=t3://10.10.10.10:9001
admin.username=weblogic
admin.password=weblogic

#### connectionPool.alert.limit is the number of in-use connections at or above which you want to get an alert
connectionPool.alert.limit=12

Step3) Create a WLST script named something like “monitorDS.py” inside “C:\WLST”, with contents like the following:

#############################################################################
#
# @author Copyright (c) 2010 - 2011 by Middleware Magic, All Rights Reserved.
#
#############################################################################
from java.io import FileInputStream
from java.util import Properties
import os
import string

propInputStream = FileInputStream("details.properties")
configProps = Properties()
configProps.load(propInputStream)

targetServerURL = configProps.get("datasource.targetServer.url")
adminUser = configProps.get("admin.username")
adminPassword = configProps.get("admin.password")
connectionPoolAlertLimit = configProps.get("connectionPool.alert.limit")

#############  This method would send the Alert Email  #################
def sendMailString():
	os.system('/bin/mailx -s  "ALERT: Connection Pool Health is in WARNING !!! " abcd@company.com < commectionLimit_file')
	print ''

def checkConnectionUsage(activeConnectionsCurrentCount,dataSourceName,dataSourceState):
	dsStateSendMail="yes"
	dsLimitSendMail="yes"
	state=" DataSourceName: " + dataSourceName
	check = string.find(dataSourceState,"Running")
	print 'Zero 0 Means Running :  Check State ' , str(check)
	if check != 0:
		state = state + " Checking Connection Pool HealthState: " + dataSourceState
		print '!!!! ALERT !!!! Connection Pool Health is in Not OK Sending E-Mail Alert.'
	else:
		dsStateSendMail="no"

	if int(activeConnectionsCurrentCount) >= int (connectionPoolAlertLimit):
		state = state + " Connection Pool Is Crossing the Alert Limit ActiveConnectionsCurrentCount = " + str(activeConnectionsCurrentCount)
		print '!!!! ALERT !!!! Connection Pool Connection Pool Is Crossing The Alert Limit.'
		print ''
	else:
		dsLimitSendMail="no"

	checkData_A = string.find(dsStateSendMail,"yes")
	checkData_B = string.find(dsLimitSendMail,"yes")
	if checkData_A == 0 or checkData_B == 0:
		cmd = "echo " + state +" > commectionLimit_file"
		os.system(cmd)
		sendMailString()
	else:
		print 'Every Thing in Pool Is OK For DataSource ', dataSourceName

#############  Main Execution @ Middleware Magic 2010 #################
connect(adminUser, adminPassword, targetServerURL)
serverRuntime()
dsMBeans = cmo.getJDBCServiceRuntime().getJDBCDataSourceRuntimeMBeans()
for ds in dsMBeans:
	print 'DS name is: '+ds.getName()
	print 'State is ' +ds.getState()
	dataSourceName=ds.getName()
	dataSourceState=ds.getState()
	# ActiveConnectionsCurrentCount: Means The number of connections currently in use by applications.
	activeConnectionsCurrentCount=ds.getActiveConnectionsCurrentCount()
	currCapacity=ds.getCurrCapacity()
	print 'DS ActiveConnectionsCurrentCount: ', activeConnectionsCurrentCount
	checkConnectionUsage(activeConnectionsCurrentCount,dataSourceName,dataSourceState)
	print '------------------------------------'
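
The two checks made by checkConnectionUsage can be restated as a small pure function, which is easier to test than the WLST script itself. The Python sketch below is only an illustration of that decision logic: `evaluate_datasource` is a hypothetical helper, and the sample values are taken from the kind of output this script prints.

```python
def evaluate_datasource(name, state, active_connections, alert_limit):
    """Return the list of reasons to alert for one DataSource (empty if all is OK)."""
    reasons = []
    # The script treats any state that does not start with "Running" as a problem.
    if not state.startswith("Running"):
        reasons.append("%s: state is %s" % (name, state))
    # The limit is inclusive: reaching it is already enough to alert.
    if int(active_connections) >= int(alert_limit):
        reasons.append(
            "%s: ActiveConnectionsCurrentCount = %d (limit %d)"
            % (name, int(active_connections), int(alert_limit))
        )
    return reasons


print(evaluate_datasource("SQLAuthDS", "Suspended", 0, 12))
print(evaluate_datasource("PointBaseDataSource", "Running", 12, 12))
print(evaluate_datasource("PointBaseDataSource", "Running", 3, 12))
```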

Step4) Open a command prompt and run “setWLSEnv.cmd” or “setWLSEnv.sh” to set the CLASSPATH and PATH variables. It is better to run echo %CLASSPATH% or echo $CLASSPATH afterwards to confirm that the CLASSPATH is set properly. If you see an empty CLASSPATH even after running “setWLSEnv.sh”, please refer to the note at Step3) in the following post:

http://middlewaremagic.com/weblogic/?page_id=1492

Step5) Now run the WLST script in the same command prompt as follows:

java weblogic.WLST monitorDS.py

Initializing WebLogic Scripting Tool (WLST) ...

Welcome to WebLogic Server Administration Scripting Shell

Type help() for help on available commands

Connecting to t3://10.10.10.10:9001 with userid weblogic ...
Successfully connected to Admin Server 'AdminServer' that belongs to domain 'base_domain'.

Warning: An insecure protocol was used to connect to the
server. To ensure on-the-wire security, the SSL port or
Admin port should be used instead.

Location changed to serverRuntime tree. This is a read-only tree with ServerRuntimeMBean as the root.
For more help, use help(serverRuntime)

DS name is: SQLAuthDS
State is Suspended
DS ActiveConnectionsCurrentCount:  0
Zero 0 Means Running :  Check State  -1
!!!! ALERT !!!! Connection Pool Health is in Not OK Sending E-Mail Alert.

-----
DS name is: PointBaseDataSource
State is Running
DS ActiveConnectionsCurrentCount:  12
Zero 0 Means Running :  Check State  0
!!!! ALERT !!!! Connection Pool Connection Pool Is Crossing The Alert Limit.

-----

NOTE: This script uses the mailx utility, which is not available on a Windows box. Please check that mailx is configured properly; otherwise the script will run without errors but the mail will not be sent.

ConnectionPool LimitExceeded

DataSource State

Thanks

Jay SenSharma


Copyright © 2010-2012 Middleware Magic. All rights reserved.