Alert

Sending Email Alert For Stuck Threads With Thread Dumps

Ravish Mody

The most basic need of almost every 24×7 production environment is to keep monitoring server health and thread activity. A very common requirement is to collect a thread dump as soon as a [STUCK] thread occurs on any WebLogic Server. Most of the time we end up doing a post-mortem analysis instead: when the stuck thread issue occurred, or the WebLogic Server hung, nobody could collect the thread dumps needed for the investigation. Support teams always have this excuse.

While working in Middleware Support we have often faced the same problem: customers could not collect thread dumps while the issue was occurring, which delayed both the resolution and the search for the root cause.

To avoid this kind of problem we developed an automated WLST script with the following features:

Features Of This Script:

  1. Ready To Use: The script is ready to use; you need not edit anything in the WLST script except the email address in the mailx command of the sendMailThreadDump() function.
  2. Flexibility: You only need to change the values in the “domains.properties” file, for example how many thread dumps to collect when the issue occurs.
  3. E-Mail Alert: The administrator learns about the issue immediately through an e-mail alert.
  4. Thread Dumps In Mail: The complete thread dumps are sent to the administrator in the e-mail, so there is no need to collect them manually.
  5. Independent Script: This WLST script can run on its own, without any cron job utility provided by the operating system (it can also be scheduled through cron), which gives administrators more flexibility.

Steps to Create an Email Alert For Stuck Threads With Thread Dumps

Step-1) Create a directory somewhere in your file system, for example “C:\WLST”.

Step-2) Write a properties file “domains.properties” inside “C:\WLST” like the following:

server.url=t3://localhost:8004
admin.username=weblogic
admin.password=weblogic
monitoring.server.name=MS-3

# ExecuteThread_Vs_HoggerThreadRatio is the threshold for ExecuteThreadTotalCount / HoggingThreadCount; an alert is raised when the ratio is at or below this value
ExecuteThread_Vs_HoggerThreadRatio=2

# Number of times the RATIO has to be checked
checkTimes_Number=3

# TIME INTERVAL between number of times the RATIO has to be checked (60000 milliseconds = 60 seconds)
checkInterval_in_Milliseconds=60000

# Number of times the Thread Dump has to be taken
threadDumpTimes_Number=5

# TIME INTERVAL between two thread dumps (10000 milliseconds = 10 seconds)
threadDumpInterval_in_Milliseconds=10000

# Number of times to send thread dumps by mail in case of a stuck/hogging thread issue
sendEmail_ThreadDump_Counter=2
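
Before starting a monitoring run it can help to see how long the configured values keep the script busy. The following is only a minimal sketch in plain Python and not part of the original script: the helper names parse_properties and window_seconds are made up for illustration, and it assumes simple key=value lines with # comments as in the file above.

```python
def parse_properties(text):
    """Parse simple key=value lines; blank lines and # comments are skipped."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def window_seconds(props, count_key, interval_key):
    """Number of repetitions multiplied by the interval, converted to seconds."""
    return int(props[count_key]) * int(props[interval_key]) / 1000.0


sample = """
# values taken from domains.properties above
checkTimes_Number=3
checkInterval_in_Milliseconds=60000
threadDumpTimes_Number=5
threadDumpInterval_in_Milliseconds=10000
"""
props = parse_properties(sample)

# Sleep time spent in the check loop: 3 checks x 60 s
print(window_seconds(props, "checkTimes_Number", "checkInterval_in_Milliseconds"))

# Sleep time spent on one round of thread dumps: 5 dumps x 10 s
print(window_seconds(props, "threadDumpTimes_Number", "threadDumpInterval_in_Milliseconds"))
```

With the values shown, the check loop sleeps for about 180 seconds in total, and each round of thread dumps adds roughly 50 seconds on top of that.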

Step-3) Create a WLST script inside “C:\WLST” with a name like “Alert_StuckThread_ThreadDumps.py”. Its contents will be something like the following:

#############################################################################
#
# @author Copyright (c) 2010 - 2011 by Middleware Magic, All Rights Reserved.
#
#############################################################################

from java.io import FileInputStream
from java.util import Properties
from java.lang import Thread
import java.lang
import os

propInputStream = FileInputStream("domains.properties")
configProps = Properties()
configProps.load(propInputStream)

adminUrl = configProps.get("server.url")
adminUser = configProps.get("admin.username")
adminPassword = configProps.get("admin.password")
monitoringServerName = configProps.get("monitoring.server.name")

executeThread_Vs_HoggerThreadRatio = configProps.get("ExecuteThread_Vs_HoggerThreadRatio")
checkTimes_Number = configProps.get("checkTimes_Number")
checkInterval_in_Milliseconds = configProps.get("checkInterval_in_Milliseconds")
threadDumpTimes_Number = configProps.get("threadDumpTimes_Number")
threadDumpInterval_in_Milliseconds = configProps.get("threadDumpInterval_in_Milliseconds")
sendEmail_ThreadDump_Counter = configProps.get("sendEmail_ThreadDump_Counter")

i = 0
y = int(checkTimes_Number)

#############  This method would send the Alert Email with Thread Dump  #################
def sendMailThreadDump():
	os.system('/bin/mailx -s  "ALERT: CHECK Thread Dumps as Hogger Thread Count Exceeded the Limit !!! " abcd@company.com < All_ThreadDump.txt')
	print '*********  ALERT MAIL HAS BEEN SENT  ***********'
	print ''

#############  This method is checking the Hogger Threads Ratio  #################
def alertHoggerThreads(executeTTC , hoggerTC):
	print 'Execute Threads : ', executeTTC
	print 'Hogger Thread Count : ', hoggerTC
	print 'executeThread_Vs_HoggerThreadRatio :', executeThread_Vs_HoggerThreadRatio
	if hoggerTC != 0:
		ratio=(executeTTC/hoggerTC)
		print 'Ratio : ' , ratio
		print ''
		if (int(ratio) <= int(executeThread_Vs_HoggerThreadRatio)):
			print ' !!!! ALERT !!!! Stuck Threads are on its way.....'
			print ''
			message =  'ExecuteThreads Count= ' + str(executeTTC) + '   HoggingThreads= '+ str(hoggerTC) +'   ExecuteThreads/HoggingThreads Ratio= '+ str(ratio)
			cmd = "echo " + message +" > rw_file"
			os.system(cmd)
			genrateThreadDump()
		else:
			print '++++++++++++++++++++++++++++++++++++'
			print 'Everything is working fine till now'
			print '++++++++++++++++++++++++++++++++++++'
	else:
		print '++++++++++++++++++++++++++++++++++++'
		print 'Everything is working fine till now'
		print '++++++++++++++++++++++++++++++++++++'

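#############  This method is Taking the Thread Dumps #################
# Number of alert mails sent so far in this run. sendEmail_ThreadDump_Counter
# is treated here as the upper limit for this counter.
mailsSent = 0

def genrateThreadDump():
	global mailsSent
	if mailsSent >= int(sendEmail_ThreadDump_Counter):
		print 'Mail limit reached, not collecting further Thread Dumps'
		return
	p = 0
	q = int(threadDumpTimes_Number)
	serverConfig()
	# Absolute path, so that the cd also works on the second and later calls
	cd ('/Servers/'+ monitoringServerName)
	while (p < q):
		print 'Taking Thread Dump : ', p
		threadDump()
		# threadDump() writes to Thread_Dump_<serverName>.txt by default
		cmd = "cat Thread_Dump_" + monitoringServerName + ".txt >> All_ThreadDump.txt"
		os.system(cmd)
		print 'Thread Dump Collected : ', p ,' now Sleeping for ', int(threadDumpInterval_in_Milliseconds) , ' Milliseconds ...'
		print ''
		Thread.sleep(int(threadDumpInterval_in_Milliseconds))
		p = p + 1
	sendMailThreadDump()
	mailsSent = mailsSent + 1
	cmd = "rm -f All_ThreadDump.txt"
	os.system(cmd)
	serverRuntime()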

connect(adminUser,adminPassword,adminUrl)
serverRuntime()
cd('ThreadPoolRuntime/ThreadPoolRuntime')

while (i < y):
	executeTTC=cmo.getExecuteThreadTotalCount();
	hoggerTC=cmo.getHoggingThreadCount();
	alertHoggerThreads(executeTTC , hoggerTC)
	print 'Sleeping for ', int(checkInterval_in_Milliseconds) , ' ...'
	print ''
	Thread.sleep(int(checkInterval_in_Milliseconds))
	i = i + 1
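
The alert condition used by alertHoggerThreads() can be tried out without a running server. The sketch below is plain Python rather than WLST, and the function name should_alert is made up for illustration. Under WLST the / operator on two integers already truncates; the sketch uses // to get the same result on a current Python interpreter.

```python
def should_alert(execute_total, hogging, threshold):
    """Return True when execute_total // hogging is at or below threshold."""
    if hogging == 0:
        return False  # no hogging threads, nothing to report
    return (execute_total // hogging) <= threshold


print(should_alert(5, 2, 2))   # 5 // 2 == 2, which is <= 2
print(should_alert(6, 2, 2))   # 6 // 2 == 3, which is above the threshold
print(should_alert(6, 0, 2))   # no hogging threads at all
```

A lower ratio means that a larger share of the execute threads is hogging, so a smaller value of ExecuteThread_Vs_HoggerThreadRatio makes the alert less sensitive.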

Step-4) Open a command prompt and run “setWLSEnv.cmd” or “setWLSEnv.sh” to set the CLASSPATH and PATH variables. It is a good idea to run echo %CLASSPATH% or echo $CLASSPATH afterwards to verify that the CLASSPATH is set properly. If the CLASSPATH is still empty after running “setWLSEnv.sh”, please refer to the note at Step3) in the following post: http://middlewaremagic.com/weblogic/?page_id=1492

Step-5) Now run the WLST Script in the same command prompt using the following command:

java weblogic.WLST Alert_StuckThread_ThreadDumps.py

You will see results like the following in the command prompt:

$ java weblogic.WLST Alert_StuckThread_ThreadDumps.py

Initializing WebLogic Scripting Tool (WLST) ...

Welcome to WebLogic Server Administration Scripting Shell

Type help() for help on available commands

Connecting to t3://localhost:8004 with userid weblogic ...
Successfully connected to managed Server 'MS-3' that belongs to domain 'Domain_8001'.

Warning: An insecure protocol was used to connect to the
server. To ensure on-the-wire security, the SSL port or
Admin port should be used instead.

Location changed to serverRuntime tree. This is a read-only tree with ServerRuntimeMBean as the root.
For more help, use help(serverRuntime)

Execute Threads :  5
Hogger Thread Count :  2
executeThread_Vs_HoggerThreadRatio : 2
Ratio :  2

 !!!! ALERT !!!! Stuck Threads are on its way.....

Taking Thread Dump :  0
Thread dump for the running server: MS-3
"[STANDBY] ExecuteThread: '4' for queue: 'weblogic.kernel.Default (self-tuning)'" waiting for lock weblogic.work.ExecuteThread@1e2ba602 WAITING
	java.lang.Object.wait(Native Method)
	java.lang.Object.wait(Object.java:485)
	weblogic.work.ExecuteThread.waitForRequest(ExecuteThread.java:157)
	weblogic.work.ExecuteThread.run(ExecuteThread.java:178)

"[STANDBY] ExecuteThread: '3' for queue: 'weblogic.kernel.Default (self-tuning)'" waiting for lock weblogic.work.ExecuteThread@439fdcc7 WAITING
	java.lang.Object.wait(Native Method)
	java.lang.Object.wait(Object.java:485)
	weblogic.work.ExecuteThread.waitForRequest(ExecuteThread.java:157)
	weblogic.work.ExecuteThread.run(ExecuteThread.java:178)

"DynamicListenThread[Default[2]]" RUNNABLE native
	java.net.PlainSocketImpl.socketAccept(Native Method)
	java.net.PlainSocketImpl.accept(PlainSocketImpl.java:390)
	java.net.ServerSocket.implAccept(ServerSocket.java:453)
	java.net.ServerSocket.accept(ServerSocket.java:421)
	weblogic.socket.WeblogicServerSocket.accept(WeblogicServerSocket.java:38)
	weblogic.server.channels.DynamicListenThread$SocketAccepter.accept(DynamicListenThread.java:523)
	weblogic.server.channels.DynamicListenThread$SocketAccepter.access$200(DynamicListenThread.java:415)
	weblogic.server.channels.DynamicListenThread.run(DynamicListenThread.java:166)
	java.lang.Thread.run(Thread.java:619)

"[STUCK] ExecuteThread: '1' for queue: 'weblogic.kernel.Default (self-tuning)'" RUNNABLE
	sun.misc.FloatingDecimal.doubleValue(FloatingDecimal.java:1531)
	java.lang.Double.parseDouble(Double.java:510)
	jsp_servlet.__index._jspService(__index.java:71)
	weblogic.servlet.jsp.JspBase.service(JspBase.java:34)
	weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:227)
	weblogic.servlet.internal.StubSecurityHelper.invokeServlet(StubSecurityHelper.java:125)
	weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:292)
	weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:175)
	weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:3498)
	weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321)
	weblogic.security.service.SecurityManager.runAs(Unknown Source)
	weblogic.servlet.internal.WebAppServletContext.securedExecute(WebAppServletContext.java:2180)
	weblogic.servlet.internal.WebAppServletContext.execute(WebAppServletContext.java:2086)
	weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1406)
	weblogic.work.ExecuteThread.execute(ExecuteThread.java:201)
	weblogic.work.ExecuteThread.run(ExecuteThread.java:173)

NOTE: This script uses the mailx utility, which is not available on a Windows box. Please check that mailx is configured properly; otherwise the script will run without errors but the mail will not be sent.
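
One way to notice a missing mailx early is to look for the executable before the monitoring loop starts. This is only a sketch in plain Python: the helper name find_executable is hypothetical, and it searches the PATH, whereas the script above calls /bin/mailx directly.

```python
import os


def find_executable(name, search_path=None):
    """Return the full path of `name` if it is found on the search path, else None."""
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    for directory in search_path.split(os.pathsep):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if find_executable("mailx") is None:
    print("mailx not found on PATH; the alert mail would not be sent")
```

Finding the executable does not prove that mail delivery is configured, so sending a test mail by hand is still worthwhile.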

[Image: Alert Email]

Regards,

Ravish Mody


Sending Email Alert for WebLogic DataSource Monitoring State & Connections Usages

Hi,

Jay SenSharma

Monitoring resources is one of the most important tasks in almost every production environment. We have started a series of posts about getting alerts whenever something goes wrong or the WebLogic Application Server starts behaving abnormally. In this demo we will see how to get an e-mail alert whenever a DataSource is not “Running” or its “ActiveConnectionsCurrentCount” reaches the limit specified in the properties file.

This script monitors a particular instance of WebLogic Server and the DataSources targeted to that server. The WLST script can be further enhanced to monitor more than one WebLogic Server instance by putting the complete script inside a for loop. To keep the concept simple we do it on a single server only.
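
The for loop mentioned above could be organised roughly as follows. This is a sketch in plain Python under stated assumptions: the comma-separated list of URLs and the names parse_server_urls, monitor_all and check_server are hypothetical and do not appear in the script of this post. In WLST, check_server would wrap the connect(), the DataSource checks and a disconnect() for one server.

```python
import sys


def parse_server_urls(value):
    """Split a comma-separated property value into a list of URLs."""
    return [url.strip() for url in value.split(",") if url.strip()]


def monitor_all(urls, check_server):
    """Call check_server(url) for every URL; one failure does not stop the rest."""
    failed = []
    for url in urls:
        try:
            check_server(url)
        except Exception:
            print("Could not check " + url + ": " + str(sys.exc_info()[1]))
            failed.append(url)
    return failed


def check_server(url):
    # Placeholder for the per-server work (connect, check DataSources, disconnect)
    print("Checking " + url)


# Hypothetical property value holding two servers
urls = parse_server_urls("t3://10.10.10.10:9001, t3://10.10.10.11:9001")
monitor_all(urls, check_server)
```

Collecting the failed URLs instead of stopping at the first error keeps one unreachable server from hiding problems on the others.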

Note: The script below can be executed as a cron job at a regular interval, say every 5 or 10 minutes, depending on the requirement. This WLST script is supported from WLS 9.x onwards.

Steps to Create an Email Alert for WebLogic DataSource Monitoring State & Connections Usages

Step-1) Create a directory somewhere in your file system, for example “C:\WLST”.

Step-2) Write a properties file “details.properties” inside “C:\WLST” like the following:

#### Datasource.targetServer.url   Is the URL of the WebLogic Server in which the DataSource is Targeted.
datasource.targetServer.url=t3://10.10.10.10:9001
admin.username=weblogic
admin.password=weblogic

#### connectionPool.alert.limit    is the total number of Used Connections after which you want to get an alert
connectionPool.alert.limit=12

Step-3) Create a WLST script inside “C:\WLST” with a name like “monitorDS.py”. Its contents will be something like the following:

#############################################################################
#
# @author Copyright (c) 2010 - 2011 by Middleware Magic, All Rights Reserved.
#
#############################################################################
from java.io import FileInputStream
from java.util import Properties
import java.lang
import os
import string

propInputStream = FileInputStream("details.properties")
configProps = Properties()
configProps.load(propInputStream)

targetServerURL = configProps.get("datasource.targetServer.url")
adminUser = configProps.get("admin.username")
adminPassword = configProps.get("admin.password")
connectionPoolAlertLimit = configProps.get("connectionPool.alert.limit")

#############  This method would send the Alert Email  #################
def sendMailString():
	os.system('/bin/mailx -s  "ALERT: Connection Pool Health is in WARNING !!! " abcd@company.com < commectionLimit_file')
	print ''

def checkConnectionUsage(activeConnectionsCurrentCount,dataSourceName,dataSourceState):
	dsStateSendMail="yes"
	dsLimitSendMail="yes"
	state=" DataSourceName: " + dataSourceName
	check = string.find(dataSourceState,"Running")
	print 'Zero 0 Means Running :  Check State ' , str(check)
	if check != 0:
		state = state + " Checking Connection Pool HealthState: " + dataSourceState
		print '!!!! ALERT !!!! Connection Pool Health is in Not OK Sending E-Mail Alert.'
	else:
		dsStateSendMail="no"

	if int(activeConnectionsCurrentCount) >= int (connectionPoolAlertLimit):
		state = state + " Connection Pool Is Crossing the Alert Limit ActiveConnectionsCurrentCount = " + str(activeConnectionsCurrentCount)
		print '!!!! ALERT !!!! Connection Pool Connection Pool Is Crossing The Alert Limit.'
		print ''
	else:
		dsLimitSendMail="no"

	checkData_A = string.find(dsStateSendMail,"yes")
	checkData_B = string.find(dsLimitSendMail,"yes")
	if  ((checkData_A==0) | (checkData_B==0)):
		cmd = "echo " + state +" > commectionLimit_file"
		os.system(cmd)
		sendMailString()
	else:
		print 'Every Thing in Pool Is OK For DataSource ', dataSourceName

#############  Main Execution @ Middleware Magic 2010 #################
connect(adminUser, adminPassword, targetServerURL)
serverRuntime()
dsMBeans = cmo.getJDBCServiceRuntime().getJDBCDataSourceRuntimeMBeans()
for ds in dsMBeans:
	print 'DS name is: '+ds.getName()
	print 'State is ' +ds.getState()
	dataSourceName=ds.getName()
	dataSourceState=ds.getState()
	# ActiveConnectionsCurrentCount: Means The number of connections currently in use by applications.
	activeConnectionsCurrentCount=ds.getActiveConnectionsCurrentCount()
	currCapacity=ds.getCurrCapacity()
	print 'DS ActiveConnectionsCurrentCount: ', activeConnectionsCurrentCount
	checkConnectionUsage(activeConnectionsCurrentCount,dataSourceName,dataSourceState)
	print '------------------------------------'
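
The decision made inside checkConnectionUsage() can be expressed as a small function that is easy to test without a server. The sketch below is plain Python, and the name datasource_alerts is made up for illustration. It applies the same two rules as the script: the state must start with "Running", and the active connection count must stay below the limit.

```python
def datasource_alerts(name, state, active_connections, alert_limit):
    """Return a list of alert reasons; an empty list means the pool looks fine."""
    reasons = []
    if not state.startswith("Running"):
        reasons.append("DataSource " + name + " state is " + state)
    if int(active_connections) >= int(alert_limit):
        reasons.append("DataSource " + name + " ActiveConnectionsCurrentCount = "
                       + str(active_connections) + " (limit " + str(alert_limit) + ")")
    return reasons


# Values taken from the sample run of this post, plus one healthy case
print(datasource_alerts("SQLAuthDS", "Suspended", 0, 12))
print(datasource_alerts("PointBaseDataSource", "Running", 12, 12))
print(datasource_alerts("PointBaseDataSource", "Running", 3, 12))
```

Returning the reasons as a list makes it straightforward to put all of them into a single mail body instead of sending one mail per condition.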

Step-4) Open a command prompt and run “setWLSEnv.cmd” or “setWLSEnv.sh” to set the CLASSPATH and PATH variables. It is a good idea to run echo %CLASSPATH% or echo $CLASSPATH afterwards to verify that the CLASSPATH is set properly. If the CLASSPATH is still empty after running “setWLSEnv.sh”, please refer to the note at Step3) in the following post:

http://middlewaremagic.com/weblogic/?page_id=1492

Step-5) Now run the WLST script in the same command prompt as follows:

java weblogic.WLST monitorDS.py

Initializing WebLogic Scripting Tool (WLST) ...

Welcome to WebLogic Server Administration Scripting Shell

Type help() for help on available commands

Connecting to t3://10.10.10.10:9001 with userid weblogic ...
Successfully connected to Admin Server 'AdminServer' that belongs to domain 'base_domain'.

Warning: An insecure protocol was used to connect to the
server. To ensure on-the-wire security, the SSL port or
Admin port should be used instead.

Location changed to serverRuntime tree. This is a read-only tree with ServerRuntimeMBean as the root.
For more help, use help(serverRuntime)

DS name is: SQLAuthDS
State is Suspended
DS ActiveConnectionsCurrentCount:  0
Zero 0 Means Running :  Check State  -1
!!!! ALERT !!!! Connection Pool Health is in Not OK Sending E-Mail Alert.

-----
DS name is: PointBaseDataSource
State is Running
DS ActiveConnectionsCurrentCount:  12
Zero 0 Means Running :  Check State  0
!!!! ALERT !!!! Connection Pool Connection Pool Is Crossing The Alert Limit.

-----

NOTE: This script uses the mailx utility, which is not available on a Windows box. Please check that mailx is configured properly; otherwise the script will run without errors but the mail will not be sent.

[Image: ConnectionPool LimitExceeded]

[Image: DataSource State]

Thanks

Jay SenSharma


Sending Email Alert for Threads Pool Health Using WLST

Ravish Mody

This post shows how to send an e-mail alert to administrators about the Thread Pool health state. As we all know, if the health state in WebLogic is WARNING, something is not going right in the server and it has to be checked. This WLST script should therefore help WebLogic administrators who work in production environments.

The script below uses a properties file in which you can give all the details about the domain, as well as the time interval and the number of times the health state has to be checked.

Steps to Create an Email Alert for Threads Pool Health

Step-1) Create a directory somewhere in your file system, for example “C:\WLST”.

Step-2) Write a properties file “domains.properties” inside “C:\WLST” like the following:

admin.url=t3://localhost:8001
admin.username=weblogic
admin.password=weblogic

# Number of times the health state has to be checked
checkTimes_Number=25

# TIME INTERVAL between two health state checks (30000 milliseconds = 30 seconds)
checkInterval_in_Milliseconds=30000

############ According to the above values the checker will run 25 times in total, at an interval of 30 seconds each. #############

Step-3) Create a WLST script inside “C:\WLST” with a name like “Alert_ThreadPoolHealth.py”. Its contents will be something like the following:

#############################################################################
#
# @author Copyright (c) 2010 - 2011 by Middleware Magic, All Rights Reserved.
#
#############################################################################

from java.io import FileInputStream
from java.util import Properties
from java.lang import Thread
import java.lang
import os
import string

propInputStream = FileInputStream("domains.properties")
configProps = Properties()
configProps.load(propInputStream)

# The properties file of this post defines the URL under the key admin.url
serverurl = configProps.get("admin.url")
adminUser = configProps.get("admin.username")
adminPassword = configProps.get("admin.password")
checkTimes_Number = configProps.get("checkTimes_Number")
checkInterval_in_Milliseconds = configProps.get("checkInterval_in_Milliseconds")

i = 0
y = int(checkTimes_Number)

#############  This method would send the Alert Email  #################
def sendMailString():
	os.system('/bin/mailx -s  "ALERT: Thread Pool Health is in WARNING !!! " ABCD@COMPANY.com < rw_file')
	print '*********  ALERT MAIL HAS BEEN SENT  ***********'
	print ''

#############  This method is checking the Thread Pool Health   #################
def alertThreadPoolHealth(healthState):
	state ="Checking HealthState: " + healthState
	check = string.find(state,"HEALTH_WARN")
	if check != -1:
		print '!!!! ALERT !!!! Thread Pool Health is in WARNING State'
		print ''
		message =  'Please Check the Thread Pool is in WARNING State.'
		cmd = "echo " + message +" > rw_file"
		os.system(cmd)
		sendMailString()
	else:
		print 'Everything is working fine till now'

connect(adminUser,adminPassword,serverurl)
serverRuntime()
cd('ThreadPoolRuntime/ThreadPoolRuntime')

while (i < y):
	ls()
	healthState=cmo.getHealthState();
	alertThreadPoolHealth(str(healthState));
	Thread.sleep(int(checkInterval_in_Milliseconds))
	i = i + 1
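
The scripts in these posts build the mail body with an echo command passed to os.system(), so any shell metacharacters in the message (for example > or ;) would be interpreted by the shell. A possible alternative, shown here only as a sketch in plain Python with the hypothetical helper write_alert_file, is to write the file directly:

```python
import os
import tempfile


def write_alert_file(path, message):
    """Write the alert text to `path` without passing it through a shell."""
    handle = open(path, "w")
    try:
        handle.write(message + "\n")
    finally:
        handle.close()


# Example file location chosen for this sketch only
alert_path = os.path.join(tempfile.gettempdir(), "rw_file_example")
write_alert_file(alert_path, "Thread Pool is in WARNING State; stuck threads > 0")

handle = open(alert_path)
try:
    print(handle.read())
finally:
    handle.close()
```

The mailx call can then read the file exactly as before, since only the way the file is produced changes.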

Step-4) Open a command prompt and run “setWLSEnv.cmd” or “setWLSEnv.sh” to set the CLASSPATH and PATH variables. It is a good idea to run echo %CLASSPATH% or echo $CLASSPATH afterwards to verify that the CLASSPATH is set properly. If the CLASSPATH is still empty after running “setWLSEnv.sh”, please refer to the note at Step3) in the following post: http://middlewaremagic.com/weblogic/?page_id=1492

Step-5) Now run the WLST script in the same command prompt as follows:

java weblogic.WLST Alert_ThreadPoolHealth.py

You will see results like the following in the command prompt:

java weblogic.WLST Alert_ThreadPoolHealth.py

Initializing WebLogic Scripting Tool (WLST) ...
Welcome to WebLogic Server Administration Scripting Shell
Type help() for help on available commands

Connecting to t3://localhost:8001 with userid weblogic ...
Successfully connected to Admin Server 'AdminServer' that belongs to domain 'Domain_8001'.

Warning: An insecure protocol was used to connect to the
server. To ensure on-the-wire security, the SSL port or
Admin port should be used instead.

Location changed to serverRuntime tree. This is a read-only tree with ServerRuntimeMBean as the root.
For more help, use help(serverRuntime)
-r--   CompletedRequestCount                        3714
-r--   ExecuteThreadIdleCount                       0
-r--   ExecuteThreadTotalCount                      6
-r--   ExecuteThreads                               weblogic.work.ExecuteThreadRuntime[weblogic.work.ExecuteThreadRuntime@39c4e135, weblogic.work.ExecuteThreadRuntime@2c9d7c34, weblogic.work.ExecuteThreadRuntime@174550ce, weblogic.work.ExecuteThreadRuntime@47bc1051, weblogic.work.ExecuteThreadRuntime@bd35aa2, weblogic.work.ExecuteThreadRuntime@60e347be]
-r--   HealthState                                  Component:threadpool,State:HEALTH_WARN,MBean:ThreadPoolRuntime,ReasonCode:[ThreadPool has stuck threads]
-r--   HoggingThreadCount                           2
-r--   MinThreadsConstraintsCompleted               94
-r--   MinThreadsConstraintsPending                 0
-r--   Name                                         ThreadPoolRuntime
-r--   PendingUserRequestCount                      0
-r--   QueueLength                                  0
-r--   SharedCapacityForWorkManagers                65536
-r--   StandbyThreadCount                           3
-r--   Suspended                                    false
-r--   Throughput                                   9.495252373813093
-r--   Type                                         ThreadPoolRuntime
-r-x   preDeregister                                Void :

!!!! ALERT !!!! Thread Pool Health is in WARNING State

*********  ALERT MAIL HAS BEEN SENT  ***********
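
The HealthState value printed by ls() also carries a reason, which would be useful in the alert mail. The sketch below is plain Python with a hypothetical helper parse_health_state. It assumes the string form shown in the sample output above; other WebLogic versions may format the value differently.

```python
def parse_health_state(text):
    """Split 'Component:..,State:..,MBean:..,ReasonCode:[..]' into a dict."""
    fields = {}
    marker = ",ReasonCode:"
    pos = text.find(marker)
    if pos != -1:
        # The reason may itself contain commas, so it is cut off first
        fields["ReasonCode"] = text[pos + len(marker):].strip("[]")
        text = text[:pos]
    for part in text.split(","):
        if ":" in part:
            key, value = part.split(":", 1)
            fields[key.strip()] = value.strip()
    return fields


sample = ("Component:threadpool,State:HEALTH_WARN,MBean:ThreadPoolRuntime,"
          "ReasonCode:[ThreadPool has stuck threads]")
health = parse_health_state(sample)
print(health["State"])
print(health["ReasonCode"])
```

For the sample string this yields the state HEALTH_WARN and the reason "ThreadPool has stuck threads", which could be appended to the alert message.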

NOTE: This script uses the mailx utility, which is not available on a Windows box. Please check that mailx is configured properly; otherwise the script will run without errors but the mail will not be sent.

[Image: Alert Email]

Regards,

Ravish Mody


Copyright © 2010-2012 Middleware Magic. All rights reserved.