Ravish Mody

Almost every 24×7 production environment needs continuous monitoring of server health and thread activity. A very common requirement is to collect a thread dump as soon as a [STUCK] thread appears on any WebLogic Server. Most of the time we end up doing post-mortem analysis instead: when the stuck-thread issue occurred or the WebLogic Server hung, nobody was able to collect thread dumps to investigate. Support teams run into this all the time.

Working in Middleware Support, we have often faced the same problem: customers could not collect thread dumps while the issue was occurring, which delayed both the resolution and the search for the root cause.

To avoid this kind of situation, we developed an automatic WLST script with the following features:

Features of This Script:

  1. Ready to Use: The script is ready to use; you do not need to edit anything in the WLST script except the email address on line 32.
  2. Flexibility: You only need to change the values in the “domains.properties” file, such as how many thread dumps to collect when the issue occurs.
  3. E-Mail Alert: The administrator learns about the issue immediately through an e-mail alert.
  4. Thread Dumps in Mail: The complete thread dumps are sent to the administrator by e-mail, so there is no need to worry about collecting them.
  5. Independent Script: This WLST script runs on its own, without any cron job utility provided by the operating system (although it can be scheduled with cron as well), which gives administrators more flexibility.

Steps to Create an Email Alert For Stuck Threads With Thread Dumps

Step-1) Create a directory somewhere in your file system, for example “C:\WLST”.

Step-2) Create a properties file named “domains.properties” inside “C:\WLST” with contents like the following:

server.url=t3://localhost:8004
admin.username=weblogic
admin.password=weblogic
monitoring.server.name=MS-3

# ExecuteThread_Vs_HoggerThreadRatio is the threshold for ExecuteThreadTotalCount / HoggingThreadCount; an alert is raised when the ratio is at or below this value
ExecuteThread_Vs_HoggerThreadRatio=2

# Number of times the RATIO has to be checked
checkTimes_Number=3

# TIME INTERVAL between number of times the RATIO has to be checked (60000 milliseconds = 60 seconds)
checkInterval_in_Milliseconds=60000

# Number of times the Thread Dump has to be taken
threadDumpTimes_Number=5

# TIME INTERVAL between consecutive thread dumps (10000 milliseconds = 10 seconds)
threadDumpInterval_in_Milliseconds=10000

# Number of times to send thread dumps in mail in case of a stuck/hogging thread issue
sendEmail_ThreadDump_Counter=2
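To get a feel for what these values mean in practice, here is a minimal sketch in plain Python (meant to be run outside WLST) that parses a few of the properties and reports the time spans they imply, going by the property comments. The helper names `parse_properties`, `monitoring_window_seconds` and `dump_window_seconds` are hypothetical and are not part of the script in this post.

```python
# Hypothetical helpers, not part of the original WLST script.
SAMPLE = """\
# Number of times the RATIO has to be checked
checkTimes_Number=3
checkInterval_in_Milliseconds=60000
threadDumpTimes_Number=5
threadDumpInterval_in_Milliseconds=10000
"""

def parse_properties(text):
    """Return a dict of key/value pairs, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props

def monitoring_window_seconds(props):
    """Approximate time spent polling: number of checks times the check interval."""
    checks = int(props["checkTimes_Number"])
    interval_ms = int(props["checkInterval_in_Milliseconds"])
    return checks * interval_ms // 1000

def dump_window_seconds(props):
    """Approximate time spent on one round of dumps: dump count times dump interval."""
    dumps = int(props["threadDumpTimes_Number"])
    interval_ms = int(props["threadDumpInterval_in_Milliseconds"])
    return dumps * interval_ms // 1000

props = parse_properties(SAMPLE)
print(monitoring_window_seconds(props))  # 180
print(dump_window_seconds(props))        # 50
```

With the sample values, the script watches the server for roughly three minutes per run, so a longer watch means raising checkTimes_Number or scheduling the script repeatedly.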

Step-3) Create a WLST script named something like “Alert_StuckThread_ThreadDumps.py” inside “C:\WLST” with contents like the following:

#############################################################################
#
# @author Copyright (c) 2010 - 2011 by Middleware Magic, All Rights Reserved.
#
#############################################################################

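from java.io import FileInputStream
from java.util import Properties
from java.lang import Thread
import os

# Load the settings from domains.properties in the current directory
propInputStream = FileInputStream("domains.properties")
configProps = Properties()
configProps.load(propInputStream)
propInputStream.close()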

adminUrl = configProps.get("server.url")
adminUser = configProps.get("admin.username")
adminPassword = configProps.get("admin.password")
monitoringServerName = configProps.get("monitoring.server.name")

executeThread_Vs_HoggerThreadRatio = configProps.get("ExecuteThread_Vs_HoggerThreadRatio")
checkTimes_Number = configProps.get("checkTimes_Number")
checkInterval_in_Milliseconds = configProps.get("checkInterval_in_Milliseconds")
threadDumpTimes_Number = configProps.get("threadDumpTimes_Number")
threadDumpInterval_in_Milliseconds = configProps.get("threadDumpInterval_in_Milliseconds")
sendEmail_ThreadDump_Counter = configProps.get("sendEmail_ThreadDump_Counter")

i = 0
y = int(checkTimes_Number)

#############  This method would send the Alert Email with Thread Dump  #################
def sendMailThreadDump():
	os.system('/bin/mailx -s "ALERT: CHECK Thread Dumps as Hogger Thread Count Exceeded the Limit !!!" abcd@company.com < All_ThreadDump.txt')
	print '*********  ALERT MAIL HAS BEEN SENT  ***********'
	print ''

#############  This method is checking the Hogger Threads Ratio  #################
def alertHoggerThreads(executeTTC , hoggerTC):
	print 'Execute Threads : ', executeTTC
	print 'Hogger Thread Count : ', hoggerTC
	print 'executeThread_Vs_HoggerThreadRatio :', executeThread_Vs_HoggerThreadRatio
	if hoggerTC != 0:
		ratio=(executeTTC/hoggerTC)
		print 'Ratio : ' , ratio
		print ''
		if (int(ratio) <= int(executeThread_Vs_HoggerThreadRatio)):
			print ' !!!! ALERT !!!! Stuck Threads are on its way.....'
			print ''
			message =  'ExecuteThreads Count= ' + str(executeTTC) + '   HoggingThreads= '+ str(hoggerTC) +'   ExecuteThreads/HoggingThreads Ratio= '+ str(ratio)
			cmd = "echo " + message +" > rw_file"
			os.system(cmd)
			genrateThreadDump()
		else:
			print '++++++++++++++++++++++++++++++++++++'
			print 'Everything is working fine till now'
			print '++++++++++++++++++++++++++++++++++++'
	else:
		print '++++++++++++++++++++++++++++++++++++'
		print 'Everything is working fine till now'
		print '++++++++++++++++++++++++++++++++++++'

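#############  This method is Taking the Thread Dumps #################
def genrateThreadDump():
	b = int(sendEmail_ThreadDump_Counter)
	a = 0
	p = 0
	q = int(threadDumpTimes_Number)
	serverConfig()
	cd ('Servers/'+ monitoringServerName)
	# Stop at whichever limit is reached first; without the second condition the
	# loop never ends when sendEmail_ThreadDump_Counter < threadDumpTimes_Number.
	while (p < q) and (a < b):
		print 'Taking Thread Dump : ', p
		threadDump()
		# Assumes monitoring.server.name is the server WLST is connected to,
		# so that the dump file is named Thread_Dump_<server name>.txt
		cmd = "cat Thread_Dump_" + monitoringServerName + ".txt >> All_ThreadDump.txt"
		os.system(cmd)
		print 'Thread Dump Collected : ', p ,' now Sleeping for ', int(threadDumpInterval_in_Milliseconds) , ' Milliseconds ...'
		print ''
		# Wait the dump interval (not the check interval) between dumps
		Thread.sleep(int(threadDumpInterval_in_Milliseconds))
		b = b - 1
		p = p + 1
	sendMailThreadDump()
	cmd = "rm -f All_ThreadDump.txt"
	os.system(cmd)
	serverRuntime()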

connect(adminUser,adminPassword,adminUrl)
serverRuntime()
cd('ThreadPoolRuntime/ThreadPoolRuntime')

while (i < y):
	executeTTC = cmo.getExecuteThreadTotalCount()
	hoggerTC = cmo.getHoggingThreadCount()
	alertHoggerThreads(executeTTC , hoggerTC)
	print 'Sleeping for ', int(checkInterval_in_Milliseconds) , ' Milliseconds ...'
	print ''
	Thread.sleep(int(checkInterval_in_Milliseconds))
	i = i + 1
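The two core pieces of logic in the script, the ratio check and the repeated dump collection, can be tried outside WLST. The following is only a sketch in plain Python: `should_alert`, `collect_dumps` and `fake_dump` are hypothetical names, and `fake_dump` merely stands in for the WLST `threadDump()` command.

```python
import time

def should_alert(execute_total, hogging, threshold):
    """Mirror the check in alertHoggerThreads, using integer division."""
    if hogging == 0:
        return False  # no hogging threads, nothing to compare
    ratio = execute_total // hogging
    return ratio <= threshold

def collect_dumps(take_dump, times, interval_ms, sleep=time.sleep):
    """Call take_dump(n) `times` times, pausing interval_ms between calls."""
    collected = []
    for n in range(times):
        collected.append(take_dump(n))
        if n < times - 1:  # no need to wait after the last dump
            sleep(interval_ms / 1000.0)
    return collected

# Stand-in for threadDump(); a real run would call the WLST command instead.
def fake_dump(n):
    return "dump %d" % n

print(should_alert(5, 2, 2))   # True  (5 // 2 = 2)
print(should_alert(10, 2, 2))  # False (10 // 2 = 5)
print(collect_dumps(fake_dump, times=3, interval_ms=0))
# ['dump 0', 'dump 1', 'dump 2']
```

Because the ratio uses integer division, a threshold of 2 fires once more than a third of the execute threads are hogging. A smaller threshold makes the alert less sensitive.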

Step-4) Open a command prompt and run “setWLSEnv.cmd” or “setWLSEnv.sh” to set the CLASSPATH and PATH variables. It is a good idea to run echo %CLASSPATH% (Windows) or echo $CLASSPATH (Unix) to confirm that the CLASSPATH was set properly. If the CLASSPATH is still empty after running “setWLSEnv.sh”, refer to the Note mentioned at Step3) in the following post: http://middlewaremagic.com/weblogic/?page_id=1492

Step-5) Now run the WLST Script in the same command prompt using the following command:

java weblogic.WLST Alert_StuckThread_ThreadDumps.py

You will see output like the following in the command prompt:

$ java weblogic.WLST Alert_StuckThread_ThreadDumps.py

Initializing WebLogic Scripting Tool (WLST) ...

Welcome to WebLogic Server Administration Scripting Shell

Type help() for help on available commands

Connecting to t3://localhost:8004 with userid weblogic ...
Successfully connected to managed Server 'MS-3' that belongs to domain 'Domain_8001'.

Warning: An insecure protocol was used to connect to the
server. To ensure on-the-wire security, the SSL port or
Admin port should be used instead.

Location changed to serverRuntime tree. This is a read-only tree with ServerRuntimeMBean as the root.
For more help, use help(serverRuntime)

Execute Threads :  5
Hogger Thread Count :  2
executeThread_Vs_HoggerThreadRatio : 2
Ratio :  2

 !!!! ALERT !!!! Stuck Threads are on its way.....

Taking Thread Dump :  0
Thread dump for the running server: MS-3
"[STANDBY] ExecuteThread: '4' for queue: 'weblogic.kernel.Default (self-tuning)'" waiting for lock weblogic.work.ExecuteThread@1e2ba602 WAITING
	java.lang.Object.wait(Native Method)
	java.lang.Object.wait(Object.java:485)
	weblogic.work.ExecuteThread.waitForRequest(ExecuteThread.java:157)
	weblogic.work.ExecuteThread.run(ExecuteThread.java:178)

"[STANDBY] ExecuteThread: '3' for queue: 'weblogic.kernel.Default (self-tuning)'" waiting for lock weblogic.work.ExecuteThread@439fdcc7 WAITING
	java.lang.Object.wait(Native Method)
	java.lang.Object.wait(Object.java:485)
	weblogic.work.ExecuteThread.waitForRequest(ExecuteThread.java:157)
	weblogic.work.ExecuteThread.run(ExecuteThread.java:178)

"DynamicListenThread[Default[2]]" RUNNABLE native
	java.net.PlainSocketImpl.socketAccept(Native Method)
	java.net.PlainSocketImpl.accept(PlainSocketImpl.java:390)
	java.net.ServerSocket.implAccept(ServerSocket.java:453)
	java.net.ServerSocket.accept(ServerSocket.java:421)
	weblogic.socket.WeblogicServerSocket.accept(WeblogicServerSocket.java:38)
	weblogic.server.channels.DynamicListenThread$SocketAccepter.accept(DynamicListenThread.java:523)
	weblogic.server.channels.DynamicListenThread$SocketAccepter.access$200(DynamicListenThread.java:415)
	weblogic.server.channels.DynamicListenThread.run(DynamicListenThread.java:166)
	java.lang.Thread.run(Thread.java:619)

"[STUCK] ExecuteThread: '1' for queue: 'weblogic.kernel.Default (self-tuning)'" RUNNABLE
	sun.misc.FloatingDecimal.doubleValue(FloatingDecimal.java:1531)
	java.lang.Double.parseDouble(Double.java:510)
	jsp_servlet.__index._jspService(__index.java:71)
	weblogic.servlet.jsp.JspBase.service(JspBase.java:34)
	weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:227)
	weblogic.servlet.internal.StubSecurityHelper.invokeServlet(StubSecurityHelper.java:125)
	weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:292)
	weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:175)
	weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:3498)
	weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321)
	weblogic.security.service.SecurityManager.runAs(Unknown Source)
	weblogic.servlet.internal.WebAppServletContext.securedExecute(WebAppServletContext.java:2180)
	weblogic.servlet.internal.WebAppServletContext.execute(WebAppServletContext.java:2086)
	weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1406)
	weblogic.work.ExecuteThread.execute(ExecuteThread.java:201)
	weblogic.work.ExecuteThread.run(ExecuteThread.java:173)
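In a dump like this one, the [STUCK] entries are usually the first thing to look at. As a rough sketch, assuming the dump text has the layout shown above (a quoted header line per thread followed by its stack frames), the following plain Python pulls out only the stuck threads. The function name `stuck_threads` is hypothetical.

```python
def stuck_threads(dump_text):
    """Return a list of (header_line, [stack frames]) for every [STUCK] thread."""
    result = []
    frames = None
    for line in dump_text.splitlines():
        if line.startswith('"[STUCK]'):
            frames = []
            result.append((line.strip(), frames))
        elif line.startswith('"') or not line.strip():
            frames = None  # another thread's header or a blank line ends the block
        elif frames is not None:
            frames.append(line.strip())
    return result

SAMPLE = '''"[STANDBY] ExecuteThread: '4' for queue: 'weblogic.kernel.Default (self-tuning)'" waiting for lock weblogic.work.ExecuteThread@1e2ba602 WAITING
\tjava.lang.Object.wait(Native Method)

"[STUCK] ExecuteThread: '1' for queue: 'weblogic.kernel.Default (self-tuning)'" RUNNABLE
\tsun.misc.FloatingDecimal.doubleValue(FloatingDecimal.java:1531)
\tjava.lang.Double.parseDouble(Double.java:510)
'''

for header, frames in stuck_threads(SAMPLE):
    print(header)
    print("  top frame:", frames[0])
```

When several dumps are appended to one file, the same thread shows up once per dump, which makes it easy to see whether it is sitting in the same frame each time.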

NOTE: This script sends the mail with the mailx utility, which is not available on a Windows box, and it also relies on the Unix commands cat and rm. Please check that mailx is configured properly on your machine; otherwise the script will run without errors but the mail will not be sent.
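Where mailx is not available, one possible alternative is to send the collected dump from a separate Python 3 step using the standard smtplib module. This is not what the script in this post does, and it is not meant to run inside WLST. It is a sketch under assumptions: the sender address, the SMTP host and port, and the function names `build_alert_message` and `send_alert` are all hypothetical.

```python
import smtplib
from email.message import EmailMessage

def build_alert_message(sender, recipient, dump_text):
    """Build the alert mail with the thread dump text as the body."""
    msg = EmailMessage()
    msg["Subject"] = "ALERT: CHECK Thread Dumps as Hogger Thread Count Exceeded the Limit"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(dump_text)
    return msg

def send_alert(msg, host="localhost", port=25):
    """Hand the message to an SMTP relay; host and port are assumptions."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.send_message(msg)

# Only the message is built here; call send_alert(message) once a relay is reachable.
message = build_alert_message(
    "wlst@company.com",           # hypothetical sender address
    "abcd@company.com",
    "thread dump text goes here",
)
print(message["Subject"])
```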

Alert Email


Regards,

Ravish Mody