Tag: Hogger Thread

Sending Email Alert For Stuck Threads With Thread Dumps

Ravish Mody

Almost every 24×7 production environment needs continuous monitoring of server health and thread activity. A very common requirement is to collect thread dumps as soon as a [STUCK] thread appears on any WebLogic Server instance. Most of the time we end up doing post-mortem analysis instead: when the stuck thread issue occurred or the WebLogic Server hung, nobody could collect the thread dumps needed to investigate. Support teams always have this excuse.

While working in Middleware Support we have faced the same problem many times: customers could not collect thread dumps while the issue was occurring, which delayed both the resolution and the search for the root cause.

To avoid this kind of situation we developed an automatic WLST script with the following features:

Features Of This Script:

  1. Ready To Use: The script is ready to use; you do not need to edit anything in the WLST script except the e-mail address on line 32.
  2. Flexibility: You only need to change the values in the “domains.properties” file, for example how many thread dumps to collect when the issue occurs.
  3. E-Mail Alert: The administrator learns about the issue immediately through an e-mail alert.
  4. Thread Dumps In Mail: The complete thread dumps are sent to the administrator in the e-mail, so there is no need to worry about collecting them.
  5. Independent Script: This WLST script can run on its own, without any cron job utility provided by the operating system (although it can also be scheduled through cron), which gives administrators more flexibility.

Steps to Create an Email Alert For Stuck Threads With Thread Dumps

Step1) Create a directory somewhere in your file system, for example: “C:\WLST”

Step2) Write a properties file “domains.properties” inside “C:\WLST” like the following:

server.url=t3://localhost:8004
admin.username=weblogic
admin.password=weblogic
monitoring.server.name=MS-3

# ExecuteThread_Vs_HoggerThreadRatio represents the result of dividing the ExecuteThread count by the HoggerThread count
ExecuteThread_Vs_HoggerThreadRatio=2

# Number of times the RATIO has to be checked
checkTimes_Number=3

# TIME INTERVAL between number of times the RATIO has to be checked (60000 milliseconds = 60 seconds)
checkInterval_in_Milliseconds=60000

# Number of times the Thread Dump has to be taken
threadDumpTimes_Number=5

# TIME INTERVAL between successive thread dumps (10000 milliseconds = 10 seconds)
threadDumpInterval_in_Milliseconds=10000

# Number of times to send thread dumps in mail in case of stuck/hogging thread issue
sendEmail_ThreadDump_Counter=2
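
To make the relationship between these settings concrete, here is a minimal sketch in plain Python (not WLST) that parses the same key=value format and works out how long the ratio checks and the dump collection would take as configured. The helper names parse_properties and summarize are hypothetical and not part of the script below; the dump figure is an upper bound that assumes one sleep per dump. Note that every value arrives as a string, which is why the script wraps them in int().

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()  # values stay strings, like Properties.get()
    return props


def summarize(props):
    """Return (seconds slept between ratio checks, seconds slept between dumps)."""
    check_s = int(props["checkTimes_Number"]) * int(props["checkInterval_in_Milliseconds"]) / 1000.0
    dump_s = int(props["threadDumpTimes_Number"]) * int(props["threadDumpInterval_in_Milliseconds"]) / 1000.0
    return check_s, dump_s


# Same values as the domains.properties file above
SAMPLE = """
# Number of times the RATIO has to be checked
checkTimes_Number=3
checkInterval_in_Milliseconds=60000
threadDumpTimes_Number=5
threadDumpInterval_in_Milliseconds=10000
"""

if __name__ == "__main__":
    check_s, dump_s = summarize(parse_properties(SAMPLE))
    print("ratio checks span about %d s, dump collection at most about %d s" % (check_s, dump_s))
```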

Step-3) Create a WLST script named, for example, “Alert_StuckThread_ThreadDumps.py” inside “C:\WLST” with contents like the following:

#############################################################################
#
# @author Copyright (c) 2010 - 2011 by Middleware Magic, All Rights Reserved.
#
#############################################################################

from java.io import FileInputStream
import java.lang
import os

propInputStream = FileInputStream("domains.properties")
configProps = Properties()
configProps.load(propInputStream)

adminUrl = configProps.get("server.url")
adminUser = configProps.get("admin.username")
adminPassword = configProps.get("admin.password")
monitoringServerName = configProps.get("monitoring.server.name")

executeThread_Vs_HoggerThreadRatio = configProps.get("ExecuteThread_Vs_HoggerThreadRatio")
checkTimes_Number = configProps.get("checkTimes_Number")
checkInterval_in_Milliseconds = configProps.get("checkInterval_in_Milliseconds")
threadDumpTimes_Number = configProps.get("threadDumpTimes_Number")
threadDumpInterval_in_Milliseconds = configProps.get("threadDumpInterval_in_Milliseconds")
sendEmail_ThreadDump_Counter = configProps.get("sendEmail_ThreadDump_Counter")

i = 0
y = int(checkTimes_Number)

#############  This method would send the Alert Email with Thread Dump  #################
def sendMailThreadDump():
	os.system('/bin/mailx -s  "ALERT: CHECK Thread Dumps as Hogger Thread Count Exceeded the Limit !!! " abcd@company.com < All_ThreadDump.txt')
	print '*********  ALERT MAIL HAS BEEN SENT  ***********'
	print ''

#############  This method is checking the Hogger Threads Ratio  #################
def alertHoggerThreads(executeTTC , hoggerTC):
	print 'Execute Threads : ', executeTTC
	print 'Hogger Thread Count : ', hoggerTC
	print 'executeThread_Vs_HoggerThreadRatio :', executeThread_Vs_HoggerThreadRatio
	if hoggerTC != 0:
		ratio=(executeTTC/hoggerTC)
		print 'Ratio : ' , ratio
		print ''
		if (int(ratio) <= int(executeThread_Vs_HoggerThreadRatio)):
			print ' !!!! ALERT !!!! Stuck Threads are on its way.....'
			print ''
			message =  'ExecuteThreads Count= ' + str(executeTTC) + '   HoggingThreads= '+ str(hoggerTC) +'   ExecuteThreads/HoggingThreads Ratio= '+ str(ratio)
			cmd = "echo " + message +" > rw_file"
			os.system(cmd)
			genrateThreadDump()
		else:
			print '++++++++++++++++++++++++++++++++++++'
			print 'Everything is working fine till now'
			print '++++++++++++++++++++++++++++++++++++'
	else:
		print '++++++++++++++++++++++++++++++++++++'
		print 'Everything is working fine till now'
		print '++++++++++++++++++++++++++++++++++++'

#############  This method is Taking the Thread Dumps #################
def genrateThreadDump():
	b = int(sendEmail_ThreadDump_Counter)
	a = 0
	p = 0
	q = int(threadDumpTimes_Number)
	serverConfig()
	cd ('Servers/'+ monitoringServerName)
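	while (p < q):
		if a < b:
			print 'Taking Thread Dump : ', p
			threadDump()
			# The dump file is named after the monitored server (Thread_Dump_MS-3.txt for MS-3)
			cmd = "cat Thread_Dump_" + monitoringServerName + ".txt >> All_ThreadDump.txt"
			os.system(cmd)
			print 'Thread Dump Collected : ', p ,' now Sleeping for ', int(threadDumpInterval_in_Milliseconds) , ' milliseconds ...'
			print ''
			# Wait the thread dump interval (not the ratio check interval) between dumps
			Thread.sleep(int(threadDumpInterval_in_Milliseconds))
			b = b - 1
		# Always advance the counter, otherwise the loop never ends once b reaches 0
		p = p + 1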
	sendMailThreadDump()
	cmd = "rm -f All_ThreadDump.txt"
	os.system(cmd)
	serverRuntime()

connect(adminUser,adminPassword,adminUrl)
serverRuntime()
cd('ThreadPoolRuntime/ThreadPoolRuntime')

while (i < y):
	executeTTC=cmo.getExecuteThreadTotalCount();
	hoggerTC=cmo.getHoggingThreadCount();
	alertHoggerThreads(executeTTC , hoggerTC)
	print 'Sleeping for ', int(checkInterval_in_Milliseconds) , ' ...'
	print ''
	Thread.sleep(int(checkInterval_in_Milliseconds))
	i = i + 1
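
The script shells out to cat and rm to merge and clean up the dump files, which ties it to Unix-like systems. As a rough sketch of an alternative, the same merge can be done with ordinary file I/O in Python. The helpers append_file and dump_file_name are hypothetical names introduced here, and the Thread_Dump_<serverName>.txt naming is an assumption taken from the Thread_Dump_MS-3.txt file name used in the script.

```python
import os


def dump_file_name(server_name):
    """File name assumed from the script: Thread_Dump_<serverName>.txt."""
    return "Thread_Dump_" + server_name + ".txt"


def append_file(source_path, target_path):
    """Append the contents of source_path to target_path, like `cat src >> dst`."""
    src = open(source_path, "rb")
    try:
        data = src.read()
    finally:
        src.close()
    dst = open(target_path, "ab")  # "ab" creates the file if it does not exist yet
    try:
        dst.write(data)
    finally:
        dst.close()


def remove_if_present(path):
    """Delete path if it exists, like `rm -f path`."""
    if os.path.exists(path):
        os.remove(path)
```

Inside the dump loop this would be called as append_file(dump_file_name(monitoringServerName), "All_ThreadDump.txt") after each threadDump().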

Step-4) Open a command prompt and run “setWLSEnv.cmd” or “setWLSEnv.sh” to set the CLASSPATH and PATH variables. It is a good idea to run echo %CLASSPATH% or echo $CLASSPATH afterwards to confirm that the CLASSPATH is set properly. If the CLASSPATH is still empty after running “setWLSEnv.sh”, refer to the note at Step3) in the following post: http://middlewaremagic.com/weblogic/?page_id=1492

Step-5) Now run the WLST Script in the same command prompt using the following command:

java weblogic.WLST Alert_StuckThread_ThreadDumps.py

You will see output like the following in the command prompt:

$ java weblogic.WLST Alert_StuckThread_ThreadDumps.py

Initializing WebLogic Scripting Tool (WLST) ...

Welcome to WebLogic Server Administration Scripting Shell

Type help() for help on available commands

Connecting to t3://localhost:8004 with userid weblogic ...
Successfully connected to managed Server 'MS-3' that belongs to domain 'Domain_8001'.

Warning: An insecure protocol was used to connect to the
server. To ensure on-the-wire security, the SSL port or
Admin port should be used instead.

Location changed to serverRuntime tree. This is a read-only tree with ServerRuntimeMBean as the root.
For more help, use help(serverRuntime)

Execute Threads :  5
Hogger Thread Count :  2
executeThread_Vs_HoggerThreadRatio : 2
Ratio :  2

 !!!! ALERT !!!! Stuck Threads are on its way.....

Taking Thread Dump :  0
Thread dump for the running server: MS-3
"[STANDBY] ExecuteThread: '4' for queue: 'weblogic.kernel.Default (self-tuning)'" waiting for lock weblogic.work.ExecuteThread@1e2ba602 WAITING
	java.lang.Object.wait(Native Method)
	java.lang.Object.wait(Object.java:485)
	weblogic.work.ExecuteThread.waitForRequest(ExecuteThread.java:157)
	weblogic.work.ExecuteThread.run(ExecuteThread.java:178)

"[STANDBY] ExecuteThread: '3' for queue: 'weblogic.kernel.Default (self-tuning)'" waiting for lock weblogic.work.ExecuteThread@439fdcc7 WAITING
	java.lang.Object.wait(Native Method)
	java.lang.Object.wait(Object.java:485)
	weblogic.work.ExecuteThread.waitForRequest(ExecuteThread.java:157)
	weblogic.work.ExecuteThread.run(ExecuteThread.java:178)

"DynamicListenThread[Default[2]]" RUNNABLE native
	java.net.PlainSocketImpl.socketAccept(Native Method)
	java.net.PlainSocketImpl.accept(PlainSocketImpl.java:390)
	java.net.ServerSocket.implAccept(ServerSocket.java:453)
	java.net.ServerSocket.accept(ServerSocket.java:421)
	weblogic.socket.WeblogicServerSocket.accept(WeblogicServerSocket.java:38)
	weblogic.server.channels.DynamicListenThread$SocketAccepter.accept(DynamicListenThread.java:523)
	weblogic.server.channels.DynamicListenThread$SocketAccepter.access$200(DynamicListenThread.java:415)
	weblogic.server.channels.DynamicListenThread.run(DynamicListenThread.java:166)
	java.lang.Thread.run(Thread.java:619)

"[STUCK] ExecuteThread: '1' for queue: 'weblogic.kernel.Default (self-tuning)'" RUNNABLE
	sun.misc.FloatingDecimal.doubleValue(FloatingDecimal.java:1531)
	java.lang.Double.parseDouble(Double.java:510)
	jsp_servlet.__index._jspService(__index.java:71)
	weblogic.servlet.jsp.JspBase.service(JspBase.java:34)
	weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:227)
	weblogic.servlet.internal.StubSecurityHelper.invokeServlet(StubSecurityHelper.java:125)
	weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:292)
	weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:175)
	weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:3498)
	weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321)
	weblogic.security.service.SecurityManager.runAs(Unknown Source)
	weblogic.servlet.internal.WebAppServletContext.securedExecute(WebAppServletContext.java:2180)
	weblogic.servlet.internal.WebAppServletContext.execute(WebAppServletContext.java:2086)
	weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1406)
	weblogic.work.ExecuteThread.execute(ExecuteThread.java:201)
	weblogic.work.ExecuteThread.run(ExecuteThread.java:173)
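
When several dumps arrive in one mail, a quick tally of the thread states can help decide where to look first. The following is a small sketch in plain Python, assuming the header format shown in the output above (a quoted name that starts with a state in square brackets followed by "ExecuteThread:"). The function name count_thread_states is hypothetical, and threads without such a prefix, like the DynamicListenThread above, are simply ignored.

```python
import re

# Matches execute thread headers such as:
# "[STUCK] ExecuteThread: '1' for queue: '...'" RUNNABLE
HEADER = re.compile(r'^"\[(\w+)\] ExecuteThread:')


def count_thread_states(dump_text):
    """Count execute threads per bracketed state ([STUCK], [STANDBY], ...)."""
    counts = {}
    for line in dump_text.splitlines():
        match = HEADER.match(line)
        if match:
            state = match.group(1)
            counts[state] = counts.get(state, 0) + 1
    return counts


# A few header lines copied from the output above
SAMPLE = (
    "\"[STANDBY] ExecuteThread: '4' for queue: 'weblogic.kernel.Default (self-tuning)'\" WAITING\n"
    "\tjava.lang.Object.wait(Native Method)\n"
    "\"DynamicListenThread[Default[2]]\" RUNNABLE native\n"
    "\"[STUCK] ExecuteThread: '1' for queue: 'weblogic.kernel.Default (self-tuning)'\" RUNNABLE\n"
)

if __name__ == "__main__":
    for state, count in sorted(count_thread_states(SAMPLE).items()):
        print("%s: %d" % (state, count))
```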

NOTE: This script uses the mailx utility, which is not available on a Windows box. Please check that mailx is configured properly on your system; otherwise the script will run without errors but the mail will not be sent.
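
Because a missing mailx fails silently, it can be worth checking for it before relying on the alert. Here is a minimal sketch in plain Python that searches a PATH-style string for an executable. The function name find_executable is hypothetical, and the check only confirms that the binary exists and is executable, not that mail delivery is configured.

```python
import os


def find_executable(name, search_path):
    """Return the first executable file called `name` on `search_path`, or None."""
    for directory in search_path.split(os.pathsep):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    mailx = find_executable("mailx", os.environ.get("PATH", ""))
    if mailx is None:
        print("mailx not found; the alert e-mail would not be sent")
    else:
        print("mailx found at " + mailx)
```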

Alert Email

Regards,

Ravish Mody


Sending Email Alert for Hogger Threads Count Using WLST

Ravish Mody

This post shows how to send an e-mail alert that warns administrators about hogger threads. Hogging threads can be seen as candidates for stuck threads; in other words, threads that “might” get stuck are called hogging threads. They are declared stuck after “StuckThreadMaxTime” seconds, which defaults to 600 seconds.

Most of you will probably agree that no one wants stuck threads in their production environment. This script should therefore help you take the relevant action as soon as you get an alert that the hogging threads have exceeded the given ratio.
The script below uses a properties file in which you provide the domain details as well as the hogger thread ratio, the time interval, and the number of times the ratio has to be checked.
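
The alert rule itself is small enough to state as a function. The sketch below, in plain Python, mirrors the check used by the script: the execute thread count is divided by the hogging thread count using integer division, and an alert is raised when the result is at or below the configured ratio. The name should_alert is hypothetical. For example, 5 execute threads and 2 hogging threads give an integer ratio of 2, which triggers the alert when the configured ratio is 2.

```python
def should_alert(execute_total, hogging, threshold):
    """Alert when execute_total / hogging (integer division) is <= threshold."""
    if hogging == 0:
        return False  # no hogging threads, so there is nothing to report
    return execute_total // hogging <= threshold


if __name__ == "__main__":
    for execute_total, hogging in [(5, 2), (6, 0), (25, 5)]:
        print("%d execute / %d hogging -> alert: %s"
              % (execute_total, hogging, should_alert(execute_total, hogging, 2)))
```

A lower ratio means a larger share of the pool is hogging, so a threshold of 2 fires roughly when half of the execute threads or more are hogging.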

Steps to Create an Email Alert for Hogger Threads Count

Step1) Create a directory somewhere in your file system, for example: “C:\WLST”

Step2) Write a properties file “domains.properties” inside “C:\WLST” like the following:

admin.url=t3://localhost:7001
admin.username=weblogic
admin.password=weblogic

# ExecuteThread_Vs_HoggerThreadRatio represents the result of dividing the ExecuteThread count by the HoggerThread count
ExecuteThread_Vs_HoggerThreadRatio=2

# Number of times the RATIO has to be checked
checkTimes_Number=25

# TIME INTERVAL between number of times the RATIO has to be checked (30000 milliseconds = 30 seconds)
checkInterval_in_Milliseconds=30000

############ According to the above values the checker will run 25 times in total, at an interval of 30 seconds each. #############
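
With these values the script stays busy for about 25 × 30 s = 750 s, or 12.5 minutes. The following plain Python sketch shows the same bounded polling pattern with the sleep function passed in, so the timing can be verified without actually waiting. The name run_checks and its parameters are hypothetical and only illustrate the loop structure, not the WLST calls.

```python
def run_checks(check, times, interval_ms, sleep):
    """Call check() `times` times, sleeping interval_ms after each call.

    Returns how many of the checks reported a problem.
    """
    alerts = 0
    for _ in range(times):
        if check():
            alerts += 1
        sleep(interval_ms / 1000.0)  # sleep functions usually take seconds
    return alerts


if __name__ == "__main__":
    slept = []  # records each requested sleep instead of really sleeping
    alerts = run_checks(lambda: False, 25, 30000, slept.append)
    print("checks: %d, alerts: %d, total wait: %.1f min"
          % (len(slept), alerts, sum(slept) / 60.0))
```

Passing time.sleep as the last argument would turn the dry run into a real one.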

Step-3) Create a WLST script named, for example, “Alert_HoggerThreadCoung.py” inside “C:\WLST” with contents like the following:

UPDATED [26-04-2011]: This script has been updated based on this comment: http://middlewaremagic.com/weblogic/?p=5423#comment-3760 (thanks to sathya).

#############################################################################
#
# @author Copyright (c) 2010 - 2011 by Middleware Magic, All Rights Reserved.
#
#############################################################################

from java.io import FileInputStream
import java.lang
import os

propInputStream = FileInputStream("domains.properties")
configProps = Properties()
configProps.load(propInputStream)

adminUrl = configProps.get("admin.url")
adminUser = configProps.get("admin.username")
adminPassword = configProps.get("admin.password")
executeThread_Vs_HoggerThreadRatio = configProps.get("ExecuteThread_Vs_HoggerThreadRatio")
checkTimes_Number = configProps.get("checkTimes_Number")
checkInterval_in_Milliseconds = configProps.get("checkInterval_in_Milliseconds")

i = 0
y = int(checkTimes_Number)

#############  This method would send the Alert Email  #################
def sendMailString():
	os.system('/bin/mailx -s  "ALERT: Hogger Thread Count Exceeded the Limit !!! " abcd@company.com < rw_file')
	print '*********  ALERT MAIL HAS BEEN SENT  ***********'
	print ''

#############  This method is checking the Hogger Threads Ratio  #################
def alertHoggerThreads(executeTTC , hoggerTC):
	print 'Execute Threads : ', executeTTC
	print 'Hogger Thread Count : ', hoggerTC
	if hoggerTC != 0:
		ratio=(executeTTC/hoggerTC)
		print 'Ratio : ' , ratio
		print ''
		if (int(ratio) <= int(executeThread_Vs_HoggerThreadRatio)):
			print ' !!!! ALERT !!!! Stuck Threads are on its way.....'
			print ''
			message =  'ExecuteThreads Count= ' + str(executeTTC) + '   HoggingThreads= '+ str(hoggerTC) +'   ExecuteThreads/HoggingThreads Ratio= '+ str(ratio)
			cmd = "echo " + message +" > rw_file"
			os.system(cmd)
			sendMailString()
		else:
			print '++++++++++++++++++++++++++++++++++++'
			print 'Everything is working fine till now'
			print '++++++++++++++++++++++++++++++++++++'
	else:
		print '++++++++++++++++++++++++++++++++++++'
		print 'Everything is working fine till now'
		print '++++++++++++++++++++++++++++++++++++'

connect(adminUser,adminPassword,adminUrl)
serverRuntime()
cd('ThreadPoolRuntime/ThreadPoolRuntime')

while (i < y):
	ls()
	executeTTC=cmo.getExecuteThreadTotalCount();
	hoggerTC=cmo.getHoggingThreadCount();
	alertHoggerThreads(executeTTC , hoggerTC)
	print 'Sleeping for ', int(checkInterval_in_Milliseconds) , ' ...'
	print ''
	Thread.sleep(int(checkInterval_in_Milliseconds))
	i = i + 1

Step-4) Open a command prompt and run “setWLSEnv.cmd” or “setWLSEnv.sh” to set the CLASSPATH and PATH variables. It is a good idea to run echo %CLASSPATH% or echo $CLASSPATH afterwards to confirm that the CLASSPATH is set properly. If the CLASSPATH is still empty after running “setWLSEnv.sh”, refer to the note at Step3) in the following post: http://middlewaremagic.com/weblogic/?page_id=1492

Step-5) Now run the WLST Script in the same command prompt using the following command:

java weblogic.WLST Alert_HoggerThreadCoung.py

You will see output like the following in the command prompt:

$ java weblogic.WLST Alert_HoggerThreadCoung.py

Initializing WebLogic Scripting Tool (WLST) ...

Welcome to WebLogic Server Administration Scripting Shell

Type help() for help on available commands

Connecting to t3://localhost:7001 with userid weblogic ...
Successfully connected to Admin Server 'AdminServer' that belongs to domain 'Domain_7001'.

Warning: An insecure protocol was used to connect to the
server. To ensure on-the-wire security, the SSL port or
Admin port should be used instead.

Location changed to serverRuntime tree. This is a read-only tree with ServerRuntimeMBean as the root.
For more help, use help(serverRuntime)

-r--   CompletedRequestCount                        2606
-r--   ExecuteThreadIdleCount                       1
-r--   ExecuteThreadTotalCount                      6
-r--   ExecuteThreads                               weblogic.work.ExecuteThreadRuntime[weblogic.work.ExecuteThreadRuntime@78de59f8, weblogic.work.ExecuteThreadRuntime@4de4e6c6, weblogic.work.ExecuteThreadRuntime@6eeaf91d, weblogic.work.ExecuteThreadRuntime@48917cf, weblogic.work.ExecuteThreadRuntime@447a195c, weblogic.work.ExecuteThreadRuntime@2c170a23]
-r--   HealthState                                  Component:threadpool,State:HEALTH_OK,MBean:ThreadPoolRuntime,ReasonCode:[]
-r--   HoggingThreadCount                           0
-r--   MinThreadsConstraintsCompleted               104
-r--   MinThreadsConstraintsPending                 0
-r--   Name                                         ThreadPoolRuntime
-r--   PendingUserRequestCount                      0
-r--   QueueLength                                  0
-r--   SharedCapacityForWorkManagers                65536
-r--   StandbyThreadCount                           4
-r--   Suspended                                    false
-r--   Throughput                                   5.0
-r--   Type                                         ThreadPoolRuntime

-r-x   preDeregister                                Void :

Execute Threads :  6
Hogger Thread Count :  0
++++++++++++++++++++++++++++++++++++
Everything is working fine till now
++++++++++++++++++++++++++++++++++++
Sleeping for  30000  ...

NOTE: This script uses the mailx utility, which is not available on a Windows box. Please check that mailx is configured properly on your system; otherwise the script will run without errors but the mail will not be sent.

Alert Email

Regards,

Ravish Mody


Copyright © 2010-2012 Middleware Magic. All rights reserved.