This post is inspired by a discussion on OTN, which showed that it is not always straightforward to set up a highly available environment in which failures are handled automatically. Automating the handling of a certain kind of failure means that the system itself has to take the steps needed to recover from it. For example, session failover in a cluster when a server goes down is easy to achieve. But what if we have a service that is targeted to a single server, a so-called singleton service? How do we make sure this service keeps running when a WebLogic instance fails? In this post, we address failure scenarios that involve JMS and JTA and the mechanisms that WebLogic provides for recovering from these failures.

Service migration

Most services are deployed homogeneously on all server instances in a cluster. In contrast, singleton services such as a JMS server and the JTA transaction recovery system are targeted to one server. When the server to which these services are targeted fails, we need to migrate the services to another server. More information on the service migration framework can be found here.

Two migration types are possible: manual and automatic. To set up manual migration, we need to perform the following steps:

  • Create managed servers.
  • Create a cluster and add the created managed servers to the cluster. Note that WebLogic automatically creates migratable targets (a special target that can migrate from one server in a cluster to another) for clustered managed servers.
  • Create machines and add the managed servers to the machines.
  • Configure migration:
    • Click Environment and then Migratable Targets.
    • Select the migratable target for which we want to configure migration and click the Migration Configuration tab.
    • Set the Service Migration Policy to Manual Service Migration Only.
    • Select the User-Preferred Server, i.e., the server to host the service.
    • Specify Constrained Candidate Servers that can host the service should the user-preferred server fail.
  • Create a persistent store and target it to the migratable target for which we have configured migration. Note that the persistent store will be used to store persistent JMS messages. The persistent store must be accessible from all candidate servers of the migratable target. This means that when we use a file store and the cluster spans multiple machines, we have to resort to a shared storage solution. In the case of a JDBC store, we have to make sure the data source is available to all candidate servers.
  • Create a JMS server and target it to the same migratable target as the persistent store. When a JMS server is targeted to a migratable target, it cannot use the default file store; we must use a custom persistent store.
  • Restart the affected servers. In the change center (top left), click the View changes and restarts link. Click on the Restart Checklist tab and restart the servers listed in the table.
  • To migrate a JMS service manually, use the migratable targets’ Control tab.
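
The targeting rules in the steps above (a custom store, targeted to the same migratable target as the JMS server) are easy to get wrong. The following is a minimal sketch in plain Python (not WLST) that checks them against a hand-written description of the configuration. The dictionary layout and the function name check_jms_targeting are assumptions made for this illustration; they are not part of any WebLogic API.

```python
def check_jms_targeting(jms_server, stores):
    """Return a list of problems found for one JMS server description.

    jms_server: {"store": <store name or None>, "target": <migratable target name>}
    stores:     {<store name>: {"target": <migratable target name>}}
    Both structures are hypothetical, hand-written descriptions.
    """
    problems = []
    store_name = jms_server.get("store")
    if store_name is None:
        # None stands for "no custom store configured", i.e. the default store.
        problems.append("JMS server uses the default store; a custom store is required")
        return problems
    store = stores.get(store_name)
    if store is None:
        problems.append("persistent store %r is not defined" % store_name)
        return problems
    if store["target"] != jms_server["target"]:
        problems.append("store target %r differs from JMS server target %r"
                        % (store["target"], jms_server["target"]))
    return problems


stores = {"FileStore": {"target": "TestServer1 (migratable)"}}
jms = {"store": "FileStore", "target": "TestServer1 (migratable)"}
print(check_jms_targeting(jms, stores))
```

An empty list means that the description satisfies both rules.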

When we use JTA transactions in the JMS application, the JTA service must be migrated as well. The JTA service plays a critical role in recovering from a failure scenario in which transactions are involved. In-flight transactions can hold locks on underlying resources. If the transaction manager is not available to recover these transactions, resources may hold on to these locks for a long time, which makes it difficult for an application to function properly. JTA service migration is possible only if the server’s default persistent store (where the JTA logs are kept) is accessible to the server to which the service will migrate. Some care must be taken as to how these files are shared. Distributed file systems such as NFS typically do not provide the necessary semantics to guarantee the integrity and content of transaction logs. Typically, this means we must use some high-end means for sharing the files, such as a Storage Area Network (SAN).

To migrate the JTA service, we must ensure that the default persistent store is accessible from all candidate servers in the cluster. By default, the default persistent store is located in the ${DOMAIN_HOME}/servers/<server-name>/data/store/default directory. Note that when the cluster spans multiple machines, we have to resort to shared storage. Recommendations concerning directory structures and set-up can be found here. To change the location of the default persistent store, click on the Services Configuration tab of a specific managed server and set the Directory attribute to the absolute path of the directory on the (shared) file system. Steps to manually migrate the JTA service are presented here.
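
To make the directory layout concrete, here is a small plain-Python sketch that builds the default store location described above and maps it to a directory on shared storage. The shared root /shared/tlogs and the per-server layout below it are assumptions chosen for this example only; the domain location is the one used in the test later in this post.

```python
import posixpath


def default_store_dir(domain_home, server_name):
    """Default location of a server's default persistent store."""
    return posixpath.join(domain_home, "servers", server_name,
                          "data", "store", "default")


def shared_store_dir(shared_root, server_name):
    """Hypothetical layout: one directory per server below a shared mount."""
    return posixpath.join(shared_root, server_name, "store", "default")


domain_home = "/home/oracle/weblogic/user_projects/domains/base_domain"
for server in ("TestServer1", "TestServer2"):
    print(server,
          default_store_dir(domain_home, server),
          "->",
          shared_store_dir("/shared/tlogs", server))
```

The second path of each line is the kind of value we would enter in the Directory attribute of the corresponding managed server.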

Setting up automatic service migration requires almost the same steps as setting up manual migration. Before we present the steps required for automatic service migration, we need to talk about leasing and automatic service migration policies.

Leasing is the process WebLogic uses to manage services that are required to run on only one server in the cluster at a time. Two leasing mechanisms are supported:

  • Database-Based Leasing – This style of leasing relies on a highly available database to coordinate the actions of the servers in the cluster. It is important to ensure that the database is always available and reachable by each migratable server, as the migratable server is only as reliable as the database. If a migratable server is unable to reach the database, it will shut itself down. To configure database leasing, we take the following steps:
    • Create the leasing table (leasing information is maintained in a database table). The schema definition for the table is located in a database vendor specific directory underneath the ${WL_HOME}/server/db directory in a file called leasing.ddl.
    • Configure a non-XA data source to access the leasing information.
    • On the cluster’s Migration Configuration tab, set the Migration Basis to Database and set the Data Source For Automatic Migration attribute to point to the leasing data source.
    • Additionally, change the Auto Migration Table Name attribute if the leasing table is named something other than ACTIVE, which is the default value.
    • The node manager is required if pre-migration and post-migration scripts are defined.
  • Consensus-Based Leasing – This style of leasing keeps the leasing table in memory. One server in the cluster is designated as the cluster leader. The cluster leader controls leasing in that it holds a copy of the leasing table in memory, and the other servers in the cluster communicate with the cluster leader to determine lease information. The leasing table is replicated across the cluster to ensure high availability. To configure consensus-based leasing, we take the following steps:
    • Use the cluster’s Migration Configuration tab and set the Migration Basis to Consensus.
    • Consensus-based leasing requires a node manager on every machine hosting managed servers within the cluster. The node manager is required to get health monitoring information about the involved servers.
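
The idea behind leasing can be illustrated with a toy model in plain Python. This is only a sketch of the general technique (a lease with an expiry time that must be renewed by its owner); it is not WebLogic's implementation, and the class name LeaseTable, the lease duration and the explicit now argument are assumptions made for the example.

```python
class LeaseTable:
    """Toy leasing table: each service has at most one owner at a time."""

    def __init__(self, lease_seconds):
        self.lease_seconds = lease_seconds
        self._leases = {}  # service name -> (owner, expiry time)

    def owner(self, service, now):
        """Return the current owner, or None if there is no valid lease."""
        entry = self._leases.get(service)
        if entry is None:
            return None
        holder, expires_at = entry
        return holder if now < expires_at else None

    def acquire(self, service, server, now):
        """Acquire or renew the lease; fail if another server holds it."""
        current = self.owner(service, now)
        if current is not None and current != server:
            return False
        self._leases[service] = (server, now + self.lease_seconds)
        return True


table = LeaseTable(lease_seconds=30)
print(table.acquire("JMSServer", "TestServer1", now=0))   # first owner
print(table.acquire("JMSServer", "TestServer2", now=10))  # refused, lease is held
print(table.acquire("JMSServer", "TestServer2", now=45))  # lease expired, takeover
```

When the owner stops renewing (because it failed or released the lease), the lease expires and another candidate can take over, which is the trigger for a migration.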

The migratable target performs health monitoring on the deployed migratable services and has a direct communication channel to the leasing system. When bad health is detected the migratable target requests the lease to be released in order to trigger a migration:

  • In the case of JTA, the server defaults to shutting down if the JTA system reports itself unhealthy, for example, if an I/O error occurs when accessing the default store. When a server fails, JTA is migrated to a candidate server.
  • In the case of JMS, the JMS server communicates its health to the monitoring system. When a dependent service such as a persistent store fails, for example due to errors in the I/O layer, it is detected by the migration framework. In this case the JMS server along with the persistent store (and path service when configured) is migrated to a candidate server.

We can choose two automatic service migration policies:

  • Auto-Migrate Exactly-Once Services – the service will run if at least one candidate server is available in the cluster. This can lead to the case that all migratable targets are running on a single server.
  • Auto-Migrate Failure Recovery Services – the service will only start if the user-preferred server is started. If the server is shut down by an administrator, the service will not be migrated. If the user-preferred server fails due to an internal error, the service will be migrated to a candidate server.
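
The difference between the two policies can be written down as a small decision function. The following plain-Python sketch models only the behaviour described in the two bullets above; the state names 'running', 'shutdown' and 'failed' and the function name host_for_service are assumptions for this illustration, and the real framework takes more factors into account.

```python
def host_for_service(policy, preferred, candidates, states):
    """Return the server that should host the service, or None.

    states maps a server name to 'running', 'shutdown' (stopped by an
    administrator) or 'failed' (stopped because of an internal error).
    """
    if states.get(preferred) == "running":
        return preferred
    running = [s for s in candidates
               if s != preferred and states.get(s) == "running"]
    fallback = running[0] if running else None
    if policy == "exactly-once":
        # Runs as long as at least one candidate server is available.
        return fallback
    if policy == "failure-recovery":
        # Migrates only when the user-preferred server has failed.
        return fallback if states.get(preferred) == "failed" else None
    raise ValueError("unknown policy: %r" % policy)


servers = ["TestServer1", "TestServer2"]
states = {"TestServer1": "shutdown", "TestServer2": "running"}
print(host_for_service("exactly-once", "TestServer1", servers, states))
print(host_for_service("failure-recovery", "TestServer1", servers, states))
```

With an administrative shutdown of the user-preferred server, the first call returns the other server and the second returns None, i.e., the service stays down until we migrate it manually.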

For JMS servers that host only members of distributed destinations, we can select the Auto-Migrate Failure Recovery Services option. This means that if we shut a server down, we need to migrate the service manually before the shutdown; otherwise, the service will be unavailable. If, on the other hand, a JMS destination is not part of a distributed destination and the application depends on access to that destination, we must select the Auto-Migrate Exactly-Once Services option so that the destination becomes available again as quickly as possible and the application does not fail.

To configure automatic service migration, we take the following steps:

  • Create managed servers.
  • Create a cluster and add the created managed servers to the cluster.
  • Create machines and add the managed servers to the machines.
  • Configure leasing: Use the cluster’s Migration Configuration tab and set the Migration Basis to either Database or Consensus.
    • In the case of database leasing:
      • Create the leasing table.
      • Create a data source to access the leasing table.
      • On the cluster’s Migration Configuration tab, set the Data Source for Auto Migration attribute to the data source.
      • Additionally edit the Auto Migration Table Name attribute and set it to the name of the leasing table.
    • In the case of consensus leasing:
      • Make sure that the node manager is configured on each machine that hosts managed servers in the cluster.
  • Configure migration:
    • Click Environment and then Migratable Targets.
    • Select the migratable target for which we want to configure migration and click the Migration Configuration tab.
    • Set the Service Migration Policy to either Auto-Migrate Exactly-Once Services or Auto-Migrate Failure Recovery Services.
    • Select the User-Preferred Server, i.e., the server to host the service.
    • Specify Constrained Candidate Servers that can host the service should the user-preferred server fail.
  • Create a persistent store and target it to the migratable target for which we have configured migration. Make sure that all candidate servers in the migratable target have access to the persistent store.
  • Create a JMS server and target it to the same migratable target as the persistent store.
  • Restart the affected servers.

To add automatic JTA service migration to the cluster already configured to support automatic JMS service migration, we must perform the following steps:

  • On each server’s Migration Configuration tab, select the Automatic JTA Migration Enabled check box.
  • Specify JTA Candidate Servers that can host the service, i.e., that can access the default persistent store for the current server. To migrate the JTA service from a failed server in a cluster to another server in the same cluster, the backup server must have access to the transaction log files from the failed server. WebLogic uses the default persistent store to store transaction log files. Thus when the cluster spans multiple machines, we must configure the default persistent store to store its data files in a file system on a shareable storage solution, such as a Storage Area Network (SAN) device, that is available to both the primary server and the server that will act as its backup.
  • Restart the affected servers.

Test

In the test we use the following software versions:
- jrrt-4.0.1-1.6.0-linux-x64.bin
- wls1035_generic.jar

To set up the test environment, we take the following steps:

  • Create a new domain.
  • Start the admin server.
  • Run the script script.py (presented below), by following these steps:
    • Set the WebLogic Server environment (run setWLSEnv.sh).
    • Run java weblogic.WLST.
    • Execute the script by using execfile('location_of_the_script/script.py').
  • Check in the admin console if the changes are satisfactory.
  • Stop the admin server, as we are going to use WLST to start up the whole environment.

The script script.py looks as follows:

beahome = '/home/oracle/weblogic';
pathseparator = '/';

adminusername='weblogic';
adminpassword='magic11g';

jvmlocation = beahome + pathseparator + 'jrrt-4.0.1-1.6.0';

print 'CONNECT TO ADMIN SERVER';
connect(adminusername, adminpassword);

print 'START EDIT MODE';
edit();
startEdit();

print 'CREATE MACHINE';
machine = cmo.createMachine('TestMachine');
machine.getNodeManager().setNMType('ssl');

print 'CREATE CLUSTER';
cluster = cmo.createCluster('TestCluster');
cluster.setClusterMessagingMode('unicast');

print 'CREATE MANAGED SERVER: testSERVER1';
testServer1 = cmo.createServer('TestServer1');
testServer1.setListenPort(7002);
testServer1.setAutoRestart(true);
testServer1.setAutoKillIfFailed(true);
testServer1.setRestartMax(2);
testServer1.setRestartDelaySeconds(10);
testServer1.getServerStart().setJavaHome(jvmlocation);
testServer1.getServerStart().setJavaVendor('Oracle');
testServer1.getServerStart().setArguments('-jrockit -Xms1024m -Xmx1024m -Xns256m -Xgcprio:pausetime -XpauseTarget:200ms');

print 'CREATE MANAGED SERVER: testSERVER2';
testServer2 = cmo.createServer('TestServer2');
testServer2.setListenPort(7003);
testServer2.setAutoRestart(true);
testServer2.setAutoKillIfFailed(true);
testServer2.setRestartMax(2);
testServer2.setRestartDelaySeconds(10);
testServer2.getServerStart().setJavaHome(jvmlocation);
testServer2.getServerStart().setJavaVendor('Oracle');
testServer2.getServerStart().setArguments('-jrockit -Xms1024m -Xmx1024m -Xns256m -Xgcprio:pausetime -XpauseTarget:200ms');

print 'ADD MANAGED SERVERS TO CLUSTER';
testServer1.setCluster(cluster);
testServer2.setCluster(cluster);

print 'ADD MANAGED SERVERS TO MACHINE';
testServer1.setMachine(machine);
testServer2.setMachine(machine);

print 'CONFIGURE MIGRATION SERVICE';
cluster.setMigrationBasis('consensus')
cluster.setAdditionalAutoMigrationAttempts(3)
cluster.setMillisToSleepBetweenAutoMigrationAttempts(180000)
cluster.getDatabaseLessLeasingBasis().setMemberDiscoveryTimeout(30);
cluster.getDatabaseLessLeasingBasis().setLeaderHeartbeatPeriod(10);
candidatemachines = cluster.getCandidateMachinesForMigratableServers();
candidatemachines.append(machine);
cd('/Clusters/TestCluster');
set('CandidateMachinesForMigratableServers',candidatemachines);
cd('/');

print 'CONFIGURE MIGRATABLE TARGETS';
migratabletargets = cmo.getMigratableTargets();
for migratabletarget in migratabletargets:
	migratabletarget.setMigrationPolicy('exactly-once');
	cd('/MigratableTargets/' + migratabletarget.getName());
	set('ConstrainedCandidateServers',jarray.array([ObjectName('com.bea:Name=TestServer1,Type=Server'), ObjectName('com.bea:Name=TestServer2,Type=Server')], ObjectName))
	cd('/');

migratabletargetserver = migratabletargets[0];

print 'CREATE FILESTORE';
filestore = cmo.createFileStore('FileStore');
filestore.setDirectory('/home/oracle/bea/deploy');
targets = filestore.getTargets();
targets.append(migratabletargetserver);
filestore.setTargets(targets);

print 'CREATE JMS SERVER';
jmsserver = cmo.createJMSServer('JMSServer');
jmsserver.setPersistentStore(filestore);
jmsserver.setTargets(targets);

print 'CREATE PATH SERVICE';
pathservice = cmo.createPathService('PathService');
pathservice.setPersistentStore(filestore);
pathservice.setTargets(targets);

targets.remove(migratabletargetserver);
targets.append(cluster);

print 'CREATE JMS SYSTEM MODULE';
module = cmo.createJMSSystemResource('SystemModule');
module.setTargets(targets);

targets.remove(cluster);
targets.append(jmsserver);

print 'CREATE SUBDEPLOYMENT';
module.createSubDeployment('SubDeployment');
subdeployment = module.lookupSubDeployment('SubDeployment');
subdeployment.setTargets(targets);

resource = module.getJMSResource();

print 'CREATE CONNECTION FACTORY';
resource.createConnectionFactory('ConnectionFactory');
connectionfactory = resource.lookupConnectionFactory('ConnectionFactory');
connectionfactory.setJNDIName('jms/ConnectionFactory');
connectionfactory.setDefaultTargetingEnabled(true);
connectionfactory.getDefaultDeliveryParams().setDefaultUnitOfOrder('.System');
connectionfactory.getTransactionParams().setTransactionTimeout(3600);
connectionfactory.getTransactionParams().setXAConnectionFactoryEnabled(true);

print 'CREATE UNIFORM DISTRIBUTED QUEUE';
resource.createUniformDistributedQueue('DistributedQueue');
distributedqueue = resource.lookupUniformDistributedQueue('DistributedQueue');
distributedqueue.setJNDIName('jms/CompanyQueue');
distributedqueue.setLoadBalancingPolicy('Round-Robin');
distributedqueue.setSubDeploymentName('SubDeployment');
distributedqueue.setUnitOfOrderRouting('PathService');

targets.remove(jmsserver);
targets.append(cluster);

print 'CREATE DATA SOURCE';
datasource = cmo.createJDBCSystemResource('DataSource');
datasource.setTargets(targets);
jdbcResource = datasource.getJDBCResource();
jdbcResource.setName('DataSource');
names = ['jdbc/exampleDS'];
dataSourceParams = jdbcResource.getJDBCDataSourceParams();
dataSourceParams.setJNDINames(names);
dataSourceParams.setGlobalTransactionsProtocol('LoggingLastResource');
driverParams = jdbcResource.getJDBCDriverParams();
driverParams.setUrl('jdbc:oracle:thin:@hostname:1521:SID');
driverParams.setDriverName('oracle.jdbc.OracleDriver');
driverParams.setPassword('password');
driverProperties = driverParams.getProperties();
driverProperties.createProperty('user');
userProperty = driverProperties.lookupProperty('user');
userProperty.setValue('username');
connectionPoolParams = jdbcResource.getJDBCConnectionPoolParams();
connectionPoolParams.setTestTableName('SQL SELECT 1 FROM DUAL');
connectionPoolParams.setConnectionCreationRetryFrequencySeconds(100);

print 'SAVE AND ACTIVATE CHANGES';
save();
activate(block='true');

print 'START EDIT MODE';
edit();
startEdit();

print 'CONFIGURE AUTOMATIC JTA MIGRATION';
cd('/Servers/TestServer1/JTAMigratableTarget/TestServer1');
set('ConstrainedCandidateServers',jarray.array([ObjectName('com.bea:Name=TestServer1,Type=Server'), ObjectName('com.bea:Name=TestServer2,Type=Server')], ObjectName));
cmo.setMigrationPolicy('failure-recovery');
cd('/');
cd('/Servers/TestServer2/JTAMigratableTarget/TestServer2');
set('ConstrainedCandidateServers',jarray.array([ObjectName('com.bea:Name=TestServer1,Type=Server'), ObjectName('com.bea:Name=TestServer2,Type=Server')], ObjectName));
cmo.setMigrationPolicy('failure-recovery');
cd('/');

print 'SAVE AND ACTIVATE CHANGES';
save();
activate(block='true');

To start up the environment we use the following script:

beahome = '/home/oracle/weblogic';
pathseparator = '/';

adminusername = 'weblogic';
adminpassword = 'magic11g';
domainname = 'base_domain';

domainlocation = beahome + pathseparator + 'user_projects' + pathseparator + 'domains' + pathseparator + domainname;
nodemanagerhomelocation = beahome + pathseparator + 'wlserver_10.3' + pathseparator + 'common' + pathseparator + 'nodemanager';

print 'START NODE MANAGER';
startNodeManager(verbose='true', NodeManagerHome=nodemanagerhomelocation, ListenPort='5556', ListenAddress='localhost');

print 'CONNECT TO NODE MANAGER';
nmConnect(adminusername, adminpassword, 'localhost', '5556', domainname, domainlocation, 'ssl');

print 'START ADMIN SERVER';
nmStart('AdminServer');

print 'CONNECT TO ADMIN SERVER';
connect(adminusername, adminpassword);

print 'START CLUSTER';
start('TestCluster','Cluster');

To stop the environment we can use the following script:

beahome = '/home/oracle/weblogic';
pathseparator = '/';

adminusername = 'weblogic';
adminpassword = 'magic11g';
domainname = 'base_domain';

domainlocation = beahome + pathseparator + 'user_projects' + pathseparator + 'domains' + pathseparator + domainname;

print 'CONNECT TO NODE MANAGER';
nmConnect(adminusername, adminpassword, 'localhost', '5556', domainname, domainlocation, 'ssl');

print 'CONNECT TO ADMIN SERVER';
connect(adminusername, adminpassword);

print 'STOPPING CLUSTER';
shutdown('TestCluster','Cluster','true',1000,'true');

print 'STOPPING ADMIN SERVER';
shutdown('AdminServer','Server','true',1000,'true');

print 'STOPPING NODE MANAGER';
stopNodeManager();

A remark is in order. When using a uniform distributed queue, WebLogic creates the necessary members on the JMS servers to which the uniform distributed queue is targeted. In our case the uniform distributed queue is targeted to only one JMS server, so in order to be highly available the JMS server needs to be migrated to the other server. Normally, we would set up a JMS server on every managed server. When we do this, it is also possible to use Auto-Migrate Failure Recovery Services as the automatic service migration policy, i.e., when one managed server fails, the JMS environment continues to function without the failed server's JMS service because the other members are still available.

The application that we are going to use for testing was presented in the Test application section of the Fun with JRockit post. To load balance requests across the cluster, we use Oracle HTTP Server. More information can be found in the Load Balancing section of the Setting-up Web-Tier Components in a SOA Environment post. To communicate with WebLogic Server we can use the web server plug-in mod_wl_ohs. To configure mod_wl_ohs, open the mod_wl_ohs.conf file and add the following contents:

LoadModule weblogic_module   "${ORACLE_HOME}/ohs/modules/mod_wl_ohs.so"

NameVirtualHost 172.31.0.108:7777

<VirtualHost 172.31.0.108:7777>

	<IfModule weblogic_module>
		WebLogicCluster 172.31.0.108:7002,172.31.0.108:7003
		ConnectTimeoutSecs 10
		ConnectRetrySecs 2
		DebugConfigInfo ON
		WLSocketTimeoutSecs 2
		WLIOTimeoutSecs 300
		Idempotent ON
		FileCaching ON
		KeepAliveSecs 20
		KeepAliveEnabled ON
		DynamicServerList ON
		WLProxySSL OFF
	</IfModule>

	<Location /LoadTest>
		SetHandler weblogic-handler
	</Location>
</VirtualHost>

To get an overview of the configuration we can use the following URL: http://172.31.0.108:7777/LoadTest/?__WebLogicBridgeConfig.

We use the following Grinder script to create some requests:

from net.grinder.script.Grinder import grinder
from net.grinder.script import Test
from net.grinder.plugin.http import HTTPRequest

test1 = Test(1, "Request resource")
request1 = test1.wrap(HTTPRequest())

class TestRunner:
    def __call__(self):
        result = request1.GET("http://172.31.0.108:7777/LoadTest/testservlet")

During the test, use the admin console to shut down a managed server and see service migration in action. First check the current server of the JMS server: click Services, Messaging, JMS Servers and check the Current Server column. To shut down the current server, click Environment, Servers. Click the Control tab, select the current server, click Shutdown and choose Force Shutdown Now. Check the current server of the JMS server again to see whether it has indeed migrated to the other managed server.

An alternative to service migration is to migrate the entire server instance to another machine, i.e., whole server migration.

Whole server migration

Service migration provides a great framework for ensuring availability of critical services during failure conditions. Note that service migration does not change the fact that one or more servers in the cluster failed and are not available to process requests. If the cluster is not redundant enough to handle such failures gracefully, applications could experience service level degradation until the failed managed servers are restarted. In these cases it is desirable to restart managed servers on another machine to limit service level degradation.

Before we configure whole server migration, we need to know the requirements:

  • The migratable server candidate machines have to be in the same subnet (because the virtual IP address must be valid on each candidate machine). Whole server migration uses a virtual IP address for each migratable server.
  • On each candidate machine, the node manager must be initialized with the security-related files it needs to authenticate and accept commands from the admin server.
  • The node manager is used to migrate the virtual (floating) IP address and assign it to the target machine. Note that the default configuration assumes that the machines are similar, i.e.,
    • The netmask associated with the virtual IP is the same on each candidate machine.
    • The network device (interface) name (for example, eth0 on Linux) is the same on each candidate machine.
    • The functional behavior of the platform-specific OS command used to add and remove the virtual IP (for example, ifconfig on Linux) is the same.
  • Migratable servers cannot define any network channels that use a Listen Address different from the virtual IP address associated with the server. If servers must use multiple network channels associated with multiple IP addresses, whole server migration cannot be used as only migration of a single virtual IP address for each migratable server is supported.
  • Server-specific state must be shared through some highly available sharing mechanism, i.e., the default persistent stores where the XA transaction logs are kept must be accessible on each candidate machine.

The following presents the steps needed to set up automatic whole server migration (another example is given here):

  • Create managed servers and assign a virtual IP address to each managed server that will have migration enabled, i.e., edit the Listen Address attribute.
  • Create a cluster and add the created managed servers to the cluster.
  • Create machines and add the managed servers to the machines.
  • Configure the node manager for each candidate machine:
    • When using the Java-based node manager, edit the nodemanager.properties file (this file is created in the ${WL_HOME}/common/nodemanager directory the first time the node manager is started):
      • Set the NetMask property to the netmask associated with the virtual IP addresses being used, for example NetMask=255.255.255.0.
      • Set the Interface property to the network device name with which to associate the virtual IP address, for example Interface=eth0.
    • When using the SSH version of the node manager (that can be used on UNIX-type platforms only), edit the wlscontrol.sh script and set the NetMask and Interface properties (look for the Interface=${WLS_Interface:-""} and NetMask=${WLS_NetMask:-""} entries). The wlscontrol.sh script is located in the ${WL_HOME}/common/bin directory. The machines that host migratable targets must trust each other, i.e., it must be possible to get a shell prompt from a certain machine by using, for example, ssh 172.31.0.108 without a password.
  • Verify the domain and node manager configuration:
    • Start up the admin server (by using ${DOMAIN_HOME}/startWebLogic.sh) and the node managers (by using ${WL_HOME}/server/bin/startNodeManager.sh) on each candidate machine. Use the admin console to start each clustered managed server. This ensures that the node managers and servers are properly configured and also initializes the node managers with the password files they need to accept commands from the admin server.
  • Configure candidate machines: Use the cluster’s Migration Configuration tab and select the Candidate Machines For Migratable Servers.
  • Configure leasing: Use the cluster’s Migration Configuration tab and set the Migration Basis to either Database or Consensus.
    • In the case of database leasing:
      • Create the leasing table.
      • Create a data source to access the leasing table.
      • On the cluster’s Migration Configuration tab, set the Data Source for Auto Migration attribute to the data source.
      • Additionally edit the Auto Migration Table Name attribute and set it to the name of the leasing table.
    • In the case of consensus leasing:
      • Make sure that the node manager is configured on each machine that hosts managed servers in the cluster.
  • Grant superuser privileges to the wlsifconfig.sh script, located in the ${WL_HOME}/common/bin directory, that is set up to use sudo by default.
    • Grant sudo privilege for the WebLogic user (for example oracle) with a no password restriction and grant privilege to the /sbin/ifconfig and /sbin/arping binaries.
    • Edit the /etc/sudoers file and add the following entry: oracle ALL=NOPASSWD: /sbin/ifconfig,/sbin/arping.
  • Ensure that the PATH variable includes the directories that contain the following files so that the node managers can locate them:
    • wlsifconfig.sh – ${WL_HOME}/common/bin
    • wlscontrol.sh – ${WL_HOME}/common/bin
    • nodemanager.domains – ${WL_HOME}/common/nodemanager
  • Enable automatic server migration: on the managed server’s Migration Configuration tab, select the Automatic Server Migration Enabled check box.
  • Set the candidate machines for server migration: on the managed server’s Migration Configuration tab, select the Candidate Machines. Note that each managed server can have a different set of candidate machines.
  • Restart the affected servers.
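
A forgotten NetMask or Interface property in nodemanager.properties is a common reason for a failed migration, so it can help to check the file on each candidate machine. Below is a minimal plain-Python sketch that parses the text of such a file and reports which of the two properties are missing or empty. The function names are assumptions for this example, and the parser handles only simple key=value lines.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def missing_migration_properties(text):
    """Return the names of required properties that are absent or empty."""
    props = parse_properties(text)
    return [name for name in ("NetMask", "Interface") if not props.get(name)]


sample = """
ListenPort=5556
NetMask=255.255.255.0
Interface=eth0
"""
print(missing_migration_properties(sample))
```

In practice we would read the text from ${WL_HOME}/common/nodemanager/nodemanager.properties on every candidate machine; an empty list means both properties are set.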

Now that we have whole server migration in place, it needs to be tested.
Testing whole server migration is tricky, so it is handy to add -Dweblogic.debug.DebugServerMigration=true to the Java command line to enable debugging.
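
When the servers are started through the node manager, the Java command line comes from the server start arguments, such as the string passed to getServerStart().setArguments(...) in script.py. The following plain-Python sketch appends the debug flag to such an argument string without adding it twice; the helper name with_migration_debug is an assumption for this example.

```python
DEBUG_FLAG = "-Dweblogic.debug.DebugServerMigration=true"


def with_migration_debug(arguments):
    """Return the argument string with the debug flag appended once."""
    parts = arguments.split()
    if DEBUG_FLAG not in parts:
        parts.append(DEBUG_FLAG)
    return " ".join(parts)


print(with_migration_debug("-jrockit -Xms1024m -Xmx1024m"))
```

The resulting string can then be used as the new value of the server start arguments.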

References

[1] Patrick, et al., “Professional Oracle WebLogic Server”, Wiley Publishing, Inc., Indianapolis, Indiana, 2010. The first sentence in the introduction of this book goes like this “Professional Oracle WebLogic Server is different from other books about WebLogic Server…”. This book has a good learning value for people who want to understand not just how things can be done, but also want to know the why behind the things that they have done.
[2] Service Migration.
[3] Whole Server Migration.
[4] Configure migratable targets for JMS-related services.
[5] Setting up WebLogic Clusters.