When we set up a system that needs to be highly available, it is good to first analyze how the environment is structured. That is, look at which components in the environment can be treated differently, so that we can set up a highly available environment in a cost-effective way.

Say we have an environment in which we use JMS and a data source in one transaction (for the transactions we will use JTA). We have certain options in this environment. Say, for example, it is decided that a message must be processed no matter what. In this case we have to set up a JMS environment that can persist its messages in case of failures. To persist messages we have two options: persist them in a database or persist them on a file system. Note that in both cases the persistence environment must be highly available as well. On top of this, we also want transactions to continue in the case of server failures. Note that in this case we are dealing with a very complex environment, i.e.,

  • The system must support global transactions between local and distributed resources such as JMS destinations and databases.
  • JMS messages are persisted to the file system and must support fail-over and be highly available.
  • Fail-over of both the node and any WebLogic Server instance specific functionality, such as JMS destinations and JTA transaction recovery, must take place transparently.
  • Distributed transactions must be recoverable and restarted in case of node or WebLogic Server failure.

Data, for example, is only one component of availability; all layers of the system must be available and resilient to failures, which usually means that we have to provide redundancy at all layers (shared storage, operating system, network).
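To get a feeling for why redundancy is needed at every layer, here is a minimal sketch of the usual availability arithmetic. It assumes that layers fail independently, and the per-layer figures are hypothetical numbers chosen only for illustration; the class and method names are made up for this example.

```java
import java.util.List;

final class AvailabilitySketch {

    // All layers must be up at the same time, so availabilities multiply.
    static double serial(List<Double> layers) {
        double availability = 1.0;
        for (double layer : layers) {
            availability *= layer;
        }
        return availability;
    }

    // n redundant copies of one layer, where a single surviving copy is enough.
    // Assumes the copies fail independently of each other.
    static double redundant(double single, int copies) {
        return 1.0 - Math.pow(1.0 - single, copies);
    }

    // Expected downtime in hours over a 365-day year.
    static double downtimeHoursPerYear(double availability) {
        return (1.0 - availability) * 365 * 24;
    }

    public static void main(String[] args) {
        // Hypothetical figures: storage, operating system and network at 99% each.
        double plain = serial(List.of(0.99, 0.99, 0.99));

        // The same three layers, each one duplicated.
        double layer = redundant(0.99, 2);
        double doubled = serial(List.of(layer, layer, layer));

        System.out.println("no redundancy: " + downtimeHoursPerYear(plain) + " h/year");
        System.out.println("two of each:   " + downtimeHoursPerYear(doubled) + " h/year");
    }
}
```

Under these assumptions a single weak layer pulls the whole chain down, which is why making only the data layer redundant does not help much.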

Usually it is possible to reduce the availability requirements for certain components in the environment, which can lead to a much simpler cluster, for example,

  • No disk-sharing requirements across the cluster.
  • Data storage does not require high availability.
  • Applications do not use, or participate in, XA transactions, because the transaction logs will typically require high availability and fail-over.
    • In certain situations last agent optimization may be an appropriate way to relax this constraint.
  • The applications do not use file-based JMS persistent messages because the JMS message stores will typically require high availability and fail-over.

When building (and deploying) an application we can also ask: are there parts that can be treated differently? For example, split the application into parts that need to be highly available and parts that need to be continuously available. In the case of highly available systems we can live with a downtime of a few hours. Continuously available systems, on the other hand, must be up and running again within minutes, which means we need easily manageable system components. When the application is data-centric, it pays off to analyze how data is managed. What type of data model do we need? Do we need a tabular model, i.e., pick data apart and re-assemble it in different ways for different purposes, or are there other requirements?

In the example above (the distinction between high and continuous availability), deploying the application as a whole leaves us with a very complex environment in which all the layers (storage, etcetera) must be treated as continuously available. So we looked at whether we could design the architecture in such a way that components can be treated differently. One way to do this is by looking at how data is used. Certain parts queried lots of data in various ways, but with very few transactions. These components are also not mission critical, so they only need to be highly available (we can live with a downtime of a few hours). The mission-critical component is highly transactional and just queries by id. In the latter case you can ask yourself: is a database the ideal solution? What we did is let the mission-critical component communicate with a data-grid (in this case Coherence) instead of the database.
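A component that only reads and writes by id needs little more than a key-value interface. The sketch below illustrates this with a hypothetical `BookingRepository` coded against `java.util.Map`. A `ConcurrentHashMap` stands in for the grid here; with Coherence the map would be a `NamedCache`, which also implements `java.util.Map`. The class name, the ids and the values are assumptions made for the example, not part of the original system.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

final class BookingRepository {

    // Stand-in for the data-grid; any java.util.Map implementation fits here.
    private final Map<String, String> grid;

    BookingRepository(Map<String, String> grid) {
        this.grid = grid;
    }

    // Store (or overwrite) a booking under its id.
    void save(String id, String booking) {
        grid.put(id, booking);
    }

    // Look a booking up by id; empty when the id is unknown.
    Optional<String> findById(String id) {
        return Optional.ofNullable(grid.get(id));
    }

    public static void main(String[] args) {
        BookingRepository repository = new BookingRepository(new ConcurrentHashMap<>());
        repository.save("b-1", "Amsterdam to Utrecht");

        System.out.println(repository.findById("b-1").orElse("not found"));
        System.out.println(repository.findById("b-2").orElse("not found"));
    }
}
```

Because the component depends only on the map interface, the choice of what sits behind it stays a deployment decision.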

Some time later we came across the term Polyglot Persistence and thought: hey, there is a name for what we are doing. The philosophy of Polyglot Persistence is to use multiple data storage technologies, chosen based on the way data is used by individual applications. By looking at component boundaries, we can choose a storage technology that fits the way each component manipulates its data.

When using, for example, a data-grid, other components that use the data must be able to deal with a paradigm called eventual consistency. For example, with a data-grid such as Coherence, we can set up a cache store that eventually puts the data into a data store. Other applications that use the data store must somehow deal with stale data, as new data is held in the data-grid and is written to the data store only when the time is ripe. In the case of a traffic control system that notifies passengers with travel information, this is not really a problem, but in a health monitoring system for hospitals it is. Thus, based on the system to be built, we can decide what type of persistence best fits our needs and also set up a cost-effective environment.
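The stale-data effect is easy to see in a small simulation. The following is a plain Java sketch of the write-behind idea, not the Coherence API: the class `WriteBehindSketch`, its methods and the sample data are all hypothetical, it is single-threaded, and the flush is triggered by hand where a real grid would do it after a configured delay.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

final class WriteBehindSketch {

    private final Map<String, String> cache = new HashMap<>();         // plays the data-grid
    private final Map<String, String> pending = new LinkedHashMap<>(); // writes not yet stored
    private final Map<String, String> dataStore;                       // plays the database

    WriteBehindSketch(Map<String, String> dataStore) {
        this.dataStore = dataStore;
    }

    // A write is visible in the cache at once, but only queued for the data store.
    void put(String key, String value) {
        cache.put(key, value);
        pending.put(key, value); // a later write to the same key replaces the queued one
    }

    // Reads go to the cache first and fall back to the data store.
    String get(String key) {
        String value = cache.get(key);
        return value != null ? value : dataStore.get(key);
    }

    // Push all queued writes to the data store; returns how many entries were written.
    int flush() {
        int written = pending.size();
        dataStore.putAll(pending);
        pending.clear();
        return written;
    }

    public static void main(String[] args) {
        Map<String, String> database = new HashMap<>();
        database.put("train-42", "on time");

        WriteBehindSketch grid = new WriteBehindSketch(database);
        grid.put("train-42", "delayed 10 min");

        // Another application reading the database directly still sees the old value.
        System.out.println(database.get("train-42")); // prints "on time"

        grid.flush();
        System.out.println(database.get("train-42")); // prints "delayed 10 min"
    }
}
```

Between `put` and `flush` the two views disagree, and that window is exactly what the other applications have to tolerate.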

Coherence Course

One downside of using, for example, a data-grid such as Coherence is that developers must learn a new interface. To help with this, I was asked to set up a course on Coherence. To let everybody in on something as powerful (and beautiful) as Coherence, I decided to make the material public. In the course you will learn the following:

  • Introduction
  • Object Serialization
  • Configuring the Cache
  • Querying the Cache
  • Cache-aside Architecture
  • Read-through / Write-behind Architecture
  • Cache Events and Parallel Processing

Note that this is a course you can do at your own pace. The exercises lead you step-by-step through the process of building applications that use Coherence. The material consists of the following:
