Minimum High Availability
High availability protects against data loss when access to data resources and critical business applications is disrupted. Such disruptions can occur due to human error, hardware failure, or software problems.
WSO2 Stream Processor supports a deployment scenario focused on high availability (HA). When using WSO2 Stream Processor (SP) to process streaming events, there is always a chance that the node running the SP instance fails for one of the unpredictable reasons I mentioned earlier.
This would mean a loss of the events being streamed until the node could be restarted. The solution is to run WSO2 SP in a high availability (HA) environment, where the processing of events is not halted by an unexpected failure.
Deployment
A node is considered inactive when it fails to return a heartbeat pulse twice within the specified time interval. An inactive node is removed from the cluster; if it becomes active again and rejoins the cluster, it is assigned a new ID. The cluster itself is considered inactive when all of its nodes are inactive.
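As a sketch, this heartbeat behaviour is driven by the cluster coordination settings in `<SP_HOME>/conf/worker/deployment.yaml`. The keys below follow the SP 4.x layout, and the group ID and datasource name are illustrative assumptions; check them against your version of the documentation:

```yaml
cluster.config:
  enabled: true
  groupId: sp-ha-cluster        # assumed name; all nodes in the same cluster share this ID
  coordinationStrategyClass: org.wso2.carbon.cluster.coordinator.rdbms.RDBMSCoordinationStrategy
  strategyConfig:
    datasource: WSO2_CLUSTER_DB # assumed name of a shared datasource used for coordination
    heartbeatInterval: 1000     # send a heartbeat pulse every 1000 ms
    heartbeatMaxRetry: 2        # missing 2 consecutive pulses marks the node inactive
    eventPollingInterval: 1000
```

With `heartbeatMaxRetry: 2`, a node that misses two consecutive pulses is treated as inactive and removed from the cluster, matching the behaviour described above.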
HA Deployment
HA deployment for WSO2 SP requires a minimum of two nodes, which means we have to run two instances of WSO2 SP in parallel, as illustrated in the diagram below.
For this deployment, both SP nodes should be configured to receive all events. To achieve this, clients should publish every event to both nodes, and the user has to specify that events are duplicated in the cluster.
This can be done using one of the following strategies:
- Using a Siddhi application with a distributed sink to deliver duplicated events.
- If using WSO2 products, publishing events to both nodes via WSO2 Data Bridge.
- If using a message broker such as Kafka, having both nodes subscribe to the same topic so that each receives a copy of every event.
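As an illustrative sketch of the Kafka strategy, each SP node can run a Siddhi application whose Kafka source subscribes to the same topic. The stream definition, topic name, and group IDs here are hypothetical:

```
@App:name('SweetProductionApp')

-- Both nodes subscribe to the same topic; giving each node its own
-- group.id makes Kafka deliver every event to both of them.
@source(type = 'kafka',
        topic.list = 'sweet-events',
        group.id = 'sp-node-1',
        threading.option = 'single.thread',
        bootstrap.servers = 'localhost:9092',
        @map(type = 'json'))
define stream SweetProductionStream (name string, amount double);
```

On the second node, the same application would use a different `group.id` (e.g. `sp-node-2`), since Kafka consumers in the same group split a topic's partitions between them rather than each receiving every event.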
In this scenario, one SP node works in active mode and the other in passive mode. If the active node fails, the passive node becomes active and receives all the requests. When the failed node comes back up, it fetches the internal state of the currently active node by syncing with it.
The returning node then becomes the passive node and starts processing all incoming messages to keep its state synced with the active node, so that it can become active if the current active node fails.
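A minimal sketch of enabling this active/passive behaviour, again assuming the SP 4.x `deployment.yaml` layout, is to set the deployment type to `ha` on both nodes and enable state persistence so that internal state snapshots are available for syncing (the snapshot interval and location below are illustrative):

```yaml
deployment.config:
  type: ha    # run this node as part of a two-node active/passive HA pair

state.persistence:
  enabled: true
  intervalInMin: 1    # how often the Siddhi application state is snapshotted
  revisionsToKeep: 2
  persistenceStore: org.wso2.carbon.stream.processor.core.persistence.FileSystemPersistenceStore
  config:
    location: siddhi-app-persistence  # assumed directory for state snapshots
```

Both nodes carry the same configuration; which one acts as active and which as passive is decided at runtime by the cluster coordination logic, not in this file.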
(Full disclosure: I wrote this blog during my internship at WSO2 from November 2017 to April 2018)