CEP, Distributed Caches, In-Memory Databases – these (as vendor supplied products) are all ‘newish’ technologies that have some rather interesting overlaps. In addition, there are aspects of these technologies that could definitely be used together for many problems going forward – but those are posts for another day.
The way I like to think about CEP is this: you have high volume/low latency inbound data that you would like to apply ‘rules’ to, and those rules spin out ‘actions’ whenever they match. These ‘actions’ can also be inbound flows into other sets of rules, and so on, forming EPNs (Event Processing Networks).
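To make the rules/actions idea concrete, here is a minimal sketch of a toy event processing network in plain Python. It is not any vendor's API: the names `Event`, `Rule`, and `run_network`, and the example "big_trade" and "page_ops" rules, are all made up for illustration. The only point is that an action can emit new events that flow back into other rules.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class Event:
    kind: str
    value: float


@dataclass
class Rule:
    name: str
    matches: Callable[[Event], bool]        # the 'rule' condition
    action: Callable[[Event], List[Event]]  # the 'action'; may emit new events


def run_network(rules: List[Rule], inbound: Iterable[Event]) -> List[Tuple[str, Event]]:
    """Feed events through every rule; events emitted by actions re-enter the queue."""
    queue = deque(inbound)
    fired = []
    while queue:
        event = queue.popleft()
        for rule in rules:
            if rule.matches(event):
                fired.append((rule.name, event))
                queue.extend(rule.action(event))
    return fired


# Hypothetical rules: a large trade raises an alert, and the alert feeds a second rule.
rules = [
    Rule("big_trade",
         lambda e: e.kind == "trade" and e.value > 1000,
         lambda e: [Event("alert", e.value)]),
    Rule("page_ops",
         lambda e: e.kind == "alert",
         lambda e: []),
]

fired = run_network(rules, [Event("trade", 50), Event("trade", 5000)])
fired_names = [name for name, _ in fired]
```

With the two inbound trades above, only the 5,000 trade matches "big_trade", and the alert it emits then matches "page_ops". A real CEP product adds the parts that matter at scale (adapters, windows, tightly managed threading), which this sketch deliberately ignores.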
There are certainly aspects of this that start to sound an awful lot like SOA/ESB at a high level. If SOA/ESB is the flow of information between systems and/or system groups, CEP is more micro: its value is at the more granular level. The emphasis (and the reward) is ‘high volume/low latency’.
Whenever someone uses words like high usage, high volume, low latency, etc., one should always ask for the values to be quantified. A while ago, I thought 8,000 concurrent users on one of my web products was ‘high usage’; then I was handed a different project whose initial number of concurrent users was 250,000. It’s all a matter of scale. ‘High volume’ with regard to CEP technology two years ago meant processing >65,000 events/sec. Nowadays it can mean processing millions of events/sec. Average latencies can be in the microseconds, and it is common to have SLAs (guaranteed maximum latencies) in the single-digit milliseconds.
To process that volume through a single server requires architecting things differently from the general trend of the past decade. For a large body of problems, extensive threading is the way to go. Whether you talk about a high number of connections per server, pooling, multi-threaded processes, etc., the predominant coding model today is lots of threads. The one area where too many threads hurts is when you lose too much time on context switching between the threads. For a much smaller class of problems, limiting the number of threads is better for overall throughput. Processing business logic on 65,000 events/sec on a single server is such an example.
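A quick back-of-the-envelope calculation shows why context switching matters at this rate. The sketch below works out the average time available per event; the function names are my own, and the 5 microsecond context-switch cost is purely an assumed figure for illustration, not a measurement.

```python
def per_event_budget_us(events_per_sec: int, workers: int = 1) -> float:
    """Average microseconds each worker thread can spend on one event."""
    return workers * 1_000_000 / events_per_sec


def fraction_lost_to_switching(events_per_sec: int, switch_cost_us: float) -> float:
    """Share of the single-thread budget consumed if every event paid one context switch."""
    return switch_cost_us / per_event_budget_us(events_per_sec)


budget = per_event_budget_us(65_000)  # roughly 15.4 microseconds per event

# ASSUMPTION: 5 microseconds per context switch, chosen only to illustrate the effect.
lost = fraction_lost_to_switching(65_000, switch_cost_us=5.0)  # about a third of the budget
```

At 65,000 events/sec a single thread has only about 15 microseconds per event, so even a few microseconds of switching overhead per event eats a large share of the budget. At a few hundred events per second the same overhead would be lost in the noise, which is why the lots-of-threads model works fine for most other problems.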
CEP vendor solutions start from the basis that you want to minimize and tightly control the threading in your system. Part of the value (and ROI) is not having to code all the infrastructure needed to do so. The other part of the ROI is the obvious one: easily defining inbound adapters for different data sources, business logic in rules, and user-defined actions based on rule hits.
So in short, if some part of your system would benefit from ingesting tens of thousands of events per second or more on a single server, from defining sets of business logic quickly (especially if you may want to change those rules later), and from performing actions based on rule hits, then take a look at CEP.