Cloud Event Processing - Analyze, Sense, Respond

Colin Clark


Why CEP in the Cloud Makes Sense

So What is Really Cool About CEP?

CEP (Complex Event Processing) isn’t really about low latency. The ability to do things quickly is important, just as in any system, especially systems that grow and need to handle a lot of information. Doing things quickly means doing things efficiently. And doing things efficiently means less money spent on hardware. Theoretically, anyway.

SO WHAT IS REALLY COOL ABOUT CEP?

CEP gives one the ability to submit queries like “select symbol, avg(shares) from trade_stream group by symbol over 5 minutes emit every 1 minute.” The CEP engine would consume this query and start returning the average number of shares per trade for each symbol over the last 5 minutes, updating that result every minute. Granted, this is a very simple query, but the point is that the queries are continuous: they’re submitted to the CEP server and they run until they’re told to stop. As the CEP engine continues to consume events, the queries keep running and producing results.
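To make the idea of a continuous query concrete, here is a minimal Python sketch of what an engine might do with the query above. It is not how any particular CEP product works; the class name ContinuousAvgQuery and its on_trade method are made up for illustration, timestamps are plain seconds, and trades are assumed to arrive in time order.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 5 * 60  # "over 5 minutes"
EMIT_SECONDS = 60        # "emit every 1 minute"


class ContinuousAvgQuery:
    """Hypothetical stand-in for a registered continuous query."""

    def __init__(self, window=WINDOW_SECONDS, emit_every=EMIT_SECONDS):
        self.window = window
        self.emit_every = emit_every
        self.trades = deque()  # (timestamp, symbol, shares), oldest first
        self.next_emit = None  # set when the first trade arrives

    def _averages(self, now):
        # Evict trades that have fallen out of the sliding window.
        while self.trades and self.trades[0][0] <= now - self.window:
            self.trades.popleft()
        # Group by symbol and compute avg(shares).
        totals = defaultdict(lambda: [0, 0])  # symbol -> [sum, count]
        for _, symbol, shares in self.trades:
            totals[symbol][0] += shares
            totals[symbol][1] += 1
        return {sym: total / count for sym, (total, count) in totals.items()}

    def on_trade(self, timestamp, symbol, shares):
        """Consume one trade; return every (emit_time, averages) now due."""
        if self.next_emit is None:
            self.next_emit = timestamp + self.emit_every
        results = []
        # Emit once per elapsed interval, before the new trade is added.
        while timestamp >= self.next_emit:
            results.append((self.next_emit, self._averages(self.next_emit)))
            self.next_emit += self.emit_every
        self.trades.append((timestamp, symbol, shares))
        return results


query = ContinuousAvgQuery()
for ts, sym, shares in [(0, "IBM", 100), (30, "IBM", 300), (60, "IBM", 200)]:
    for emit_time, averages in query.on_trade(ts, sym, shares):
        print(emit_time, averages)
```

The query object never finishes. It keeps its window of recent trades and produces a fresh result each minute for as long as events keep arriving, which is the "runs until told to stop" behavior described above.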

I ONLY WANT TO SEE WHAT I’M INTERESTED IN

So, if you were interested in various things, like when the sentiment around a certain theme hits a certain level on Twitter, or when that happens and a related stock also rises or falls in price and volume, you could submit those queries to the CEP server and get results back when those conditions occur. CEP engines also typically provide pattern matching capabilities, such as: if B happens within 5 minutes of A happening, let me know.
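The "B within 5 minutes of A" pattern can be sketched in a few lines of Python. Again, this is only an illustration under simple assumptions: the class name FollowedByWithin is hypothetical, events arrive in time order with timestamps in seconds, and each A can be matched by at most one B. Real engines offer much richer pattern languages.

```python
class FollowedByWithin:
    """Hypothetical matcher: fires when `second` follows `first` in time."""

    def __init__(self, first, second, within_seconds=5 * 60):
        self.first = first
        self.second = second
        self.within = within_seconds
        self.last_first_time = None  # time of the most recent unmatched A

    def on_event(self, timestamp, event_type):
        """Return True if this event completes the pattern."""
        if event_type == self.first:
            self.last_first_time = timestamp
            return False
        if event_type == self.second and self.last_first_time is not None:
            matched = timestamp - self.last_first_time <= self.within
            self.last_first_time = None  # this A is used up or has expired
            return matched
        return False  # unrelated event, or a B with no pending A


pattern = FollowedByWithin("A", "B")
for ts, kind in [(0, "A"), (100, "C"), (299, "B"), (310, "B")]:
    if pattern.on_event(ts, kind):
        print("B followed A within 5 minutes, at t =", ts)
```

Only the event that completes the pattern produces a notification. Everything else passes through silently, which is the "only see what I'm interested in" idea.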

RESOURCES AND MEMORY

If you’re querying a lot of data, or your time windows are large, you may need a lot of memory and a lot of CPU. Let’s paint a scenario where you’re looking at real-time sales from a lot of different stores, and you’d like to slice and dice that information by many dimensions, in real time, with CEP-based continuous queries. Great, that’s a perfect use case for CEP. But depending on how much data you’ve got, how much compute is required to roll everything up for analysis and subsequent drill-down, and how many users you’ve got running these queries, you might just run out of CPU or memory.

WAITER? ANOTHER ROUND OF FRESH RESOURCES PLEASE

My definition of ‘cloud’ includes elastic resources. That means when you need more storage, compute, etc., you ask for it and it arrives, almost magically. Using that new resource, you can expand your ability to perform some set of tasks. In the scenario above, we might add more compute if we added more users, more high velocity big data, or more (and more complex) queries. Adding virtual machines in the cloud is a perfect way to address this.

SO WHAT’S THE PROBLEM?

Well, CEP engines aren’t designed that way. For the most part, anyway. If you want this kind of ability, you’ve basically got to assemble all of it yourself, using a variety of vendors and products. Basic questions like “How do end users enter queries?” and “How are users notified when the things that they’re interested in occur?” typically involve multiple products from multiple vendors and very expensive professional services, either from the vendor or a third party. And here’s something else to consider: vendors selling software licenses don’t really want to build your system. Complex accounting rules don’t let software vendors realize license revenue until the project is complete and you’ve accepted the solution. Also, just because someone knows how to build a CEP engine doesn’t mean they know how to build the kind of system we’ve described above, with hundreds (maybe even thousands) of users, dynamic queries flying all over the place, and easy-to-use GUIs, or that they know anything about setting all of this up to use elastic resources. What happens if you go out and buy all of this hardware to support your solution and it flops? Well, you’re out the hardware costs then, aren’t you?

WHY AM I WRITING THIS?

In the near future, you’re going to start seeing a new style of deploying CEP-based applications: applications that incorporate streaming map/reduce functionality and RIA-based graphical front ends. These applications will allow hundreds of users to analyze high velocity streaming big data. And do it all very, very quickly. And do it in the cloud. All things that most CEP vendors would tell you are simply not possible. Except this vendor.

AS ALWAYS

Thanks for reading!

Colin Clark is the CTO for Cloud Event Processing, Inc. and is widely regarded as a thought leader and pioneer in both Complex Event Processing and its application within Capital Markets.

Follow Colin on Twitter at http:\\twitter.com\EventCloudPro to learn more about cloud based event processing using map/reduce, complex event processing, and event driven pattern matching agents. You can also send topic suggestions or questions to colin@cloudeventprocessing.com