Quite often, we get caught up in the technical details of these discussions and lose sight of what it all really means.
If all you are looking for is to collect data streams and simply update counters, then both approaches would work. The main difference between the two is felt in the level and complexity of the processing that you would like to perform in real time. If you want to continuously update various forms of sorted lists or indexes, you'll find that doing so with an event-driven approach, as in the case of Twitter, can be orders of magnitude faster and more efficient than the logging-centric approach. To put some numbers behind that, Twitter reported that calculating reach without Storm took 2 hours, whereas Storm could do the same in less than a second.
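To make the contrast concrete, here is a minimal Python sketch of the two styles applied to a ranked list of event counts. It is an illustration only, not Storm's or Twitter's actual code: the names `batch_top_n`, `StreamingRanking`, and `on_event` are hypothetical, and the in-memory list stands in for whatever store a real system would use. The logging-centric version rescans the whole log on every query, while the event-driven version adjusts the sorted index as each event arrives.

```python
import bisect
from collections import Counter


def batch_top_n(log, n):
    """Logging-centric sketch: rescan the entire log on every query."""
    counts = Counter(log)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:n]


class StreamingRanking:
    """Event-driven sketch: keep a sorted index up to date per event."""

    def __init__(self):
        self.counts = {}   # key -> current count
        self.ranked = []   # sorted list of (-count, key) tuples

    def on_event(self, key):
        old = self.counts.get(key, 0)
        if old:
            # Locate and remove the key's previous position in the index.
            i = bisect.bisect_left(self.ranked, (-old, key))
            del self.ranked[i]
        self.counts[key] = old + 1
        # Reinsert at the position matching the new count.
        bisect.insort(self.ranked, (-(old + 1), key))

    def top(self, n):
        # The index is already sorted, so a query is just a slice.
        return [(key, -neg) for neg, key in self.ranked[:n]]


events = ["a", "b", "a", "c", "b", "a"]

ranking = StreamingRanking()
for e in events:
    ranking.on_event(e)

print(ranking.top(2))
print(batch_top_n(events, 2))
```

Both calls return the same ranking. The difference is where the work happens: the batch query costs time proportional to the full log each time it runs, while the streaming version pays a small cost per event and answers queries from the already sorted index.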
Such a difference in speed and utilization has a direct bearing on the business bottom line, as it determines the level and depth of intelligence that the business can run against its data. It also determines the cost of running the analytics systems and, in some cases, the availability of those systems. When processing is slower, a larger number of scenarios can saturate the system.