TSAR (the TimeSeries AggregatoR), - how to count tens of billions of daily events in real time using open source technologies

12:45 PM - 01:15 PM on August 16, 2014, Room 702

Anirudh Todi

Audience level:
intermediate
Category:
Data Analysis
Watch:
http://youtu.be/XzHokP3NZu4

Description

Twitter depends heavily on real-time event aggregation. Classic timeseries applications include site traffic, service health, and user engagement monitoring; these are increasingly complemented by a range of products and features that surface aggregated timeseries data directly to end users. Services that power such features need to be resilient enough to ensure a consistent user experience, flexible enough to accommodate a rapidly changing product roadmap, and able to scale to tens of billions of events per day.

Experience has shown that truly robust real-time aggregation services are hard to build; scaling and evolving them gracefully is even harder; and, moreover, many timeseries applications call for essentially the same architecture, with slight variations in the data model. Solving this broad class of problems at Twitter has been a multiyear effort. In previous talks we have introduced Summingbird, a high-level abstraction library for generalized distributed computation, which provides an elegant descriptive framework for complex aggregation problems. In this talk, I will describe how we built a flexible, reusable, end-to-end service architecture on top of Summingbird, called TSAR (the TimeSeries AggregatoR).

TSAR uses Python to provide an service toolkit that integrates with essential services that provide data processing, data warehousing, query capability, observability, and alerting, automatically configuring and orchestrating its components.

Abstract

Twitter's 250 million users generate tens of billions of tweet views per day. Aggregating these events in real time - in a robust enough way to incorporate into our products - presents a massive scaling challenge. In this talk I'll introduce TSAR (the TimeSeries AggregatoR), a robust, flexible, and scalable service for real-time event aggregation designed to solve this problem and a range of similar ones. I'll discuss how we built TSAR using Python and Scala from the ground up, almost entirely on open-source technologies (Storm, Summingbird, Kafka, Aurora, and others), and describe some of the challenges we faced in scaling it to process tens of billions of events per day.