
Post on 13-Oct-2020


Ariel Rabkin Princeton University

asrabkin@cs.princeton.edu

Aggregation and Degradation in JetStream: Streaming Analytics in the Wide Area

Work done with Matvey Arye, Siddhartha Sen, Vivek S. Pai, and Michael J. Freedman

Today’s Analytics Architectures

- Backhaul is inefficient and inflexible

Systems shown: MillWheel (Google), Storm

Tomorrow’s Architecture: JetStream

- Backhaul is inefficient and inflexible
- Goal: optimize use of WAN links by exposing them to the streaming system.

JetStream

Backhaul is Intrinsically Inefficient

[Figure: bandwidth vs. time (two days). Curves: bandwidth available; bandwidth needed for backhaul.]

- Buyer's remorse: wasted bandwidth
- Analyst's remorse: system overload or missing data
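
The two kinds of remorse can be made concrete with a small calculation. The sketch below is illustrative only: the demand numbers and the function name `provisioning_remorse` are made up for this example and do not come from the talk.

```python
from typing import Sequence, Tuple


def provisioning_remorse(demand_mbps: Sequence[float],
                         capacity_mbps: float) -> Tuple[float, float]:
    """Return (wasted, unmet) bandwidth summed over all time steps.

    wasted: capacity that was paid for but not used (buyer's remorse).
    unmet:  demand above capacity, i.e. overload or missing data
            (analyst's remorse).
    """
    wasted = sum(max(0.0, capacity_mbps - d) for d in demand_mbps)
    unmet = sum(max(0.0, d - capacity_mbps) for d in demand_mbps)
    return wasted, unmet


# Hypothetical backhaul demand over five time steps (Mbit/s).
demand = [100, 300, 800, 400, 200]

# Provision for the peak: nothing is lost, but most capacity sits idle.
peak_wasted, peak_unmet = provisioning_remorse(demand, 800)

# Provision near the typical load: little waste, but the peak overflows.
lean_wasted, lean_unmet = provisioning_remorse(demand, 300)
```

With a fixed link and varying demand, no single capacity removes both terms, which is the motivation for adapting the data volume instead.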

Stream Processing Basics

Some Operators in JetStream:

- Filtering (count > 100)
- Sampling (drop 90% of data)
- Image Compression
- Quantiles (95th percentile)
- Query stored data

[Figure: dataflow across Site A, Site B, and Site C. Labels: Input Data, Stream Operators.]
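
A minimal sketch of two of the operators listed above, written as Python generators. This is not JetStream's API: the record shape `(url, count)` and the names `filter_op` and `sample_op` are assumptions made for illustration.

```python
import random
from typing import Callable, Iterable, Iterator, Tuple

Record = Tuple[str, int]  # hypothetical record shape: (url, count)


def filter_op(records: Iterable[Record],
              predicate: Callable[[Record], bool]) -> Iterator[Record]:
    """Pass through only the records that satisfy the predicate."""
    for record in records:
        if predicate(record):
            yield record


def sample_op(records: Iterable[Record], keep_fraction: float,
              rng: random.Random) -> Iterator[Record]:
    """Keep each record independently with probability keep_fraction."""
    for record in records:
        if rng.random() < keep_fraction:
            yield record


records = [("/a", 250), ("/b", 40), ("/c", 101), ("/d", 100)]

# Filtering (count > 100): keeps "/a" and "/c" only.
popular = list(filter_op(records, lambda r: r[1] > 100))

# Sampling (drop 90% of data): keeps roughly one record in ten.
sampled = list(sample_op(records * 1000, 0.10, random.Random(0)))
```

Because both operators consume and produce iterators, they can be chained, which mirrors how operators are composed into a dataflow.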

The JetStream System

What: Streaming with aggregation and degradation as first-class primitives

Where: Storage and processing at edge

Why: Maximize goodput using aggregation and degradation

How: Data cubes and feedback control


An Example Query

How popular is every URL?

[Figure: request streams arriving at CDN nodes.]

Mechanism 1: Storage with Aggregation

Every minute, compute request counts by URL.

[Figure: request streams at each CDN node feed Local Aggregation and Storage.]
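
The per-minute count can be sketched as follows. The input format, `(timestamp_in_seconds, url)` pairs, and the function names are assumptions for this example, not part of JetStream.

```python
from collections import Counter
from typing import Iterable, Tuple


def minute_bucket(ts_seconds: int) -> int:
    """Round a timestamp down to the start of its minute."""
    return ts_seconds - ts_seconds % 60


def aggregate_requests(requests: Iterable[Tuple[int, str]]) -> Counter:
    """Count requests per (minute, url) cell."""
    counts: Counter = Counter()
    for ts, url in requests:
        counts[(minute_bucket(ts), url)] += 1
    return counts


# Hypothetical request log: (timestamp in seconds, URL).
requests = [(0, "/a"), (10, "/a"), (59, "/b"), (60, "/a"), (125, "/b")]
counts = aggregate_requests(requests)
```

Five raw requests collapse into four cells here; on real traffic with many repeated URLs per minute the reduction would be much larger, and only the cells need to cross the wide-area link.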

Mechanism 2: Adaptive Degradation

Every minute, compute request counts by URL.

[Figure: request streams at each CDN node feed Local Aggregation and Storage, followed by Adjustable Filtering.]

Requirements for Storage Abstraction

- Updatable (locally and incrementally): Stored Data += Data
- Mergeable (without accuracy penalty): Data + Data = Merged Representation
- Data size is reducible (with predictable accuracy cost)
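
The first two requirements can be illustrated with a count store. This is a sketch under the assumption that the aggregate is a simple sum; the class name `CountStore` and its methods are hypothetical.

```python
from collections import Counter
from typing import Hashable


class CountStore:
    """Toy storage abstraction holding summed counts per key."""

    def __init__(self) -> None:
        self.cells: Counter = Counter()

    def update(self, key: Hashable, n: int = 1) -> None:
        """Updatable: Stored Data += Data, one record at a time."""
        self.cells[key] += n

    def merge(self, other: "CountStore") -> "CountStore":
        """Mergeable: combine two stores without losing accuracy."""
        merged = CountStore()
        merged.cells.update(self.cells)   # Counter.update adds counts
        merged.cells.update(other.cells)
        return merged


site_a, site_b, everything = CountStore(), CountStore(), CountStore()
for url in ["/a", "/a", "/b"]:
    site_a.update(url)
    everything.update(url)
for url in ["/a", "/c"]:
    site_b.update(url)
    everything.update(url)

merged = site_a.merge(site_b)
```

Merging the two per-site stores gives the same cells as aggregating all raw records in one place, which is what "without accuracy penalty" means for sums. Aggregates such as exact quantiles do not have this property without a richer representation.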

The Data Cube Model

Cube: A multidimensional array, indexed by a set of dimensions, whose cells hold aggregates.

Counts by URL       12:00   12:01   12:02
www.mysite.com/a        3       5       0
www.mysite.com/b        0       2       0
www.yoursite.com        5       4       …
www.her-site.com        8      12       …

Cubes have an aggregation function: Agg(cell, cell) → cell

Aggregation used for:
- Updates
- Roll-ups
- Merging cubes
- Summarizing cubes

Cubes can be “Rolled Up”

Counts by URL       12:00   12:01   12:02
www.mysite.com/a        3       5       0
www.mysite.com/b        0       2       0
www.yoursite.com        5       4       …
www.her-site.com        8      12       …

Rolled up over time:

Counts by URL           *
www.mysite.com/a        8
www.mysite.com/b        2
www.yoursite.com        9
www.her-site.com       20

Rolled up over URL:

Counts by URL       12:00   12:01   12:02
*                      16      23       …
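
A roll-up can be sketched as summing a cube along the dimension being dropped. The cube below uses the cell values from the slide; cells shown as "…" are left out rather than guessed, so the 12:02 total is incomplete. The dictionary representation and the function name `roll_up` are assumptions for this example.

```python
from collections import defaultdict
from typing import Dict, Tuple

# Cells keyed by (url, minute). Cells elided on the slide are omitted.
cube: Dict[Tuple[str, str], int] = {
    ("www.mysite.com/a", "12:00"): 3,
    ("www.mysite.com/a", "12:01"): 5,
    ("www.mysite.com/a", "12:02"): 0,
    ("www.mysite.com/b", "12:00"): 0,
    ("www.mysite.com/b", "12:01"): 2,
    ("www.mysite.com/b", "12:02"): 0,
    ("www.yoursite.com", "12:00"): 5,
    ("www.yoursite.com", "12:01"): 4,
    ("www.her-site.com", "12:00"): 8,
    ("www.her-site.com", "12:01"): 12,
}


def roll_up(cells: Dict[Tuple[str, str], int], keep_dim: int) -> Dict[str, int]:
    """Sum over every dimension except keep_dim (0 = url, 1 = minute)."""
    out: Dict[str, int] = defaultdict(int)
    for key, value in cells.items():
        out[key[keep_dim]] += value
    return dict(out)


by_url = roll_up(cube, keep_dim=0)     # the "*" column over time
by_minute = roll_up(cube, keep_dim=1)  # the "*" row over URLs
```

The same summation serves as the aggregation function for updates and merges, which is why one abstraction covers all four uses.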

Cubes Unify Storage and Aggregation

[Figure labels: Stored Data; Update; Update sent downstream; Standing Query; One-off query; Feedback control.]

Degradation: The Big Picture

[Figure: local data passes through dataflow operators, producing summarized or approximated data that is sent over the network to further dataflow operators.]

- Level of degradation is auto-tuned to match bandwidth.
- Challenge: supporting mergeability and flexible policies

Mergeability Imposes Constraints

- Insight: Degradation may be discontinuous

Window boundaries at different aggregation periods:

Every 10:   01-10  11-20  21-30
Every 5:    01-05  06-10  11-15  16-20  21-25  26-30
Every 6:    01-06  07-12  13-18  19-24  25-30
Every 5:    02-06  07-11  12-16  17-21  22-26  27-31

Merged output:

Every 30??  01-30
??????
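
One way to read the window diagram: a fine-grained stream can be rolled up into a coarser one only when every coarse window is an exact union of fine windows. The check below is a sketch of that reading, not JetStream's algorithm; the function names are hypothetical, and window starts use the slide's 1-based labels.

```python
from math import gcd


def can_roll_up(fine_period: int, fine_start: int,
                coarse_period: int, coarse_start: int) -> bool:
    """True if each coarse window is an exact union of fine windows.

    This needs the coarse period to be a multiple of the fine period
    and the window boundaries to line up.
    """
    return (coarse_period % fine_period == 0
            and (coarse_start - fine_start) % fine_period == 0)


def common_period(a: int, b: int) -> int:
    """Smallest period that both a and b divide (least common multiple)."""
    return a * b // gcd(a, b)


five_to_ten = can_roll_up(5, 1, 10, 1)        # 01-05 + 06-10 = 01-10
five_to_six = can_roll_up(5, 1, 6, 1)         # boundaries do not line up
six_to_thirty = can_roll_up(6, 1, 30, 1)      # five windows of 6 = 01-30
shifted_to_thirty = can_roll_up(5, 2, 30, 1)  # 02-06, ... never hits 01-30

# Streams at "every 10" and "every 6" first share a boundary at 30.
merge_period = common_period(10, 6)
```

Under this reading, the usable degradation levels are a discrete set of mutually compatible periods, so the data volume changes in jumps rather than smoothly.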

There Are Many Ways to Degrade Data

- Can coarsen a dimension
- Can drop low-rank values

[Figure: savings from aggregation (log scale, 1 to 256) vs. aggregation time period (5 s, minute, 5 min, hour, day). Series: Domains.]

Coarsening Does Not Always Help

[Figure: savings from aggregation (log scale, 1 to 256) vs. aggregation time period (5 s, minute, 5 min, hour, day). Series: Domains, URLs.]

Degradations Have Trade-offs

Name                      Fixed BW savings          Fixed accuracy cost  Parameter
Dim. coarsening           Usually no                Yes                  Dimension scale
Drop values (locally)     Yes                       No                   Cut-off
Drop values (globally)    No, multi-round protocol  Yes                  Cut-off
Audiovisual downsampling  Yes                       Yes                  Sample rate
Histogram coarsening      Yes                       Yes                  Number of buckets
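
The "Drop values (locally)" row can be illustrated with a rank cut-off. The sketch assumes the cut-off is a rank (keep the top k values), which is one possible reading of the table; the function names and both data sets are made up.

```python
from collections import Counter
from typing import Dict


def drop_low_rank(counts: Dict[str, int], keep: int) -> Dict[str, int]:
    """Keep only the `keep` highest-count entries."""
    return dict(Counter(counts).most_common(keep))


def retained_fraction(original: Dict[str, int],
                      degraded: Dict[str, int]) -> float:
    """Fraction of the total count that survives degradation."""
    return sum(degraded.values()) / sum(original.values())


# Two hypothetical inputs with the same number of keys.
skewed = {"/a": 90, "/b": 5, "/c": 3, "/d": 2}
flat = {"/a": 25, "/b": 25, "/c": 25, "/d": 25}

skewed_top = drop_low_rank(skewed, keep=2)
flat_top = drop_low_rank(flat, keep=2)

skewed_kept = retained_fraction(skewed, skewed_top)
flat_kept = retained_fraction(flat, flat_top)
```

Both outputs have exactly two entries, so the bandwidth saving is fixed by the cut-off, while the fraction of the total count retained depends on how skewed the data is. That matches the "Yes / No" pattern in the table row.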

A Simple Idea that Does Not Work

- We have sensors that report congestion...
- Have operators read the sensor and adjust themselves?

[Figure: incoming data passes through a Coarsening Operator; sampled data is sent to the network. Sensor reading: "Sending 4x too much".]

Policy: Increase the aggregation period up to 10 sec. If insufficient, use sampling.

Challenge: Composite Policies

- Chaos if two operators are simultaneously responding to the same sensor

[Figure: incoming data passes through a Coarsening Operator and a Sampling Operator before reaching the network. Sensor reading: "Sending 4x too much".]

Interfacing with Operators

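[Figure: incoming data passes through a Coarsening Operator and a Sampling Operator before reaching the network; a Controller sits between the sensor and the operators.]

- Sensor to controller: "Sending 4x too much"
- Operator to controller: "Shrinking data by 50%. Possible levels: [0%, 50%, 75%, 95%, …]"
- Controller to operator: "Go to level 75%"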
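
The exchange on this slide can be sketched as a controller that picks a level for each operator in priority order. This is an illustration, not JetStream's controller: the class names are hypothetical, and it assumes that each level is the fraction of data removed, that levels are below 100%, and that reductions from successive operators multiply.

```python
from typing import List, Sequence

EPS = 1e-9  # tolerance for floating-point comparisons


def choose_level(levels: Sequence[float], needed: float) -> float:
    """Smallest advertised level that removes at least `needed` of the data."""
    for level in sorted(levels):
        if level + EPS >= needed:
            return level
    return max(levels)  # best effort if no level is sufficient


class Operator:
    def __init__(self, name: str, levels: Sequence[float]) -> None:
        self.name = name
        self.levels = list(levels)  # possible levels, each in [0, 1)
        self.level = 0.0            # current fraction of data removed


class Controller:
    """Single decision point, so operators never react independently."""

    def __init__(self, operators: List[Operator]) -> None:
        self.operators = operators  # earlier operators are degraded first

    def on_congestion(self, overload_factor: float) -> None:
        target_keep = min(1.0, 1.0 / overload_factor)
        keep = 1.0  # fraction still sent after the operators set so far
        for op in self.operators:
            needed = max(0.0, 1.0 - target_keep / keep)
            op.level = choose_level(op.levels, needed)
            keep *= 1.0 - op.level


# One operator, "sending 4x too much": the controller asks for level 75%.
sampler = Operator("sampling", [0.0, 0.5, 0.75, 0.95])
Controller([sampler]).on_congestion(4.0)

# Composite policy: coarsen first (assumed to top out at 50%), then sample.
coarsen = Operator("coarsening", [0.0, 0.5])
sample = Operator("sampling", [0.0, 0.5, 0.75, 0.95])
Controller([coarsen, sample]).on_congestion(4.0)
```

Sending 4x too much means keeping one quarter of the data, hence the 75% level in the single-operator case. In the composite case the coarsening operator is exhausted at 50% and sampling supplies the remaining 50%, which also yields one quarter overall.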

Experimental Setup

80 nodes on the VICCI testbed at three sites (Seattle, Atlanta, and Germany)

Policy: Drop data if insufficient BW

[Figure label: Princeton]

Without Degradation

[Figure: BW (Mbits/sec, 0 to 800) vs. experiment time (0 to 140 minutes). Annotation: Drop BW.]

[Figure: latency (sec, 0 to 1000) vs. elapsed time (0 to 140 minutes). Series: Median Latency, 95th percentile latency, Maximum latency.]

Degradation Keeps Latency Bounded

[Figure: BW (Mbits/sec, 0 to 400) vs. experiment time (0 to 90 minutes). Annotation: Bandwidth Shaping.]

[Figure: latency (sec, 0 to 20) vs. elapsed time (0 to 90 minutes). Series: Median Latency, 95th percentile latency.]

[Figure, showing maximum latencies: latency (sec, 0 to 40) vs. elapsed time (0 to 90 minutes). Series: Median Latency, 95th percentile latency, Maximum Latency.]

Programming Ease

Scenario                     Lines of code
Slow requests                5
Requests by URL              5
Bandwidth by node            15
Bad referrers                16
Latency and size quantiles   25
Success by domain            30
Top 10 domains by period     40
Big Requests                 97

Conclusions and Future Work

- Useful to embed aggregation and degradation abstractions in streaming systems.
- Aggregation can be unified with storage.
- System must accommodate degradation semantics.
- Open questions:
  - How to guide users to the right degradation policy?
  - How to embed abstractions in a higher-level language?