Zühlke Meetup - Mai 2017

IoT data: more and faster is not automatically better. Dr. Boris Adryan, Head of IoT & Data Analytics, @BorisAdryan

Transcript of Zühlke Meetup - Mai 2017

Page 1: Zühlke Meetup - Mai 2017

IoT data: more and faster is not automatically better

Dr. Boris Adryan Head of IoT & Data Analytics

@BorisAdryan

Page 2: Zühlke Meetup - Mai 2017

The following 4 figures are from: final report of the Industrie 4.0 working group (Abschlussbericht Arbeitskreis Industrie 4.0)

Page 3: Zühlke Meetup - Mai 2017
Page 4: Zühlke Meetup - Mai 2017

Vertical integration: along the entire value chain

Horizontal integration: networked production system

Page 5: Zühlke Meetup - Mai 2017
Page 6: Zühlke Meetup - Mai 2017

• Internet connection

• data integration

• collective analysis

• responsiveness

from: Technical Foundations of IoT

almost incidental

this is what defines the IoT!

Page 7: Zühlke Meetup - Mai 2017

IoT cost expectations

many sensors + complicated analytics + expensive infrastructure = IoT has little benefit

“…because my data scientist said the more the better”

Page 8: Zühlke Meetup - Mai 2017

39% of survey participants are worried about the cost of an industrial IoT solution.

“Why aren’t you doing IoT?”

Page 9: Zühlke Meetup - Mai 2017

peanuts: “a spoonful”

How many peanuts is that on average?

[Figure: scale from 0 to 100 peanuts with the “on average” value of 3 samples]

Page 10: Zühlke Meetup - Mai 2017

Do I get more peanuts at Maxie Eisen or at Logenhaus?

[Figure: scale from 0 to 100 peanuts with the “on average” values for Maxie Eisen and Logenhaus, 3 samples]

Page 11: Zühlke Meetup - Mai 2017

Do I get more peanuts at Maxie Eisen or at Logenhaus?

[Figure: scale from 0 to 100 peanuts with the “on average” values for Maxie Eisen and Logenhaus, 4 samples]

Page 12: Zühlke Meetup - Mai 2017

Do I get more peanuts at Maxie Eisen or at Logenhaus?

statistical power through large numbers of samples

[Figure: scale from 0 to 100 peanuts with the “on average” values and their deviation for Maxie Eisen and Logenhaus, n samples]
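The effect of the number of samples on the reliability of an average can be sketched with a small simulation. This is only an illustration: it assumes peanut counts per spoonful are roughly normally distributed, and the mean (50) and standard deviation (15) are made-up values, not data from the talk.

```python
import random
import statistics

def spread_of_sample_mean(true_mean, sd, n, repeats=2000, seed=1):
    """Simulate `repeats` surveys of n spoonfuls each; return the spread of their averages."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(true_mean, sd) for _ in range(n))
        for _ in range(repeats)
    ]
    return statistics.stdev(means)

# Hypothetical venue: 50 peanuts per spoonful on average, standard deviation 15.
for n in (3, 4, 30, 300):
    spread = spread_of_sample_mean(true_mean=50, sd=15, n=n)
    print(f"n={n:>3}: the average varies by about {spread:.1f} peanuts between surveys")
```

With 3 or 4 samples the average itself is too uncertain to separate two venues that differ by a few peanuts; the uncertainty shrinks roughly with the square root of the number of samples.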

Page 13: Zühlke Meetup - Mai 2017

Statisticians and data scientists LOVE larger sample sizes!

…but if sampling costs time and resources, we need a compromise.

Page 14: Zühlke Meetup - Mai 2017

Zühlke Data Analytics Framework

Page 15: Zühlke Meetup - Mai 2017

Sampling strategy

precision and accuracy that can be achieved theoretically

precision and accuracy that is needed to get a job done

accurate and precise

not accurate, but precise

accurate, not precise

not what you want
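The distinction between accuracy and precision can be made concrete with two numbers: the bias of the readings (accuracy) and their spread (precision). The sketch below uses invented readings from three hypothetical sensors against a hypothetical reference value of 20.0.

```python
import statistics

def accuracy_and_precision(readings, true_value):
    """Return (bias, spread): bias measures accuracy, spread measures precision."""
    bias = statistics.fmean(readings) - true_value
    spread = statistics.stdev(readings)
    return bias, spread

TRUE_VALUE = 20.0  # hypothetical reference value, e.g. degrees Celsius
SENSORS = {
    "accurate and precise": [19.9, 20.1, 20.0, 20.1, 19.9],
    "not accurate, but precise": [22.0, 22.1, 21.9, 22.0, 22.1],
    "accurate, not precise": [17.5, 22.4, 20.3, 18.1, 21.7],
}

for label, readings in SENSORS.items():
    bias, spread = accuracy_and_precision(readings, TRUE_VALUE)
    print(f"{label}: bias={bias:+.2f}, spread={spread:.2f}")
```

Whether a given bias or spread is acceptable depends on the job to be done, which is the point of the comparison on this slide.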

Page 16: Zühlke Meetup - Mai 2017

Sweetening IoT for your customer

A few recommendations from the trenches:

• how to cut down on hardware costs

• how to cut down on software costs

many sensors + complicated analytics + expensive infrastructure = IoT has little benefit

less

reasonable

Page 17: Zühlke Meetup - Mai 2017

IoT - is it worth it?

The upgrade of a ‘dumb’ asset to a ‘smart’ asset is an investment.

time, money

Page 18: Zühlke Meetup - Mai 2017

Asset monitoring

base

Monday

Wednesday

Traditional process

• small maintenance task (if needed)

• weekly site visits to all assets

• two independent tours

• time to reach asset is main contributor to cost

• traffic-dependent

Page 19: Zühlke Meetup - Mai 2017

Data sources

Let’s assume the future isn’t going to be much different than the past…

• log from past site visits: approx. likelihood for maintenance

• a collection of traffic data that’s somewhat representative

Page 20: Zühlke Meetup - Mai 2017

Log from previous visits

Monday tours

Wednesday tours

Page 21: Zühlke Meetup - Mai 2017

Maintenance likelihood

• test for dependency between Monday and Wednesday tours: none

• test for dependency within tours: none

The assumption of temporal uniformity is reasonable.
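The talk does not say which test was used. One common choice for this kind of question is a chi-square test of independence on a contingency table, sketched here for a 2x2 table. The visit counts are invented for illustration.

```python
def chi_square_2x2(table):
    """Chi-square statistic for a 2x2 contingency table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic

# Hypothetical log: rows = maintenance needed on Monday (yes/no),
# columns = maintenance needed at the same asset on Wednesday (yes/no).
VISITS = [[12, 38], [36, 114]]
CRITICAL_5_PERCENT = 3.841  # chi-square critical value for 1 degree of freedom

stat = chi_square_2x2(VISITS)
print(f"chi-square = {stat:.3f}, dependency detected: {stat > CRITICAL_5_PERCENT}")
```

A statistic below the critical value means the data give no reason to reject independence, which is the outcome summarised as "none" on the slide.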

Page 22: Zühlke Meetup - Mai 2017

Monte Carlo simulations

[Figure: default tour from base; the assets are labelled p1(need today) to p6(need today)]

patterns for a demand-driven tour

‘cost function’: sum of edges
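A Monte Carlo comparison of a default tour with a demand-driven tour could look roughly like the sketch below. Asset positions, the per-asset probabilities and the use of straight-line distances are all assumptions made for illustration; the talk uses travel times between assets as edge weights.

```python
import math
import random

BASE = (0.0, 0.0)
# Hypothetical asset positions (km) in default-tour order, and p(need today) per asset.
ASSETS = [(2, 1), (4, 3), (5, 0), (3, -2), (1, -3), (-2, -1)]
P_NEED = [0.2, 0.1, 0.3, 0.15, 0.25, 0.1]

def tour_cost(stops):
    """'Cost function': sum of edge lengths for base -> stops -> base."""
    path = [BASE, *stops, BASE]
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))

def simulate_demand_driven(days=10_000, seed=42):
    """Average daily cost when only assets that need maintenance are visited."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(days):
        # Draw today's pattern: which assets need a visit?
        needed = [pos for pos, p in zip(ASSETS, P_NEED) if rng.random() < p]
        total += tour_cost(needed)
    return total / days

default_cost = tour_cost(ASSETS)
demand_cost = simulate_demand_driven()
print(f"default tour: {default_cost:.1f} km, demand-driven: {demand_cost:.1f} km on average")
```

The sketch keeps the visiting order of the default tour and simply skips assets without demand; a real study would re-optimise the route for each pattern.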

Page 23: Zühlke Meetup - Mai 2017

Travelling salesman problem

what’s the most reasonable tour from base to base, visiting all assets?

heuristic search is good enough, but requires a distance matrix
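The talk does not name the heuristic that was used. As one simple example of a heuristic search over a distance matrix, here is a nearest-neighbour sketch; the matrix values are invented, and index 0 stands for the base.

```python
def nearest_neighbour_tour(dist, start=0):
    """Greedy TSP heuristic: always travel to the closest unvisited node, then return."""
    unvisited = set(range(len(dist))) - {start}
    tour = [start]
    while unvisited:
        here = tour[-1]
        nxt = min(unvisited, key=lambda node: dist[here][node])
        tour.append(nxt)
        unvisited.remove(nxt)
    tour.append(start)  # close the loop back at the base
    return tour

def tour_length(dist, tour):
    """Sum of the edge weights along the tour."""
    return sum(dist[a][b] for a, b in zip(tour, tour[1:]))

# Hypothetical travel times in minutes; index 0 is the base.
DIST = [
    [0, 10, 15, 20],
    [10, 0, 35, 25],
    [15, 35, 0, 30],
    [20, 25, 30, 0],
]

tour = nearest_neighbour_tour(DIST)
print(tour, tour_length(DIST, tour))
```

Nearest neighbour does not guarantee the shortest tour, but it is fast and needs nothing beyond the distance matrix.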

Page 24: Zühlke Meetup - Mai 2017

Traffic harvesting

• based on Google API

• generate a distribution of travel times for each edge in the graph, dependent on time of day (weekdays only)
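Once travel times have been harvested, turning them into a distribution per edge and time of day is a grouping exercise. The sketch below assumes the samples are already available as (edge, hour, minutes) tuples; it makes no API calls, and the sample values are invented.

```python
from collections import defaultdict
import statistics

# Hypothetical harvested samples: (edge, hour of day, travel time in minutes).
SAMPLES = [
    (("base", "A"), 8, 22.0), (("base", "A"), 8, 27.0), (("base", "A"), 8, 25.0),
    (("base", "A"), 14, 15.0), (("base", "A"), 14, 17.0), (("base", "A"), 14, 16.0),
]

def travel_time_distributions(samples):
    """Group samples by (edge, hour) and summarise each group as (mean, stdev)."""
    grouped = defaultdict(list)
    for edge, hour, minutes in samples:
        grouped[(edge, hour)].append(minutes)
    return {
        key: (statistics.fmean(values), statistics.stdev(values))
        for key, values in grouped.items()
        if len(values) >= 2  # stdev needs at least two observations
    }

for (edge, hour), (mean, sd) in travel_time_distributions(SAMPLES).items():
    print(f"{edge[0]} -> {edge[1]} at {hour:02d}:00: {mean:.1f} +/- {sd:.1f} min")
```

A simulation can then draw a travel time for each edge from the distribution that matches the time of day of the tour.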

Page 25: Zühlke Meetup - Mai 2017

IoT - is it worth it?

[Figure: two plots of cost against weeks, annotated “awaiting confirmation!”]
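The question of whether the investment pays off can be framed as a break-even calculation over weeks. The figures below are purely hypothetical and not taken from the talk, whose own results are marked as awaiting confirmation.

```python
import math

def break_even_week(upfront, weekly_cost_smart, weekly_cost_traditional):
    """First week in which the cumulative cost of the smart solution is no longer higher."""
    saving_per_week = weekly_cost_traditional - weekly_cost_smart
    if saving_per_week <= 0:
        return None  # the investment never pays off
    return math.ceil(upfront / saving_per_week)

# Hypothetical figures: one-off upgrade cost and weekly operating costs.
week = break_even_week(upfront=12_000, weekly_cost_smart=300, weekly_cost_traditional=800)
print(f"break-even after {week} weeks")
```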

Page 26: Zühlke Meetup - Mai 2017

Westminster Parking Trial

https://www.westminster.gov.uk/new-trial-improve-conditions-disabled-drivers

IoT solution

Service company

~750 independent parking lots with a total of >3,500 individual spaces

access to

Page 27: Zühlke Meetup - Mai 2017

Humans don’t scale that well…

labour: expensive

sensor: cheap

While the cost of the sensors is falling (and follows Moore’s Law), digging them in and out for deployment and maintenance is a significant cost factor.

Page 28: Zühlke Meetup - Mai 2017

Can we learn an optimal deployment and sampling pattern?

• sampling rate of 5-10 min

• data over 2 weeks in May 2015

• overall 2.6 million data points

Can we make customers’ budget go further by

• reducing the number of sensors in a geographic area?

• lowering the sampling rate for better battery life?
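One way to explore the sampling-rate question is to thin out an existing high-resolution series and measure how much information is lost. The sketch below uses an invented occupancy series for a single 4-space lot and a simple "hold the last value" reconstruction; both are assumptions for illustration.

```python
import statistics

def downsample(series, factor):
    """Keep every `factor`-th reading, as a sensor with a lower sampling rate would."""
    return series[::factor]

def hold_last_value(sparse, factor, length):
    """Rebuild a full-rate series by repeating each sparse reading `factor` times."""
    return [sparse[i // factor] for i in range(length)]

# Hypothetical occupancy of one 4-space lot, one reading every 5 minutes (2 hours).
occupancy = [0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 3, 3, 3, 2, 2, 2, 1, 1, 0]

for factor in (1, 2, 6):  # 5, 10 and 30 minute sampling
    rebuilt = hold_last_value(downsample(occupancy, factor), factor, len(occupancy))
    error = statistics.fmean(abs(a - b) for a, b in zip(occupancy, rebuilt))
    print(f"every {5 * factor:>2} min: mean absolute error {error:.2f} spaces")
```

If the error at a coarser rate is still acceptable for the use case, the sensor can sample less often and save battery.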

Page 29: Zühlke Meetup - Mai 2017

A quick glimpse into the raw data

Page 30: Zühlke Meetup - Mai 2017

Correlation and clustering

[Figure: three example plots labelled “correlated”, “anti-correlated” and “independent”]

[Figure: example items for clustering: lorry, coach, car, bike, skateboard]

hierarchical clustering on the basis of a feature matrix
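Correlation between occupancy profiles and a basic hierarchical clustering can be sketched as follows. The profiles are invented, and the choice of single linkage with distance = 1 - correlation is an assumption; the talk only states that clustering was based on a feature matrix.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equally long series."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def single_linkage(profiles, threshold):
    """Agglomerative clustering: merge the two closest clusters until the
    closest pair is further apart than `threshold` (distance = 1 - correlation)."""
    clusters = [{name} for name in profiles]

    def gap(c1, c2):
        return min(1 - pearson(profiles[a], profiles[b]) for a in c1 for b in c2)

    while len(clusters) > 1:
        pairs = [(gap(c1, c2), i, j)
                 for i, c1 in enumerate(clusters)
                 for j, c2 in enumerate(clusters) if i < j]
        dist, i, j = min(pairs)
        if dist > threshold:
            break
        clusters[i] |= clusters.pop(j)  # i < j, so index i is unaffected by the pop
    return clusters

# Hypothetical occupancy profiles (occupied spaces at six times of day).
PROFILES = {
    "lot_a": [0, 1, 3, 5, 4, 2],
    "lot_b": [1, 2, 4, 6, 5, 3],  # rises and falls with lot_a: "correlated"
    "lot_c": [5, 4, 2, 0, 1, 3],  # mirror image of lot_a: "anti-correlated"
    "lot_d": [3, 1, 3, 2, 3, 2],  # no clear relation to lot_a: "independent"
}

print(single_linkage(PROFILES, threshold=0.5))
```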

Page 31: Zühlke Meetup - Mai 2017

Good news: temporal occupancy pattern roughly predicts neighbours

lots in Southampton

lots around the corner of each other

750 parking lots

Page 32: Zühlke Meetup - Mai 2017

A caveat: Is a high degree of correlation a function of parking lot size?

finding two lots of 20 spaces that correlate

finding two lots of 3 spaces that correlate

[Figure: occupancy over the day (0:00 to 23:59) for both cases, labelled “more likely” and “less likely”]

Page 33: Zühlke Meetup - Mai 2017

Bootstrapping in DBSCAN clusters

Simulation: Swap the occupancy vectors between parking lots of similar size and test per grid cell if these lots still correlate

Page 34: Zühlke Meetup - Mai 2017

What makes a good spatial cluster?

Page 35: Zühlke Meetup - Mai 2017

Density-Based Spatial Clustering of Applications with Noise (DBSCAN)

https://en.wikipedia.org/wiki/DBSCAN#/media/File:DBSCAN-Illustration.svg

2 parameters:

epsilon (distance)

minPoints (in cluster)

A - core points

B, C - border points

N - noise point
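To show how the two parameters interact, here is a compact, unoptimised DBSCAN sketch in plain Python. The coordinates and parameter values are invented; a real analysis would normally use an established library implementation.

```python
import math

def dbscan(points, eps, min_points):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    def neighbours(i):
        # All points within eps of point i, including point i itself.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_points:
            labels[i] = -1  # noise for now; may later turn out to be a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: reachable, but not a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_points:  # j is a core point: keep expanding
                queue.extend(reachable)
    return labels

# Hypothetical parking lot coordinates: two groups and one isolated lot.
LOTS = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (5, 5)]
print(dbscan(LOTS, eps=1.5, min_points=3))
```

A larger epsilon or a smaller minPoints produces fewer, bigger clusters and fewer noise points.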

Page 36: Zühlke Meetup - Mai 2017

Stratification strategy

3 lots with cc > 0.5

2 spaces 4 spaces 4 spaces

Test:

1. Take occupancy profile of ONE random 2-space parking lot and TWO random 4-space parking lots.

2. Determine cc.

3. Repeat n times and get a cc distribution for that parking lot combination.
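The three steps above can be sketched as a resampling loop. The sketch assumes that "cc" for a group of three lots means the mean of the pairwise Pearson correlation coefficients, and it uses randomly generated occupancy profiles as stand-ins for real data; both are assumptions for illustration.

```python
import itertools
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient; 0.0 if one series is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def mean_pairwise_cc(profiles):
    """Average correlation over all pairs of profiles."""
    pairs = list(itertools.combinations(profiles, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)

def null_cc_distribution(lots_by_size, sizes, n=1000, seed=7):
    """Steps 1-3: draw random lots of the given sizes n times and record their cc."""
    rng = random.Random(seed)
    wanted = {size: sizes.count(size) for size in sorted(set(sizes))}
    result = []
    for _ in range(n):
        drawn = []
        for size, how_many in wanted.items():
            drawn += rng.sample(lots_by_size[size], how_many)
        result.append(mean_pairwise_cc(drawn))
    return result

# Hypothetical pool: 20 random 24-point occupancy profiles per lot size.
pool_rng = random.Random(0)
LOTS_BY_SIZE = {
    size: [[pool_rng.randint(0, size) for _ in range(24)] for _ in range(20)]
    for size in (2, 4)
}

null = null_cc_distribution(LOTS_BY_SIZE, sizes=[2, 4, 4])
observed = 0.5  # the cc threshold mentioned on the slide
share = sum(cc >= observed for cc in null) / len(null)
print(f"share of random combinations with cc >= {observed}: {share:.3f}")
```

If random lots of the same sizes rarely reach the observed cc, the correlation within the spatial cluster is unlikely to be an artefact of lot size.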

Page 37: Zühlke Meetup - Mai 2017

Combining stats with street knowledge

Page 38: Zühlke Meetup - Mai 2017

Suggested technology for trials

A temporary survey would have allowed us to make the same recommendation, including the insight that the provided 5-minute resolution is probably not required.

Page 39: Zühlke Meetup - Mai 2017

Sweetening IoT for your customer

A few recommendations from the trenches:

• how to cut down on hardware costs

• how to cut down on software costs

many sensors + complicated analytics + expensive infrastructure = IoT has little benefit

less

reasonable

Page 40: Zühlke Meetup - Mai 2017

My current pet hate: Deep Learning

Deep learning has delivered impressive results mimicking human reasoning, strategic thinking and creativity.

At the same time, big players have released libraries such that even ‘script kiddies’ can apply deep learning.

It’s already leading to the uncritical use of deep learning where other methods would be more appropriate.

Page 41: Zühlke Meetup - Mai 2017

“I need to do real-time analytics!”

microseconds to seconds

seconds to minutes

minutes to hours

hours to weeks

on device

on stream

in batch

am I falling? counteract

battery level: should I land?

how many times did I stall?

what’s the best weather for flying?

in process

in database

operational insight

performance insight

strategic insight

e.g. Kalman filter

e.g. with machine learning

e.g. rules engine

e.g. summary stats
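As an illustration of the kind of lightweight method that fits on a device, here is a scalar Kalman filter sketch for smoothing noisy sensor readings. The random-walk model, the variances and the altitude readings are assumptions chosen for illustration.

```python
def kalman_1d(measurements, process_var, measurement_var, initial=0.0, initial_var=1.0):
    """Scalar Kalman filter for a slowly drifting value (random-walk model)."""
    estimate, variance = initial, initial_var
    estimates = []
    for z in measurements:
        variance += process_var                          # predict: uncertainty grows
        gain = variance / (variance + measurement_var)   # how much to trust the reading
        estimate += gain * (z - estimate)                # update with the innovation
        variance *= 1 - gain                             # uncertainty shrinks after update
        estimates.append(estimate)
    return estimates

# Hypothetical noisy altitude readings (metres) from a drone hovering near 10 m.
readings = [10.4, 9.7, 10.2, 9.5, 10.3, 9.9, 10.1, 9.8]
smoothed = kalman_1d(readings, process_var=0.01, measurement_var=0.25, initial=readings[0])
print([round(v, 2) for v in smoothed])
```

The filter needs only a few arithmetic operations and two stored numbers per reading, which is why it suits the microsecond-to-second tier.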

Page 42: Zühlke Meetup - Mai 2017

Can IoT ever be real-time?

zone 1: real-time [µs]

zone 2: real-time [ms]

zone 3: real-time [s]

Page 43: Zühlke Meetup - Mai 2017

Edge, fog and cloud computing

Edge
Pro:
- immediate compression from raw data to actionable information
- cuts down traffic
- fast response
Con:
- loses potentially valuable raw data
- developing analytics on embedded systems requires specialists
- compute costs valuable battery life

Cloud
Pro:
- compute power
- scalability
- familiarity for developers
- integration centre across all data sources
- cheapest ‘real-time’ option
Con:
- traffic

Fog
Pro:
- same as Edge
- closer to ‘normal’ development work
- gateways often mains-powered
Con:
- loses potentially valuable raw data
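The trade-off between traffic and raw data can be illustrated by comparing payload sizes. The sketch below summarises a window of readings on the edge and compares the size of the JSON payloads; the readings and the choice of summary fields are invented for illustration.

```python
import json
import statistics

def summarise_window(readings):
    """Edge-side compression: reduce a window of raw readings to summary statistics."""
    return {
        "n": len(readings),
        "min": min(readings),
        "mean": round(statistics.fmean(readings), 2),
        "max": max(readings),
    }

# Hypothetical raw window: 60 temperature readings, one per second.
raw = [20.0 + 0.1 * (i % 7) for i in range(60)]

raw_payload = json.dumps(raw)
edge_payload = json.dumps(summarise_window(raw))
print(len(raw_payload), "bytes raw vs", len(edge_payload), "bytes summarised")
```

The summary is much smaller, but the individual readings cannot be recovered from it, which is the "loses potentially valuable raw data" drawback listed for Edge and Fog.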

Page 44: Zühlke Meetup - Mai 2017

Some of our examples for real-time analytics

Choosing the appropriate method and toolset on every level.

Page 45: Zühlke Meetup - Mai 2017

Dr. Boris Adryan @BorisAdryan

Summary

‣ Preliminary surveys and data analysis can help to minimise the number of sensors and develop an optimal deployment strategy and sampling schedule.

‣ Super-fast analytics and state-of-the-art methods are not automatically the most useful solution.

‣ A good understanding of the type of insight that is required by the business model is essential.

Page 46: Zühlke Meetup - Mai 2017

mobile communications series

THE TECHNICAL FOUNDATIONS OF IoT

BORIS ADRYAN, DOMINIK OBERMAIER, PAUL FREMANTLE

ARTECH HOUSE

BOSTON | LONDON

www.artechhouse.com

This comprehensive resource presents a technical introduction to the components, architectures, software, and protocols of IoT. This book was designed specifically for those interested in researching, developing, and building IoT. The book covers the physics of electricity and electromagnetism, laying the foundation for understanding the components of modern electronics and computing. Readers learn about the fundamental properties of IoT, along with security and privacy issues related to developing and maintaining connected products.

From the launch of the Internet from ARPAnet in the 1960s, to recent connected gadgets, this book highlights the integration of IoT in various verticals such as industry, smart cities, connected vehicles, and smart and assisted living. Overall design patterns, issues with UX and UI, and different network topologies related to architectures of M2M and IoT solutions are explored. Hardware development, power, sensors, and embedded systems are discussed in detail. This book offers insight into the software components that impinge on IoT solutions, their development, network protocols, backend software, data analytics, and conceptual interoperability.

Boris Adryan is the head of IoT & Data Analytics at Zühlke Engineering (Germany) and the founder of thingslearn Ltd (UK). He holds a Ph.D. in genetics from the Max Planck Institute for Biophysical Chemistry, and led academic research as a Royal Society University Research Fellow at the University of Cambridge.

Dominik Obermaier is the cofounder and CTO at dc-square company, where he created the HiveMQ MQTT broker. He received his B.Sc. in computer science from the University of Applied Sciences Landshut.

Paul Fremantle cofounded WSO2, where he was instrumental in creating the Carbon middleware platform. He studied mathematics, philosophy and computing at Oxford University, gaining B.A. and M.Sc. degrees. He is currently pursuing his Ph.D. at the University of Portsmouth, focusing on security and privacy of IoT.


ISBN 13: 978-1-63081-025-2

ISBN: 1-63081-025-8

to be published in June or July