Guido Schmutz | Trivadis
Event-Processing und Big Data kombiniert, geht das? (Combining event processing and Big Data: can it be done?)
2013 © Trivadis
BASEL BERN BRUGG LAUSANNE ZUERICH DUESSELDORF FRANKFURT A.M. FREIBURG I.BR. HAMBURG MUNICH STUTTGART VIENNA
24.02.2014 Event-Processing und Big Data kombiniert, geht das?
Guido Schmutz
Working for Trivadis for more than 16 years
Oracle ACE Director for Fusion Middleware and SOA
Co-Author of different books
Consultant, Trainer, Software Architect for Java, Oracle, SOA and EDA
Member of Trivadis Architecture Board
Technology Manager @ Trivadis
More than 20 years of software development experience
Contact: [email protected]
Blog: http://guidoschmutz.wordpress.com
Twitter: gschmutz
Our company
Trivadis is a market leader in IT consulting, system integration, solution engineering and the provision of IT services focusing on and technologies in Switzerland, Germany and Austria.
We offer our services in the following strategic business fields: [Figure: business fields; the only recoverable label is OPERATION]
Trivadis Services takes over the interacting operation of your IT systems.
With over 600 specialists and IT experts in your region
• 12 Trivadis branches and more than 600 employees
• 200 Service Level Agreements
• Over 4,000 training participants
• Research and development budget: CHF 5.0 / EUR 4 million
• Financially self-supporting and sustainably profitable
• Experience from more than 1,900 projects per year at over 800 customers
[Map of branches: Hamburg, Düsseldorf, Frankfurt, Freiburg, München, Wien, Basel, Zurich, Bern, Lausanne, Stuttgart, Brugg]
AGENDA
1. Big Data and Fast Data, what is it?
2. Motivation
3. The Lambda Architecture
4. Implementing the Lambda Architecture
5. Demo – Event Processing with Oracle OEP
6. Summary
Big Data Definition (4 Vs)
Characteristics of Big Data: its Volume, Velocity and Variety in combination
+ Time to action? Big Data + Event Processing = Fast Data
The world is changing …
The model of generating/consuming data has changed …
Old Model: few companies are generating data, all others are consuming data
New Model: all of us are generating data, and all of us are consuming data
Who is generating Big Data?
• Social media and networks (all of us are generating data)
• Scientific instruments (collecting all sorts of data)
• Mobile devices (tracking all objects all the time)
• Sensor technology and networks (measuring all kinds of data)
Progress and innovation are no longer hindered by the ability to collect data, but by the ability to manage, analyze, summarize, visualize and discover knowledge from the collected data in a timely manner and in a scalable fashion.
Internet Of Things – Sensors are/will be everywhere
There are more devices tapping into the internet than people on earth
How do we prepare our systems/architecture for the future?
Sources: Cisco; The Economist
Data as an Asset - Store Anything?
"But then data is just too valuable to delete! We must store anything!"
"Nonsense! Just store the data you know you need today!"
It depends … but Big Data technologies allow you to store the raw information from new data sources as well as existing ones, so that you can later use it to create new data-driven products that you would not have thought about today.
Big Data vs. Traditional Enterprise Data
§ Big Data is not just "a lot more enterprise data"
§ Big Data is usually states, events, transactions etc. – not master data
§ Big Data is commonly generated outside of traditional enterprise applications but needs to be associated with it
§ Big Data is often composed of un(evenly)structured information types that continually arrive in enormous amounts
§ Data / Information as an Asset!
2. Architecting (Big) Data Systems
What is a data system?
• A (data) system that manages the storage and querying of data with a lifetime measured in years encompassing every version of the application to ever exist, every hardware failure and every human mistake ever made.
• A data system answers questions based on information that was acquired in the past
How do we build (data) systems today – Today’s Architectures
Source of Truth is mutable!
• CRUD pattern
What is the problem with this?
• Lack of Human Fault Tolerance
• Potential loss of information/data
[Figure: applications (Mobile, Web, RIA, Rich Client) query a mutable database (RDBMS, NoSQL, NewSQL), which is the source of truth]
Problems in today’s architecture/systems
Lack of Human Fault Tolerance
Bugs will be deployed to production over the lifetime of a data system
Operational mistakes will be made
Humans are part of the overall system
• Just like hard disks, CPUs, memory, software
• Design for human error like you design for any other fault
Examples of human error
• Deploy a bug that increments counters by two instead of by one
• Accidentally delete data from database
• Accidental DOS on important internal service
Worst two consequences: data loss or data corruption
As long as an error doesn't lose or corrupt good data, you can fix what went wrong
Immutability vs. Mutability
The U and D in CRUD
A mutable system updates the current state of the world
Mutable systems inherently lack human fault-tolerance
Easy to corrupt or lose data
An immutable system captures historical records of events
Each event happens at a particular time and is always true
Immutability restricts the range of errors causing data loss/data corruption
Vastly more human fault-tolerant
Conclusion: Your source of truth should always be immutable
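The following is a minimal Python sketch of that conclusion, reusing the counter bug from the human-error examples. The event layout and all function names are illustrative assumptions, not something shown in the talk.

```python
# Hypothetical example: counting page views, once with a mutable
# source of truth and once with an immutable event log.

def buggy_increment(count):
    """The deployed bug: increments by two instead of by one."""
    return count + 2

def fixed_increment(count):
    """The corrected logic."""
    return count + 1

# Mutable source of truth: only the current state is stored (the U in CRUD).
mutable_counter = 0
for _ in range(3):
    mutable_counter = buggy_increment(mutable_counter)
# The stored value is now wrong, and the state alone does not tell us
# which updates were affected by the bug.

# Immutable source of truth: every event is appended, never updated.
event_log = []
for ts in (1, 2, 3):
    event_log.append({"type": "page_view", "ts": ts})

def compute_view(events, increment):
    """Derive the counter from scratch from the raw events."""
    count = 0
    for _ in events:
        count = increment(count)
    return count

corrupt_view = compute_view(event_log, buggy_increment)   # view built by the buggy code
repaired_view = compute_view(event_log, fixed_increment)  # recomputed after the fix
```

Because the raw events were never overwritten, fixing the bug and recomputing the view is enough to repair the result.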
A different kind of architecture with immutable source of truth
Instead of using our traditional approach … why not build data systems like this?
[Figure: applications (Mobile, Web, RIA, Rich Client) query views on the data; the views are derived from immutable data, which is the source of truth. Storage labels in the figure: HDFS, NoSQL, NewSQL, RDBMS]
How to create the views on the Immutable data?
On the fly?
Materialized, i.e. pre-computed?
[Figure: in the first option the query runs against a view computed directly over the immutable data; in the second option the query reads pre-computed views derived from the immutable data]
Big Data Processing - Batch
[Figure: events from Stream 1 and Stream 2 are appended to HDFS (Hadoop Distributed File System), a data store optimized for appending large results; queries run on the Hadoop cluster as Map/Reduce jobs written in Pig]
Big Data Processing – Batch
[Figure: Incoming Data -> Immutable data -> ? -> Batch View -> ? -> Query]
How to compute the batch views?
How to compute queries from the views?
Big Data Processing - Batch
Favorite Product List Changes (raw information => data):
1.2.13 Add iPAD 64GB
10.3.13 Add Sony RX-100
11.3.13 Add Canon GX-10
11.3.13 Remove Sony RX-100
12.3.13 Add Nikon S-100
14.4.13 Add BoseQC-15
15.4.13 Add MacBook Pro 15
20.4.13 Remove Canon GX-10
derive => Current Favorite Product List (information => derived):
iPAD 64GB
Nikon S-100
BoseQC-15
MacBook Pro 15
derive => Current Product Count: 4
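The derivation on this slide can be sketched in a few lines of Python. This is only an illustration: the tuple layout and the function name `current_favorites` are assumptions, and the product name is normalised to "Canon GX-10" so that the Remove event matches the earlier Add event.

```python
# Raw, immutable events from the slide: (date, action, product).
events = [
    ("1.2.13", "Add", "iPAD 64GB"),
    ("10.3.13", "Add", "Sony RX-100"),
    ("11.3.13", "Add", "Canon GX-10"),
    ("11.3.13", "Remove", "Sony RX-100"),
    ("12.3.13", "Add", "Nikon S-100"),
    ("14.4.13", "Add", "BoseQC-15"),
    ("15.4.13", "Add", "MacBook Pro 15"),
    ("20.4.13", "Remove", "Canon GX-10"),
]

def current_favorites(events):
    """Batch view: replay all events from scratch to get the current list."""
    favorites = []
    for _date, action, product in events:
        if action == "Add" and product not in favorites:
            favorites.append(product)
        elif action == "Remove" and product in favorites:
            favorites.remove(product)
    return favorites

favorite_list = current_favorites(events)  # first derived view
product_count = len(favorite_list)         # second view, derived from the first
```

The event list is never modified; both views can be thrown away and recomputed from it at any time.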
Big Data Processing - Batch
§ Using only batch processing always leaves you with a portion of non-processed data.
[Figure: two timelines ending at "now", with the labels: fully processed data, last full batch period, time for batch job, batch-processed data, non-processed data]
Adapted from Ted Dunning (March 2012): http://www.youtube.com/watch?v=7PcmbI5aC20
But we are not done yet …
Big Data Processing - Adding Real-Time
[Figure: incoming data goes to the immutable data, from which the batch views are computed, and to a data stream, from which the real-time views are computed (step marked "?"); the query reads from both kinds of views]
How to compute queries from the views?
How to compute real-time views?
Big Data Processing - Adding Real-Time
Favorite Product List Changes (immutable data):
1.2.13 Add iPAD 64GB
10.3.13 Add Sony RX-100
11.3.13 Add Canon GX-10
11.3.13 Remove Sony RX-100
12.3.13 Add Nikon S-100
14.4.13 Add BoseQC-15
15.4.13 Add MacBook Pro 15
20.4.13 Remove Canon GX-10
Stream of Favorite Product List Changes (incoming data stream):
Now Add Canon Scanner
compute => Current Favorite Product List (views):
iPAD 64GB
Nikon S-100
BoseQC-15
MacBook Pro 15
Canon Scanner (now)
compute => Current Product Count: 5
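A small Python sketch of the real-time step on this slide: instead of replaying all stored events, the event that just arrived on the stream is applied incrementally to the view the batch run produced. The function name `apply_event` is an assumption made for illustration.

```python
# Batch view as derived from the stored events (values from the slide).
batch_favorites = ["iPAD 64GB", "Nikon S-100", "BoseQC-15", "MacBook Pro 15"]

def apply_event(favorites, action, product):
    """Incrementally apply one streamed event; returns a new list."""
    updated = list(favorites)  # leave the batch view itself untouched
    if action == "Add" and product not in updated:
        updated.append(product)
    elif action == "Remove" and product in updated:
        updated.remove(product)
    return updated

# Event arriving "now", not yet covered by any batch run.
realtime_favorites = apply_event(batch_favorites, "Add", "Canon Scanner")
realtime_count = len(realtime_favorites)
```

The batch view stays as it was; the real-time view only has to cover what arrived since the last batch run.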
Big Data Processing - Batch & Real Time
[Figure: timeline ending at "now". Batch processing (e.g. Hadoop) worked fine for the fully processed data of the last full batch period; real-time processing works for the time taken by the batch job up to now; the end user sees a blended view]
Adapted from Ted Dunning (March 2012): http://www.youtube.com/watch?v=7PcmbI5aC20
3. The Lambda Architecture
Lambda Architecture
[Figure: Lambda Architecture. Incoming data flows to the batch layer (immutable data, batch view) and as a data stream to the speed layer (real-time view); the serving layer answers the query. The elements are marked with the letters A to G.]
Lambda Architecture
A. All data is sent to both the batch and speed layer
B. Master data set is an immutable, append-only set of data
C. Batch layer pre-computes query functions from scratch, result is called Batch Views. Batch layer constantly re-computes the batch views.
D. Batch views are indexed and stored in a scalable database to get particular values very quickly. Swaps in new batch views when they are available
E. Speed layer compensates for the high latency of updates to the Batch Views
F. Uses fast incremental algorithms and read/write databases to produce real-time views
G. Queries are resolved by getting results from both batch and real-time views
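Steps A to G can be sketched as a toy, single-process Python class. This is a simplified illustration under stated assumptions, not an implementation from the talk: the class name `LambdaSketch` and its methods are hypothetical, the "query function" is a plain count per key, and the batch run is treated as instantaneous, so the real-time view can simply be reset when a new batch view is swapped in. In a real system the records that arrive while the batch job is running would have to stay in the real-time view.

```python
from collections import Counter

class LambdaSketch:
    def __init__(self):
        self.master_dataset = []        # B: immutable, append-only data
        self.batch_view = Counter()     # C/D: pre-computed, recomputed from scratch
        self.realtime_view = Counter()  # F: updated incrementally

    def ingest(self, key):
        """A: every record is sent to both the batch and the speed layer."""
        self.master_dataset.append(key)  # batch layer only appends
        self.realtime_view[key] += 1     # speed layer updates incrementally

    def run_batch(self):
        """C/D: recompute the batch view from all data and swap it in."""
        self.batch_view = Counter(self.master_dataset)
        # E: the speed layer only has to cover data newer than the batch view
        self.realtime_view = Counter()

    def query(self, key):
        """G: resolve a query from both the batch and the real-time view."""
        return self.batch_view[key] + self.realtime_view[key]

system = LambdaSketch()
for tag in ["#sochi2014", "#teamcanada", "#sochi2014"]:
    system.ingest(tag)
before_batch = system.query("#sochi2014")  # served by the real-time view only
system.run_batch()
system.ingest("#sochi2014")
after_batch = system.query("#sochi2014")   # batch view plus real-time view
```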
Lambda Architecture
Batch Layer
• Stores the immutable, constantly growing dataset
• Computes arbitrary views from this dataset using Big Data technologies (can take hours)
• Can always be recreated
Speed Layer
• Computes the views from the constant stream of data it receives
• Needed to compensate for the high latency of the batch layer
• Incremental model; views are transient
Serving Layer
• Responsible for indexing and exposing the pre-computed batch views so that they can be queried
• Exposes the incremented real-time views
• Merges the batch and the real-time views into a consistent result
4. Implementing the Lambda Architecture
Lambda Architecture
[Figure: Lambda Architecture. Incoming data goes to the batch layer (all data, batch recompute, precompute views, precomputed information) and to the speed layer (process stream, real-time increment, incremented information); the serving layer holds the batch views and the real-time views, which are merged to answer a query]
Source: Marz, N. & Warren, J. (2013) Big Data. Manning.
Lambda Architecture in Action
Implementation in ongoing Proof-of-concept (after completion of phase 1)
[Figure: the same Lambda Architecture diagram: batch layer (all data, batch recompute, precomputed information), speed layer (process stream, real-time increment, incremented information) and serving layer (batch views and real-time views, merged for the query)]
Lambda Architecture with Oracle Product Stack
Possible implementation with Oracle Product stack
[Figure: the Lambda Architecture diagram (batch layer, speed layer, serving layer, merge, query) annotated with Oracle products]
Products named in the figure: Oracle NoSQL, Oracle RDBMS, Oracle Coherence, Oracle BigData Appliance, Oracle Event Processing, Oracle GoldenGate, Oracle Data Integrator, Oracle Service Bus, Oracle WebLogic Server, Oracle ADF, OBIEE, Oracle Endeca, Oracle BigData Connectors, BAM
5. Demo – Event Processing with Oracle OEP
Retrieve Tweets and Visualize
Access to Tweets
Source | Limitations | Cost
Twitter's Search API | 3200 / user, 5000 / keyword, 180 requests / 15 minutes | free
Twitter's Streaming API | 1%-40% of total volume | free
DataSift | none | 0.15-0.20 $ / unit
Gnip | none | On request
1) Creating a Twitter Adapter
[Figure: the Twitter Adapter emits a tweet event, for example: "Only 3 minutes remaining in the gold medal game, @HC_Men with a commanding 3-0 lead. #CANvsSWE #TeamCanada #Sochi2014"]
2) Send Tweets to BAM
[Figure: Twitter Adapter -> JMS -> BAM Tweet. The example tweet "Only 3 minutes remaining in the gold medal game, @HC_Men with a commanding 3-0 lead. #CANvsSWE #TeamCanada #Sochi2014" is passed on unchanged]
3) Extract interesting information from Tweet
[Figure: the Twitter Adapter sends each tweet via JMS to BAM Tweet and to three extractors: Mention Extractor, Hashtag Extractor and Author Extractor]
Example tweet: "Only 3 minutes remaining in the gold medal game, @HC_Men with a commanding 3-0 lead. #CANvsSWE #TeamCanada #Sochi2014"
Mention Extractor: @hc_men
Author Extractor: hockeycanada
Hashtag Extractor: #canvsswe #teamcanada #sochi2014
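The extraction step can be sketched in Python with two regular expressions. This is not the Oracle Event Processing implementation used in the demo; the function names, the dictionary layout of the tweet and the screen name "HockeyCanada" (the slide only shows the lower-cased result "hockeycanada") are assumptions. The patterns are deliberately simple and would, for example, also match the part after the "@" in an e-mail address.

```python
import re

MENTION_RE = re.compile(r"@(\w+)")
HASHTAG_RE = re.compile(r"#(\w+)")

def extract_mentions(text):
    """Return all @mentions in lower case, as shown on the slide."""
    return ["@" + name.lower() for name in MENTION_RE.findall(text)]

def extract_hashtags(text):
    """Return all #hashtags in lower case."""
    return ["#" + tag.lower() for tag in HASHTAG_RE.findall(text)]

def extract_author(screen_name):
    """Normalise the author's screen name to lower case."""
    return screen_name.lower()

tweet = {
    "author": "HockeyCanada",  # assumed screen name
    "text": "Only 3 minutes remaining in the gold medal game, "
            "@HC_Men with a commanding 3-0 lead. "
            "#CANvsSWE #TeamCanada #Sochi2014",
}

mentions = extract_mentions(tweet["text"])
hashtags = extract_hashtags(tweet["text"])
author = extract_author(tweet["author"])
```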
4) Count occurrences within period
[Figure: the Mention, Hashtag and Author Extractors feed a CounterProcessor with the window "range 30 seconds slide 30 seconds"; its output goes via JMS to BAM Counter, while the tweets themselves still go via JMS to BAM Tweet]
Example counts for one period: #canvsswe,5  #sochi2014,9  hockeycanada,1  @hc_men,1  #teamcanada,5
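A window whose range equals its slide (30 seconds each) does not overlap with its neighbours, so every event falls into exactly one window. The Python sketch below shows that counting logic over a finished list of events; it is not a streaming engine and not the CQL used in the demo. The function name, the `(timestamp, term)` layout, the sample timestamps and the alignment of windows to multiples of 30 seconds are all assumptions.

```python
from collections import Counter, defaultdict

WINDOW_SECONDS = 30  # range 30 seconds, slide 30 seconds

def count_per_window(events, window=WINDOW_SECONDS):
    """Count terms per non-overlapping window.

    events: iterable of (timestamp_in_seconds, term).
    Returns a dict mapping each window start time to a Counter of terms.
    """
    windows = defaultdict(Counter)
    for ts, term in events:
        window_start = int(ts // window) * window
        windows[window_start][term] += 1
    return dict(windows)

# Hypothetical extractor output: (seconds since start, extracted term).
events = [
    (0, "#sochi2014"), (5, "#teamcanada"), (12, "#sochi2014"),
    (29, "@hc_men"), (30, "#sochi2014"), (45, "#canvsswe"),
]
counts = count_per_window(events)
```

Each Counter in `counts` corresponds to one batch of "term,count" pairs like the ones sent to BAM Counter at the end of a period.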
Implementing in Oracle Event Processing
[Figure: event-processing network in Oracle Event Processing: Twitter Adapter -> JMS -> BAM Tweet; Twitter Adapter -> Mention Extractor, Hashtag Extractor, Author Extractor -> CounterProcessor (range 30 seconds slide 30 seconds) -> JMS -> BAM Counter]
1) Creating Twitter Adapter – Connecting to Twitter Stream
1) Creating Twitter Adapter – Tweet Event
1) Creating Twitter Adapter – Adapter Factory
1) Creating Twitter Adapter – Assembly
1) Creating Twitter Adapter – Export Adapter to server
1) Creating Twitter Adapter – Using Twitter Adapter
2) Sending Tweets to BAM
Using Oracle BAM Enterprise Message Sources (JMS) interface
2) Sending Tweets to BAM – Convert events to JMS MapMessage
3) Extract information from Tweet – Extract Hashtags from TweetEvent
3) Extract information from Tweet – Extract Hashtags from TweetEvent (continued)
4) Count occurrences within period – Using CQL
Implementation – Complete Picture
Oracle BAM: Architected for Integration and Visualization
[Figure: Oracle BAM architecture diagram. Labels: Internet; BAM Dashboards; Web Applications (StartPage, ActiveViewer, ActiveStudio, Architect, Administrator); ReportServer; iCommand; Oracle Database (Grid) with BAM Data & Metadata; External Data Objects; Web Services; Enterprise Integration Framework; Application Server; BI; JMS Connector; BAM Adapter; ADF (BAM DataControl, ADF Pages with DVT); BAM Server; EventEngine (Actions & Escalations, Notification Services); ReportCache (Snapshots & Change Lists, Memory / Disk); ActiveDataCache (ViewSets, API, Kernel, DataSets, DataStorageEngine); ODI; Databases (OLTP & Data Warehouses); Mobile Devices; Data & Metadata Import & Export; BPEL; BPM; Message Queues; CEP; OESB]
Oracle BAM – Create a Data Object
Oracle BAM Enterprise Message Source Configuration
5) Adding Cassandra NoSQL for storing results
[Figure: the event-processing network from step 4 (Twitter Adapter; Mention, Hashtag and Author Extractors; CounterProcessor with range 30 seconds slide 30 seconds; JMS to BAM Tweet and BAM Counter), extended with two additional outputs: Cassandra Tweet and Cassandra Counter. The example tweet and the example counts (#canvsswe,5  #sochi2014,9  hockeycanada,1  @hc_men,1  #teamcanada,5) are the same as before]
6. Summary
Summary – The lambda architecture
§ The Lambda Architecture
  § Can discard batch views and real-time views and recreate everything from scratch
  § Mistakes corrected via re-computation
  § Data storage layer optimized independently from query resolution layer
  § Still in a very early …. But a very interesting idea!
    - Today a zoo of technologies is needed => Operations won't like it
§ The technology/implementation
  § Different query language for batch and real time
  § An abstraction over batch and speed layer needed
    - Cascading and Trident are already similar
  § Not everything works out-of-the-box and together
  § Industry standards needed!
Questions and answers ...
Guido Schmutz Technology Manager